WorldWideScience

Sample records for genomic structure polymorphism

  1. Characterization of porcine ENO3: genomic and cDNA structure, polymorphism and expression

    Directory of Open Access Journals (Sweden)

    Xiong Yuanzhu

    2008-09-01

    In this study, a full-length cDNA of the porcine ENO3 gene encoding a 434 amino acid protein was isolated. It contains 12 exons over approximately 5.4 kb. Differential splicing in the 5'-untranslated sequence generates two forms of mRNA that differ from each other in the presence or absence of a 142-nucleotide fragment. Expression analysis showed that transcript 1 of ENO3 is highly expressed in liver and lung, while transcript 2 is highly expressed in skeletal muscle and heart. We provide the first evidence that in skeletal muscle expression of ENO3 is different between Yorkshire and Meishan pig breeds. Furthermore, real-time polymerase chain reaction revealed that, in Yorkshire pigs, skeletal muscle expression of transcript 1 is identical at postnatal day-1 and at other stages while that of transcript 2 is higher. Moreover, expression of transcript 1 is lower in skeletal muscle and all other tissue samples than that of transcript 2, with the exception of liver and kidney. Statistical analysis showed the existence of a polymorphism in the ENO3 gene between Chinese indigenous and introduced commercial western pig breeds and that it is associated with fat percentage, average backfat thickness, meat marbling and intramuscular fat in two different populations.

  2. Genomic variation and population structure detected by single nucleotide polymorphism arrays in Corriedale, Merino and Creole sheep

    Directory of Open Access Journals (Sweden)

    Andrés N Grasso

    2014-06-01

    The aim of this study was to investigate the genetic diversity within and among three breeds of sheep: Corriedale, Merino and Creole. Sheep from the three breeds (Merino n = 110, Corriedale n = 108 and Creole n = 10) were genotyped using the Illumina Ovine SNP50 BeadChip®. Genetic diversity was evaluated by comparing the minor allele frequency (MAF) among breeds. Population structure and genetic differentiation were assessed using STRUCTURE software, principal component analysis (PCA) and fixation index (FST). Fixed markers (MAF = 0) that were different among breeds were identified as specific breed markers. Using a subset of 18,181 single nucleotide polymorphisms (SNPs), PCA and STRUCTURE analysis were able to explain population stratification within breeds. Merino and Corriedale divergent lines showed high levels of polymorphism (89.4% and 86% of polymorphic SNPs, respectively) and moderate genetic differentiation (FST = 0.08) between them. In contrast, Creole had only 69% polymorphic SNPs and showed greater genetic differentiation from the other two breeds (FST = 0.17 for both breeds). Hence, a subset of molecular markers present in the OvineSNP50 is informative enough for breed assignment and population structure analysis of commercial and Creole breeds.
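
    The two summary statistics named in this abstract, minor allele frequency and the share of polymorphic SNPs, can be illustrated with a minimal sketch. It assumes biallelic SNPs coded as the number of alternate alleles per diploid animal (0, 1 or 2); the function names and the toy genotypes are hypothetical and are not taken from the study.

```python
from typing import Sequence


def minor_allele_frequency(genotypes: Sequence[int]) -> float:
    """MAF for one biallelic SNP.

    genotypes holds the alternate-allele count (0, 1 or 2) of each
    diploid individual; this coding is an assumption of the sketch.
    """
    n_alleles = 2 * len(genotypes)
    if n_alleles == 0:
        raise ValueError("at least one genotype is required")
    alt_freq = sum(genotypes) / n_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def polymorphic_fraction(snp_matrix: Sequence[Sequence[int]]) -> float:
    """Fraction of SNPs (rows) that are not fixed, i.e. have MAF > 0."""
    mafs = [minor_allele_frequency(snp) for snp in snp_matrix]
    return sum(1 for maf in mafs if maf > 0) / len(mafs)


if __name__ == "__main__":
    # Toy data: 4 SNPs genotyped in 3 animals (invented values).
    toy = [[0, 0, 0], [0, 1, 2], [2, 2, 2], [1, 1, 1]]
    print(polymorphic_fraction(toy))
```

    A SNP with MAF = 0 in one breed but not in another would be a candidate breed-specific fixed marker in the sense used above.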

  3. Structure of graphane polymorphs

    Science.gov (United States)

    Belenkova, T. E.; Greshyakov, V. A.; Chernov, V. M.; Belenkov, E. A.

    2017-11-01

    Calculations of the structure and electronic properties for five structural variations of graphane were performed within the framework of density functional theory (DFT) with generalized gradient approximations (GGA). The electron densities of states and band structure of graphane crystals have been calculated. It has been established that the band gap for graphane polymorphs varies from 5.50 eV to 5.65 eV. The sublimation energy of graphane layers with different structures varied from 11.33 to 11.48 eV per C-H molecular group.

  4. Genome-wide Single Nucleotide Polymorphism Analyses Reveal Genetic Diversity and Structure of Wild and Domestic Cattle in Bangladesh

    Directory of Open Access Journals (Sweden)

    Md. Rasel Uzzaman

    2014-10-01

    In spite of variation in coat color, size, and production traits among indigenous Bangladeshi cattle populations, genetic differences among most of the populations have not been investigated or exploited. In this study, we used a high-density bovine single nucleotide polymorphism (SNP) 80K Bead Chip derived from Bos indicus breeds to assess genetic diversity and population structure of 2 Bangladeshi zebu cattle populations (red Chittagong, n = 28 and non-descript deshi, n = 28) and a semi-domesticated population (gayal, n = 17). Overall, 95% and 58% of the total SNPs (69,804) showed polymorphisms in the zebu and gayal populations, respectively. Similarly, the average minor allele frequency value was as high as 0.29 in zebu and as low as 0.09 in gayal. The mean expected heterozygosity varied from 0.42±0.14 in zebu to 0.148±0.14 in gayal with significant heterozygosity deficiency of 0.06 (FIS) in the latter. Coancestry estimations revealed that the two zebu populations are weakly differentiated, with over 99% of the total genetic variation retained within populations and less than 1% accounted for between populations. Conversely, strong genetic differentiation (FST = 0.33) was observed between zebu and gayal populations. Results of population structure and principal component analyses suggest that gayal is distinct from Bos indicus and that the two zebu populations were weakly structured. This study provides basic information about the genetic diversity and structure of Bangladeshi cattle and the semi-domesticated gayal population that can be used for future appraisal of breed utilization and management strategies.
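
    For readers unfamiliar with the heterozygosity measures quoted in this abstract, the following sketch shows the textbook definitions for a single biallelic locus: expected heterozygosity under Hardy-Weinberg proportions and the inbreeding coefficient FIS as the relative deficit of observed heterozygotes. The function names and input values are illustrative assumptions; the study's own estimation procedure is not described in the abstract.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity He = 2p(1 - p) at a biallelic locus,
    where p is the frequency of one allele (Hardy-Weinberg assumption)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * p * (1.0 - p)


def inbreeding_coefficient(observed_het: float, expected_het: float) -> float:
    """FIS = 1 - Ho / He; positive values indicate a heterozygote deficit."""
    if expected_het <= 0.0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - observed_het / expected_het


if __name__ == "__main__":
    he = expected_heterozygosity(0.5)         # maximal for two alleles
    print(he, inbreeding_coefficient(0.3, 0.4))
```

    In practice these quantities are averaged over many loci, which is presumably how the mean values in the abstract were obtained.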

  5. Genome-wide survey of single-nucleotide polymorphisms reveals fine-scale population structure and signs of selection in the threatened Caribbean elkhorn coral, Acropora palmata

    Directory of Open Access Journals (Sweden)

    Meghann K. Devlin-Durante

    2017-11-01

    The advent of next-generation sequencing tools has made it possible to conduct fine-scale surveys of population differentiation and genome-wide scans for signatures of selection in non-model organisms. Such surveys are of particular importance in sharply declining coral species, since knowledge of population boundaries and signs of local adaptation can inform restoration and conservation efforts. Here, we use genome-wide surveys of single-nucleotide polymorphisms in the threatened Caribbean elkhorn coral, Acropora palmata, to reveal fine-scale population structure and infer the major barrier to gene flow that separates the eastern and western Caribbean populations between the Bahamas and Puerto Rico. The exact location of this break had been subject to discussion because two previous studies based on microsatellite data had come to differing conclusions. We investigate this contradiction by analyzing an extended set of 11 microsatellite markers, including the five previously employed, and discover that one of the original microsatellite loci is apparently under selection. Exclusion of this locus reconciles the results from the SNP and the microsatellite datasets. Scans for outlier loci in the SNP data detected 13 candidate loci under positive selection; however, there was no correlation between available environmental parameters and genetic distance. Together, these results suggest that reef restoration efforts should use local sources and utilize existing functional variation among geographic regions in ex situ crossing experiments to improve stress resistance of this species.

  6. Genome-wide survey of single-nucleotide polymorphisms reveals fine-scale population structure and signs of selection in the threatened Caribbean elkhorn coral, Acropora palmata.

    Science.gov (United States)

    Devlin-Durante, Meghann K; Baums, Iliana B

    2017-01-01

    The advent of next-generation sequencing tools has made it possible to conduct fine-scale surveys of population differentiation and genome-wide scans for signatures of selection in non-model organisms. Such surveys are of particular importance in sharply declining coral species, since knowledge of population boundaries and signs of local adaptation can inform restoration and conservation efforts. Here, we use genome-wide surveys of single-nucleotide polymorphisms in the threatened Caribbean elkhorn coral, Acropora palmata, to reveal fine-scale population structure and infer the major barrier to gene flow that separates the eastern and western Caribbean populations between the Bahamas and Puerto Rico. The exact location of this break had been subject to discussion because two previous studies based on microsatellite data had come to differing conclusions. We investigate this contradiction by analyzing an extended set of 11 microsatellite markers, including the five previously employed, and discover that one of the original microsatellite loci is apparently under selection. Exclusion of this locus reconciles the results from the SNP and the microsatellite datasets. Scans for outlier loci in the SNP data detected 13 candidate loci under positive selection; however, there was no correlation between available environmental parameters and genetic distance. Together, these results suggest that reef restoration efforts should use local sources and utilize existing functional variation among geographic regions in ex situ crossing experiments to improve stress resistance of this species.

  7. Review Single nucleotide polymorphism in genome-wide ...

    African Journals Online (AJOL)

    Genome-wide patterns of variation across individuals provide most powerful source of data for uncovering the history of migration, expansion, and adaptation of the human population. The arrival of new technologies that type more than millions of the single nucleotide polymorphisms (SNPs) in a single experiment has ...

  8. Templated sequence insertion polymorphisms in the human genome

    Science.gov (United States)

    Onozawa, Masahiro; Aplan, Peter

    2016-11-01

    Templated Sequence Insertion Polymorphism (TSIP) is a recently described form of polymorphism recognized in the human genome, in which a sequence that is templated from a distant genomic region is inserted into the genome, seemingly at random. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; Class 1 TSIPs show features of insertions that are mediated via the LINE-1 ORF2 protein, including 1) target-site duplication (TSD), 2) polyadenylation 10-30 nucleotides downstream of a “cryptic” polyadenylation signal, and 3) preference for insertion at a 5’-TTTT/A-3’ sequence. In contrast, class 2 TSIPs show features consistent with repair of a DNA double-strand break via insertion of a DNA “patch” that is derived from a distant genomic region. Survey of a large number of normal human volunteers demonstrates that most individuals have 25-30 TSIPs, and that these TSIPs track with specific geographic regions. Similar to other forms of human polymorphism, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases.

  9. Genomic hypomethylation in the human germline associates with selective structural mutability in the human genome.

    Directory of Open Access Journals (Sweden)

    Jian Li

    The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ~1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR-mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease.

  10. Genome-wide patterns of nucleotide polymorphism in domesticated rice

    DEFF Research Database (Denmark)

    Caicedo, Ana L; Williamson, Scott H; Hernandez, Ryan D

    2007-01-01

    Domesticated Asian rice (Oryza sativa) is one of the oldest domesticated crop species in the world, having fed more people than any other plant in human history. We report the patterns of DNA sequence variation in rice and its wild ancestor, O. rufipogon, across 111 randomly chosen gene fragments......, and use these to infer the evolutionary dynamics that led to the origins of rice. There is a genome-wide excess of high-frequency derived single nucleotide polymorphisms (SNPs) in O. sativa varieties, a pattern that has not been reported for other crop species. We developed several alternative models...... the dominant demographic model for domesticated species, cannot explain the derived nucleotide polymorphism site frequency spectrum in rice. Instead, a bottleneck model that incorporates selective sweeps, or a more complex demographic model that includes subdivision and gene flow, are more plausible...

  11. Structural genomic variation in ischemic stroke

    Science.gov (United States)

    Matarin, Mar; Simon-Sanchez, Javier; Fung, Hon-Chung; Scholz, Sonja; Gibbs, J. Raphael; Hernandez, Dena G.; Crews, Cynthia; Britton, Angela; Wavrant De Vrieze, Fabienne; Brott, Thomas G.; Brown, Robert D.; Worrall, Bradford B.; Silliman, Scott; Case, L. Douglas; Hardy, John A.; Rich, Stephen S.; Meschia, James F.; Singleton, Andrew B.

    2008-01-01

    Technological advances in molecular genetics allow rapid and sensitive identification of genomic copy number variants (CNVs). This, in turn, has sparked interest in the role such variation may play in disease. While a role for copy number mutations as a cause of Mendelian disorders is well established, it is unclear whether CNVs may affect risk for common complex disorders. We sought to investigate whether CNVs may modulate risk for ischemic stroke (IS) and to provide a catalog of CNVs in patients with this disorder by analyzing copy number metrics produced as a part of our previous genome-wide single-nucleotide polymorphism (SNP)-based association study of ischemic stroke in a North American white population. We examined CNVs in 263 patients with ischemic stroke (IS). Each identified CNV was compared with changes identified in 275 neurologically normal controls. Our analysis identified 247 CNVs, corresponding to 187 insertions (76%; 135 heterozygous; 25 homozygous duplications or triplications; 2 heterosomic) and 60 deletions (24%; 40 heterozygous deletions; 3 homozygous deletions; 14 heterosomic deletions). Most alterations (81%) were the same as, or overlapped with, previously reported CNVs. We report here the first genome-wide analysis of CNVs in IS patients. In summary, our study did not detect any common genomic structural variation unequivocally linked to IS, although we cannot exclude that smaller CNVs or CNVs in genomic regions poorly covered by this methodology may confer risk for IS. The application of genome-wide SNP arrays now facilitates the evaluation of structural changes through the entire genome as part of a genome-wide genetic association study. PMID:18288507

  12. Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

    2014-01-01

    BACKGROUND: Structural variants (SVs) are less common than single nucleotide polymorphisms and indels in the population, but collectively account for a significant fraction of genetic polymorphism and diseases. Base pair differences arising from SVs are on a much higher order (>100 fold) than point...... mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost......-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger...

  13. Long tandem repeats as a form of genomic copy number variation: structure and length polymorphism of a chromosome 5p repeat in control and schizophrenia populations

    Science.gov (United States)

    Bruce, Heather A.; Sachs, Nancy A.; Rudnicki, Dobrila D.; Lin, Stephanie G.; Willour, Virginia L.; Cowell, John K.; Conroy, Jeffrey; McQuaid, Devin E.; Rossi, Michael; Gaile, Daniel P; Nowak, Norma J.; Holmes, Susan E.; Sklar, Pamela; Ross, Christopher A.; DeLisi, Lynn E.; Margolis, Russell L.

    2016-01-01

    Objectives: Genomic copy number variations (CNVs) are a major form of variation in the human genome and play an etiologic role in several neuropsychiatric diseases. Tandem repeats, particularly with long (> 50 bp) repeat units, are a relatively common yet underexplored type of CNV that may significantly contribute to human genomic variation and disease risk. We therefore performed a pilot experiment to explore the potential role of long tandem repeats as risk factors in psychiatric disorders. Methods: A bacterial artificial chromosome (BAC)-based array comparative genomic hybridization (aCGH) platform was used to examine CNVs in genomic DNA from 34 probands with schizophrenia or schizoaffective disorder. Results: The aCGH screen detected an apparent deletion on 5p15.1 in two probands, caused by the presence in each proband of two low copy number (short) alleles of a tandem repeat that ranges in length from 50 3.4 kb units in the population examined. Short alleles partially segregate with schizophrenia in a small number of families, though linkage was not significant. An association study showed no significant difference in repeat length between 406 schizophrenia cases and 392 controls. Conclusion: Though we did not demonstrate a relationship between the 5p15.1 repeat and schizophrenia, our results illustrate that long tandem repeats represent an intriguing type of genetic variation that has not been previously studied in connection with psychiatric illness. aCGH can detect a small subset of these repeats, but systematic investigation will require the development of specific arrays and improved analytic methods. PMID:19672138

  14. The polydeoxyadenylate tract of Alu repetitive elements is polymorphic in the human genome

    International Nuclear Information System (INIS)

    Economou, E.P.; Bergen, A.W.; Warren, A.C.; Antonarakis, S.E.

    1990-01-01

    To identify DNA polymorphisms that are abundant in the human genome and are detectable by polymerase chain reaction amplification of genomic DNA, the authors hypothesize that the polydeoxyadenylate tract of the Alu family of repetitive elements is polymorphic among human chromosomes. Analysis of the 3' ends of three specific Alu sequences showed that two of them, one in the adenosine deaminase gene and the other in the β-globin pseudogene, were polymorphic. This novel class of polymorphism, termed AluVpA [Alu variable poly(A)], may represent one of the most useful and informative groups of DNA markers in the human genome.

  15. Genomic lineages of Rhizobium etli revealed by the extent of nucleotide polymorphisms and low recombination

    Directory of Open Access Journals (Sweden)

    González Víctor

    2011-10-01

    Background: Most of the DNA variations found in bacterial species are in the form of single nucleotide polymorphisms (SNPs), but there is some debate regarding how much of this variation comes from mutation versus recombination. The nitrogen-fixing symbiotic bacterium Rhizobium etli is highly variable in both genomic structure and gene content. However, no previous report has provided a detailed genomic analysis of this variation at nucleotide level or the role of recombination in generating diversity in this bacterium. Here, we compared draft genomic sequences versus complete genomic sequences to obtain reliable measures of genetic diversity and then estimated the role of recombination in the generation of genomic diversity among Rhizobium etli. Results: We identified high levels of DNA polymorphism in R. etli, and found that there was an average divergence of 4% to 6% among the tested strain pairs. DNA recombination events were estimated to affect 3% to 10% of the genomic sample analyzed. In most instances, the nucleotide diversity (π) was greater in DNA segments with recombinant events than in non-recombinant segments. However, this degree of recombination was not sufficiently large to disrupt the congruence of the phylogenetic trees, and further evaluation of recombination in strain quartets indicated that the recombination levels in this species are proportionally low. Conclusion: Our data suggest that R. etli is a species composed of separated lineages with low homologous recombination among the strains. Horizontal gene transfer, particularly via the symbiotic plasmid characteristic of this species, seems to play an important role in diversity but the lineages maintain their evolutionary cohesiveness.
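
    Nucleotide diversity (π), the statistic compared between recombinant and non-recombinant segments in this abstract, is commonly defined as the average number of pairwise differences per site among aligned sequences. The sketch below implements that standard definition for equal-length, gap-free alignments; the function name and the toy sequences are assumptions for illustration and do not reproduce the authors' pipeline.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(aligned_seqs: Sequence[str]) -> float:
    """Average pairwise differences per site (pi) for an alignment.

    Assumes all sequences have the same length and contain no gaps or
    ambiguity codes; real data would need those cases handled.
    """
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be non-empty and equal in length")

    pairs = list(combinations(aligned_seqs, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


if __name__ == "__main__":
    # Invented 4-bp alignment of three strains.
    print(nucleotide_diversity(["ACGT", "ACGA", "ACGT"]))
```

    Computing π separately for segments with and without inferred recombination events would give the kind of comparison the abstract reports.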

  16. Survey and analysis of crystal polymorphism in organic structures

    Directory of Open Access Journals (Sweden)

    Kortney Kersten

    2018-03-01

    With the intention of producing the most comprehensive treatment of the prevalence of crystal polymorphism among structurally characterized materials, all polymorphic compounds flagged as such within the Cambridge Structural Database (CSD) are analysed and a list of crystallographically characterized organic polymorphic compounds is assembled. Classifying these structures into subclasses of anhydrates, salts, hydrates, non-hydrated solvates and cocrystals reveals that there are significant variations in polymorphism prevalence as a function of crystal type, a fact which has not previously been recognized in the literature. It is also shown that, as a percentage, polymorphic entries are decreasing temporally within the CSD, with the notable exception of cocrystals, which continue to rise at a rate that is a constant fraction of the overall entries. Some phenomena identified that require additional scrutiny include the relative prevalence of temperature-induced phase transitions among organic salts and the paucity of polymorphism in crystals with three or more chemical components.

  17. Polymorphism and mutation analysis of genomic DNA on cancer

    International Nuclear Information System (INIS)

    Ohta, Tsutomu

    2003-01-01

    DNA repair is a universal process in living cells that maintains the structural integrity of chromosomal DNA molecules in the face of damage. A deficiency in DNA damage repair is associated with an increased cancer risk through an increased mutation frequency in cancer-related genes. Variation in DNA repair capacity may be genetically determined. Therefore, we searched for single-nucleotide polymorphisms (SNPs) in major DNA repair genes. This led to the finding of 600 SNPs and mutations, including many novel SNPs, in the Japanese population. Case-control studies to explore the contribution of the SNPs in DNA repair genes to the risk of lung cancer revealed that five SNPs are associated with lung carcinogenesis. One of these SNPs is found in the RAD54L gene, which is involved in double-strand DNA repair. We analyzed and reported the activities of Rad54L protein with the SNP and mutations. (authors)

  18. Genome-Wide Discovery and Information Resource Development of DNA Polymorphisms in Cassava

    Science.gov (United States)

    Yoshida, Takuhiro; Akiyama, Kenji; Ishitani, Manabu; Seki, Motoaki; Shinozaki, Kazuo

    2013-01-01

    Cassava (Manihot esculenta Crantz) is an important crop that provides food security and income generation in many tropical countries, and is known for its adaptability to various environmental conditions. Its draft genome sequence and many expressed sequence tags are now publicly available, allowing the development of cassava polymorphism information. Here, we describe the genome-wide discovery of cassava DNA polymorphisms. Using the alignment of predicted transcribed sequences from the cassava draft genome sequence and ESTs from GenBank, we discovered 10,546 single-nucleotide polymorphisms and 647 insertions and deletions. To facilitate molecular marker development for cassava, we designed 9,316 PCR primer pairs to amplify the genomic region around each DNA polymorphism. Of the discovered SNPs, 62.7% occurred in protein-coding regions. Disease-resistance genes were found to have a significantly higher ratio of nonsynonymous-to-synonymous substitutions. We identified 24 read-through (changes of a stop codon to a coding codon) and 38 premature stop (changes of a coding codon to a stop codon) single-nucleotide polymorphisms, and found that the 5 gene ontology terms in biological process were significantly different in genes with read-through single-nucleotide polymorphisms compared with all cassava genes. All data on the discovered DNA polymorphisms were organized into the Cassava Online Archive database, which is available at http://cassava.psc.riken.jp/. PMID:24040164
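
    The two stop-codon categories defined in this abstract (read-through: stop codon changed to a coding codon; premature stop: coding codon changed to a stop codon) can be expressed as a small classifier. This is a minimal sketch assuming DNA codons and the standard genetic code's three stop codons; the function name is hypothetical and the study's actual annotation workflow is not described here.

```python
# Stop codons of the standard genetic code, written as DNA.
STOP_CODONS = frozenset({"TAA", "TAG", "TGA"})


def classify_stop_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon substitution by its effect on a stop codon.

    Returns "read-through" (stop -> coding), "premature stop"
    (coding -> stop) or "other" for every remaining case.
    """
    ref, alt = ref_codon.upper(), alt_codon.upper()
    if len(ref) != 3 or len(alt) != 3:
        raise ValueError("codons must be three nucleotides long")

    ref_is_stop = ref in STOP_CODONS
    alt_is_stop = alt in STOP_CODONS
    if ref_is_stop and not alt_is_stop:
        return "read-through"
    if alt_is_stop and not ref_is_stop:
        return "premature stop"
    return "other"


if __name__ == "__main__":
    print(classify_stop_change("TAA", "CAA"))
    print(classify_stop_change("CAA", "TAA"))
```

    Distinguishing synonymous from nonsynonymous changes within the "other" category would additionally require a full codon translation table.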

  19. 2004 Structural, Function and Evolutionary Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Douglas L. Brutlag; Nancy Ryan Gray

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  20. Detection of genome-wide polymorphisms in the AT-rich Plasmodium falciparum genome using a high-density microarray

    Directory of Open Access Journals (Sweden)

    Huyen Yentram

    2008-08-01

    Background: Genetic mapping is a powerful method to identify mutations that cause drug resistance and other phenotypic changes in the human malaria parasite Plasmodium falciparum. For efficient mapping of a target gene, it is often necessary to genotype a large number of polymorphic markers. Currently, a community effort is underway to collect single nucleotide polymorphisms (SNP) from the parasite genome. Here we evaluate polymorphism detection accuracy of a high-density 'tiling' microarray with 2.56 million probes by comparing single feature polymorphism (SFP) calls from the microarray with known SNP among parasite isolates. Results: We found that probe GC content, SNP position in a probe, probe coverage, and signal ratio cutoff values were important factors for accurate detection of SFP in the parasite genome. We established a set of SFP calling parameters that could predict mSFP (SFP called by multiple overlapping probes) with high accuracy (≥ 94%) and identified 121,087 mSFP genome-wide from five parasite isolates, including 40,354 unique mSFP (excluding those from multi-gene families) and ~18,000 new mSFP, producing a genetic map with an average of one unique mSFP per 570 bp. Genomic copy number variation (CNV) among the parasites was also cataloged and compared. Conclusion: A large number of mSFP were discovered from the P. falciparum genome using a high-density microarray, most of which were in clusters of highly polymorphic genes at chromosome ends. Our method for accurate mSFP detection and the mSFP identified will greatly facilitate large-scale studies of genome variation in the P. falciparum parasite and provide useful resources for mapping important parasite traits.

  1. Prediction of protein-destabilizing polymorphisms by manual curation with protein structure.

    Directory of Open Access Journals (Sweden)

    Craig Alan Gough

    The relationship between sequence polymorphisms and human disease has been studied mostly in terms of effects of single nucleotide polymorphisms (SNPs) leading to single amino acid substitutions that change protein structure and function. However, less attention has been paid to more drastic sequence polymorphisms which cause premature termination of a protein's sequence or large changes, insertions, or deletions in the sequence. We have analyzed a large set (n = 512) of insertions and deletions (indels) and single nucleotide polymorphisms causing premature termination of translation in disease-related genes. Prediction of protein-destabilization effects was performed by graphical presentation of the locations of polymorphisms in the protein structure, using the Genomes TO Protein (GTOP) database, and manual annotation with a set of specific criteria. Protein-destabilization was predicted for 44.4% of the nonsense SNPs, 32.4% of the frameshifting indels, and 9.1% of the non-frameshifting indels. A prediction of nonsense-mediated decay allowed us to infer which truncated proteins would actually be translated as defective proteins. These cases included the proteins linked to diseases inherited dominantly, suggesting a relation between these diseases and toxic aggregation. Our approach would be useful in identifying potentially aggregation-inducing polymorphisms that may have pathological effects.

  2. Utilising polymorphisms to achieve allele-specific genome editing in zebrafish

    Directory of Open Access Journals (Sweden)

    Samuel J. Capon

    2017-01-01

    The advent of genome editing has significantly altered genetic research, including research using the zebrafish model. To better understand the selectivity of the commonly used CRISPR/Cas9 system, we investigated single base pair mismatches in target sites and examined how they affect genome editing in the zebrafish model. Using two different zebrafish strains that have been deep sequenced, CRISPR/Cas9 target sites containing polymorphisms between the two strains were identified. These strains were crossed (creating heterozygotes at polymorphic sites) and CRISPR/Cas9 complexes that perfectly complement one strain were injected. Sequencing of targeted sites showed biased, allele-specific editing for the perfectly complementary sequence in the majority of cases (14/19). To test utility, we examined whether phenotypes generated by F0 injection could be internally controlled with such polymorphisms. Targeting of genes bmp7a and chordin showed reduction in the frequency of phenotypes in injected ‘heterozygotes’ compared with injecting the strain with perfect complementarity. Next, injecting CRISPR/Cas9 complexes targeting two separate sites created deletions, but deletions were biased to selected chromosomes when one CRISPR/Cas9 target contained a polymorphism. Finally, integration of loxP sequences occurred preferentially in alleles with perfect complementarity. These experiments demonstrate that single nucleotide polymorphisms (SNPs) present throughout the genome can be utilised to increase the efficiency of in cis genome editing using CRISPR/Cas9 in the zebrafish model.

  3. A penalized linear mixed model for genomic prediction using pedigree structures.

    Science.gov (United States)

    Yang, Can; Li, Cong; Chen, Mengjie; Chen, Xiaowei; Hou, Lin; Zhao, Hongyu

    2014-01-01

    Genetic Analysis Workshop 18 provided a platform for evaluating genomic prediction power based on single-nucleotide polymorphisms from single-nucleotide polymorphism array data and sequencing data. Genetic Analysis Workshop 18 also provided a diverse pedigree structure to be explored in prediction. In this study, we attempted to combine pedigree information with single-nucleotide polymorphism data to predict systolic blood pressure. Our results suggested that the prediction power based on pedigree information alone could be unsatisfactory. Using additional information such as single-nucleotide polymorphism genotypes would improve prediction accuracy. In particular, the improvement can be significant when there are a few single-nucleotide polymorphisms with relatively large effect sizes. We also compared the prediction performance based on genome-wide association study data (i.e., common variants) and sequencing data (i.e., common variants plus low-frequency variants). The experimental results showed that inclusion of low-frequency variants did not improve prediction accuracy.

  4. Intra-strain polymorphisms are detected but no genomic alteration is found in cloned mice

    International Nuclear Information System (INIS)

    Gotoh, Koshichi; Inoue, Kimiko; Ogura, Atsuo; Oishi, Michio

    2006-01-01

    In-gel competitive reassociation (IGCR) is a method for differential subtraction of polymorphic (RFLP) DNA fragments between two DNA samples of interest without probes or specific sequence information. Here, we applied the IGCR procedure to two cloned mice derived from an F1 hybrid of the C57BL/6Cr and DBA/2 strains, in order to investigate the possibility of genomic alteration in the cloned mouse genomes. Each of the five genomic alterations we detected between the two cloned mice corresponded to 'intra-strain' polymorphisms in the C57BL/6Cr and DBA/2 mouse strains. Our results suggest that no severe aberration of genome sequences occurs due to somatic cell nuclear transfer.

  5. Extensive variation in the density and distribution of DNA polymorphism in sorghum genomes.

    Directory of Open Access Journals (Sweden)

    Joseph Evans

    Full Text Available Sorghum genotypes currently used for grain production in the United States were developed from African landraces that were imported starting in the mid-to-late 19th century. Farmers and plant breeders selected genotypes for grain production with reduced plant height, early flowering, increased grain yield, adaptation to drought, and improved resistance to lodging, diseases and pests. DNA polymorphisms that distinguish three historically important grain sorghum genotypes, BTx623, BTx642 and Tx7000, were characterized by genome sequencing, genotyping by sequencing, genetic mapping, and pedigree-based haplotype analysis. The distribution and density of DNA polymorphisms in the sequenced genomes varied widely, in part because the lines were derived through breeding and selection from diverse Kafir, Durra, and Caudatum race accessions. Genomic DNA spanning dw1 (SBI-09) and dw3 (SBI-07) had identical haplotypes due to selection for reduced height. Lower SNP density in genes located in pericentromeric regions compared with genes located in euchromatic regions is consistent with background selection in these regions of low recombination. SNP density was higher in euchromatic DNA and varied >100-fold in contiguous intervals that spanned up to 300 Kbp. The localized variation in DNA polymorphism density occurred throughout euchromatic regions where recombination is elevated; however, polymorphism density was not correlated with gene density or DNA methylation. Overall, sorghum chromosomes contain distal euchromatic regions characterized by extensive, localized variation in DNA polymorphism density, and large pericentromeric regions of low gene density, diversity, and recombination.
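    Variation in SNP density across contiguous intervals, as described above, is commonly summarised by counting SNPs in fixed-size windows. The following Python sketch shows one way this could be done; the 300 kbp window size echoes the interval scale mentioned in the abstract, but the function name, the toy positions and the windowing scheme are assumptions and not the authors' pipeline.

```python
from collections import Counter


def snp_density(positions, chrom_length, window=300_000):
    """Return SNPs per kbp for each consecutive window along one chromosome.

    `positions` are 1-based SNP coordinates; the last window may be shorter
    than `window` and is normalised by its actual length.
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = Counter((pos - 1) // window for pos in positions)
    densities = []
    for i in range(n_windows):
        start = i * window
        end = min(start + window, chrom_length)
        densities.append(counts.get(i, 0) / ((end - start) / 1000))
    return densities


# Toy data (hypothetical positions, not from the study)
toy_positions = [10, 20, 30, 40, 350_000]
print(snp_density(toy_positions, chrom_length=600_000))
```

    Comparing the largest and smallest non-zero window densities gives a rough fold-change of the kind quoted in the abstract.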

  6. Human Xq28 Inversion Polymorphism: From Sex Linkage to Genomics--A Genetic Mother Lode

    Science.gov (United States)

    Kirby, Cait S.; Kolber, Natalie; Salih Almohaidi, Asmaa M.; Bierwert, Lou Ann; Saunders, Lori; Williams, Steven; Merritt, Robert

    2016-01-01

    An inversion polymorphism of the filamin and emerin genes at the tip of the long arm of the human X-chromosome serves as the basis of an investigative laboratory in which students learn something new about their own genomes. Long, nearly identical inverted repeats flanking the filamin and emerin genes illustrate how repetitive elements can lead to…

  7. Structural origin of polymorphism of Alzheimer's amyloid β-fibrils.

    Science.gov (United States)

    Agopian, Audrey; Guo, Zhefeng

    2012-10-01

    Formation of senile plaques containing amyloid fibrils of Aβ (amyloid β-peptide) is a pathological hallmark of Alzheimer's disease. Unlike globular proteins, which fold into unique structures, the fibrils of Aβ and other amyloid proteins often contain multiple polymorphs. Polymorphism of amyloid fibrils leads to different toxicity in amyloid diseases and may be the basis for prion strains, but the structural origin for fibril polymorphism is still elusive. In the present study we investigate the structural origin of two major fibril polymorphs of Aβ40: an untwisted polymorph formed under agitated conditions and a twisted polymorph formed under quiescent conditions. Using electron paramagnetic resonance spectroscopy, we studied the inter-strand side-chain interactions at 14 spin-labelled positions in the Aβ40 sequence. The results of the present study show that the agitated fibrils have stronger inter-strand spin-spin interactions at most of the residue positions investigated. The two hydrophobic regions at residues 17-20 and 31-36 have the strongest interactions in agitated fibrils. Distance estimates on the basis of the spin exchange frequencies suggest that inter-strand distances at residues 17, 20, 32, 34 and 36 in agitated fibrils are approximately 0.2 Å (1 Å=0.1 nm) closer than in quiescent fibrils. We propose that the strength of inter-strand side-chain interactions determines the degree of β-sheet twist, which then leads to the different association patterns between different cross β-units and thus distinct fibril morphologies. Therefore the inter-strand side-chain interaction may be a structural origin for fibril polymorphism in Aβ and other amyloid proteins.

  8. Australian wild rice reveals pre-domestication origin of polymorphism deserts in rice genome.

    Directory of Open Access Journals (Sweden)

    Gopala Krishnan S

    Full Text Available BACKGROUND: Rice is a major source of human food with a predominantly Asian production base. Domestication involved selection of traits that are desirable for agriculture and to human consumers. Wild relatives of crop plants are a source of useful variation which is of immense value for crop improvement. Australian wild rices have been isolated from the impacts of domestication in Asia and represent a source of novel diversity for global rice improvement. Oryza rufipogon is a perennial wild progenitor of cultivated rice. Oryza meridionalis is a related annual species in Australia. RESULTS: We have examined the sequence of the genomes of AA genome wild rices from Australia that are close relatives of cultivated rice through whole genome re-sequencing. Assembly of the resequencing data to the O. sativa ssp. japonica cv. Nipponbare shows that Australian wild rices possess 2.5 times more single nucleotide polymorphisms than the Asian wild rice and cultivated O. sativa ssp. indica. Analysis of the genome of domesticated rice reveals regions of low diversity that show very little variation (polymorphism deserts). Both the perennial and annual wild rice from Australia show a high degree of conservation of sequence with that found in cultivated rice in the same 4.58 Mbp region on chromosome 5, which suggests that some of the 'polymorphism deserts' in this and other parts of the rice genome may have originated prior to domestication due to natural selection. CONCLUSIONS: Analysis of genes in the 'polymorphism deserts' indicates that this selection may have been due to biotic or abiotic stress in the environment of early rice relatives. Despite having closely related sequences in these genome regions, the Australian wild populations represent an invaluable source of diversity supporting rice food security.

  9. Comparative genome-wide polymorphic microsatellite markers in Antarctic penguins through next generation sequencing

    Science.gov (United States)

    Vianna, Juliana A.; Noll, Daly; Mura-Jornet, Isidora; Valenzuela-Guerra, Paulina; González-Acuña, Daniel; Navarro, Cristell; Loyola, David E.; Dantas, Gisele P. M.

    2017-01-01

    Microsatellites are valuable molecular markers for evolutionary and ecological studies. Next generation sequencing is responsible for the increasing number of microsatellites for non-model species. The Pygoscelis genus comprises three species: Adélie (P. adeliae), Chinstrap (P. antarcticus) and Gentoo penguin (P. papua), all distributed around Antarctica and the sub-Antarctic. The species have been affected differently by climate change, and the use of microsatellite markers will be crucial to monitor population dynamics. We characterized a large set of genome-wide microsatellites and evaluated polymorphisms in all three species. SOLiD reads were generated from the libraries of each species, identifying a large number of microsatellite loci: 33,677, 35,265 and 42,057 for P. adeliae, P. antarcticus and P. papua, respectively. A large number of dinucleotide (66,139), trinucleotide (29,490) and tetranucleotide (11,849) microsatellites are described. Microsatellite abundance, diversity and orthology were characterized in penguin genomes. We evaluated polymorphisms in 170 tetranucleotide loci, obtaining 34 polymorphic loci in at least one species and 15 polymorphic loci in all three species, which allow comparative studies. The polymorphic markers presented here enable a number of ecological, population, individual identification, parentage and evolutionary studies of Pygoscelis, with potential use in other penguin species. PMID:28898354
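    Detecting di-, tri- and tetranucleotide microsatellites in sequence data can be sketched with a regular expression that looks for perfect tandem repeats. This is only an illustration: the abstract does not say which software or repeat thresholds were used, so the function name and the minimum repeat counts below are assumptions.

```python
import re


def find_microsatellites(seq, motif_len, min_repeats):
    """Yield (start, motif, repeat_count) for perfect tandem repeats.

    Motifs that are themselves repeats of a shorter unit (e.g. "AA" or
    "ACAC") are skipped so that each locus is reported at its true motif size.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if motif in (motif + motif)[1:-1]:  # motif is not primitive
            continue
        yield match.start(), motif, len(match.group(0)) // motif_len


# Hypothetical sequences: a dinucleotide and a tetranucleotide repeat
print(list(find_microsatellites("GGACACACACACTT", motif_len=2, min_repeats=5)))
print(list(find_microsatellites("TTGATAGATAGATAGATACC", motif_len=4, min_repeats=4)))
```

    Start coordinates are 0-based. Imperfect or compound repeats, which dedicated microsatellite finders can handle, are outside the scope of this sketch.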

  10. Single nucleotide polymorphism in genome-wide association of ...

    African Journals Online (AJOL)

    Mohd Fareed

    2012-09-25

    Sep 25, 2012 ... The arrival of new technologies that type more than millions of the single nucleotide polymorphisms (SNPs) in .... Rapid advances in technology ...... carriers. Neuron. 2007;54:713–20. [97] Baum AE, Akula N, Cabanero M, et al. A genome-wide association study implicates diacylglycerol kinase eta (DGKH).

  11. Evaluation of multiple approaches to identify genome-wide polymorphisms in closely related genotypes of sweet cherry (Prunus avium L.)

    Directory of Open Access Journals (Sweden)

    Seanna Hewitt

    Full Text Available Identification of genetic polymorphisms and subsequent development of molecular markers is important for marker-assisted breeding of superior cultivars of economically important species. Sweet cherry (Prunus avium L.) is an economically important non-climacteric tree fruit crop in the Rosaceae family and has undergone a genetic bottleneck due to breeding, resulting in limited genetic diversity in the germplasm that is utilized for breeding new cultivars. Therefore, it is critical to recognize the best platforms for identifying genome-wide polymorphisms that can help identify, and consequently preserve, the diversity in a genetically constrained species. For the identification of genome-wide polymorphisms in five closely related genotypes of sweet cherry, a gel-based approach (TRAP), reduced representation sequencing (TRAPseq), a 6k cherry SNParray, and whole genome sequencing (WGS) were evaluated. All platforms facilitated detection of polymorphisms among the genotypes with variable efficiency. In assessing multiple SNP detection platforms, this study has demonstrated that a combination of appropriate approaches is necessary for efficient polymorphism identification, especially between closely related cultivars of a species. The information generated in this study provides a valuable resource for future genetic and genomic studies in sweet cherry, and the insights gained from the evaluation of multiple approaches can be utilized for other closely related species with limited genetic diversity in the breeding germplasm. Keywords: Polymorphisms, Prunus avium, Next-generation sequencing, Target region amplification polymorphism (TRAP), Genetic diversity, SNParray, Reduced representation sequencing, Whole genome sequencing (WGS)

  12. Population structure of Salmonella investigated by amplified fragment length polymorphism

    DEFF Research Database (Denmark)

    Torpdahl, M.; Ahrens, Peter

    2004-01-01

    Aims: This study was undertaken to investigate the usefulness of amplified fragment length polymorphism (AFLP) in determining the population structure of Salmonella. Methods and Results: A total of 89 strains were subjected to AFLP analysis using the enzymes BglII and BspDI, a combination...

  13. DivStat: a user-friendly tool for single nucleotide polymorphism analysis of genomic diversity.

    Directory of Open Access Journals (Sweden)

    Inês Soares

    Full Text Available Recent developments have led to an enormous increase in publicly available large genomic data sets, including complete genomes. The 1000 Genomes Project was a major contributor, releasing the results of sequencing a large number of individual genomes, and allowing for a myriad of large-scale studies on human genetic variation. However, the tools currently available are insufficient when the goal is to analyze data sets encompassing more than hundreds of base pairs and when considering haplotype sequences of single nucleotide polymorphisms (SNPs). Here, we present a new and potent tool to deal with large data sets, allowing the computation of a variety of summary statistics of population genetic data and increasing the speed of data analysis.

  14. CAG-encoded polyglutamine length polymorphism in the human genome

    Directory of Open Access Journals (Sweden)

    Hayden Michael R

    2007-05-01

    Full Text Available Background: Expansion of polyglutamine-encoding CAG trinucleotide repeats has been identified as the pathogenic mutation in nine different genes associated with neurodegenerative disorders. The majority of individuals clinically diagnosed with spinocerebellar ataxia do not have mutations within known disease genes, and it is likely that additional ataxias or Huntington disease-like disorders will be found to be caused by this common mutational mechanism. We set out to determine the length distributions of CAG-polyglutamine tracts for the entire human genome in a set of healthy individuals in order to characterize the nature of polyglutamine repeat length variation across the human genome, to establish the background against which pathogenic repeat expansions can be detected, and to prioritize candidate genes for repeat expansion disorders. Results: We found that repeats, including those in known disease genes, have unique distributions of glutamine tract lengths, as measured by fragment analysis of PCR-amplified repeat regions. This emphasizes the need to characterize each distribution and avoid making generalizations between loci. The best predictors of known disease genes were the occurrence of a long CAG tract uninterrupted by CAA codons in their reference genome sequence, and high glutamine tract length variance in the normal population. We used these parameters to identify eight priority candidate genes for polyglutamine expansion disorders. Twelve CAG-polyglutamine repeats were invariant and these can likely be excluded as candidates. We outline some confusion in the literature about this type of data, difficulties in comparing such data between publications, and its application to studies of disease prevalence in different populations. Analysis of Gene Ontology-based functions of CAG-polyglutamine-containing genes provided a visual framework for interpretation of these genes' functions. All nine known disease genes were involved in DNA
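    One of the predictors named in the abstract, a long CAG tract uninterrupted by CAA codons, can be computed directly from an in-frame coding sequence. The Python sketch below is an illustration only: it relies on the fact that both CAG and CAA encode glutamine, and the function name, input format and toy sequence are assumptions rather than the authors' method.

```python
def glutamine_tracts(cds):
    """Summarise polyglutamine tracts in an in-frame coding sequence.

    Returns a list of (codon_index, tract_length, longest_pure_cag_run), where
    tract_length counts consecutive CAG/CAA codons and longest_pure_cag_run is
    the longest stretch of CAG codons not interrupted by CAA.
    """
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    tracts = []
    i = 0
    while i < len(codons):
        if codons[i] not in ("CAG", "CAA"):
            i += 1
            continue
        start = i
        longest_cag = run = 0
        while i < len(codons) and codons[i] in ("CAG", "CAA"):
            run = run + 1 if codons[i] == "CAG" else 0
            longest_cag = max(longest_cag, run)
            i += 1
        tracts.append((start, i - start, longest_cag))
    return tracts


# Hypothetical coding sequence: nine glutamine codons with one CAA interruption
toy_cds = "ATG" + "CAG" * 3 + "CAA" + "CAG" * 5 + "TTT"
print(glutamine_tracts(toy_cds))
```

    The second predictor, tract-length variance in the normal population, would need genotype data from many individuals and is not covered here.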

  15. The Fusarium Graminearum Genome Reveals a Link Between Localized Polymorphism and Pathogen Specialization

    Energy Technology Data Exchange (ETDEWEB)

    Cuomo, Christina A.; Guldener, Ulrich; Xu, Jin Rong; Trail, Frances; Turgeon, Barbara G.; Di Pietro, Antonio; Walton, Johnathan D.; Ma, Li Jun; Baker, Scott E.; Rep, Martijn; Adam, Gerhard; Antoniw, John; Baldwin, Thomas; Calvo, Sarah; Chang, Yueh Long; DeCaprio, David; Gale, Liane R.; Gnerre, Sante; Goswami, Rubella S.; Hammond-Kossack, Kim; Harris, Linda J.; Hilburn, Karen; Kennell, John C.; Kroken, Scott; Magnuson, Jon K.; Mannhaupt, Gertrud; Mauceli, Evan; Mewes, Hans Werner; Mitterbauer, Rudolf; Muehlbauer, Gary; Munsterkotter, Martin; Nelson, David; O'Donnell, Kerry; Ouellet, Therese; Qi, Weihong; Quesneville, Hadi; Roncero, M. Isabel; Seong, Kye Yong; Tetko, Igor V.; Urban, Martin; Waalwijk, Cees; Ward, Todd J.; Yao, Jiqiang; Birren, Bruce W.; Kistler, H. Corby

    2007-09-07

    We sequenced and annotated the genome of the filamentous fungus Fusarium graminearum, a major pathogen of cultivated cereals. Very few repetitive sequences were detected, and the process of repeat-induced point mutation, in which duplicated sequences are subject to extensive mutation, may partially account for the reduced repeat content and apparent low number of paralogous (ancestrally duplicated) genes. A second strain of F. graminearum contained more than 10,000 single-nucleotide polymorphisms, which were frequently located near telomeres and within other discrete chromosomal segments. Many highly polymorphic regions contained sets of genes implicated in plant-fungus interactions and were unusually divergent, with higher rates of recombination. These regions of genome innovation may result from selection due to interactions of F. graminearum with its plant hosts.

  16. Genomic diversity among Danish field strains of Mycoplasma hyosynoviae assessed by amplified fragment length polymorphism analysis

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Friis, Niels F.; Nielsen, Elisabeth O.

    2002-01-01

    Genomic diversity among strains of Mycoplasma hyosynoviae isolated in Denmark was assessed by using amplified fragment length polymorphism (AFLP) analysis. Ninety-six strains, obtained from different specimens and geographical locations over 30 years, and the type strain of M. hyosynoviae S16(T) were concurrently examined for variance in BglII-MfeI and EcoRI-Csp6I-A AFLP markers. A total of 56 different genomic fingerprints having an overall similarity between 77 and 96% were detected. No correlation between AFLP variability and period of isolation or anatomical site of isolation could...

  17. Genomic polymorphism of Leishmania infantum: a relationship with clinical pleomorphism?

    Science.gov (United States)

    Guerbouj, S; Guizani, I; Speybroeck, N; Le Ray, D; Dujardin, J C

    2001-07-01

    Leishmania infantum is the etiological agent of the visceral (VL) and cutaneous (CL) forms of leishmaniasis around the Mediterranean Basin. In order to document the parasite genetic background corresponding to this clinical diversity, chromosome size polymorphism was analysed in 32 French isolates (18 CL and 14 VL) originating from the Cévennes and the Pyrénées Orientales (PO), and corresponding to zymodemes MON-1 and MON-29. Five chromosomes bearing tandemly repeated genes encoding important antigens (gp63, PSA-2 and K39) or key metabolic functions (mini-exon and rDNA) were studied. Significant size variation (100-270 kbp) was observed for chromosomes bearing mini-exon, PSA-2 and rDNA genes, which involved variation in copy number of the corresponding genes. The two other chromosomes showed smaller size variation, which did not involve dosage of gp63 and K39 genes. Chromosomal size showed correlation with geography and clinical origin: (i) chromosome 2 (mini-exon) was found to be significantly smaller in the PO; (ii) chromosomes 12 (PSA-2) and 27 (rDNA) were significantly smaller in the strictly cutaneous MON-29 isolates. Gene rearrangements and their synergistic effects on the phenotypic expression of the parasite are discussed.

  18. A sequence-based survey of the complex structural organization of tumor genomes

    Energy Technology Data Exchange (ETDEWEB)

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  19. Effects of As2O3 on DNA methylation, genomic instability, and LTR retrotransposon polymorphism in Zea mays.

    Science.gov (United States)

    Erturk, Filiz Aygun; Aydin, Murat; Sigmaz, Burcu; Taspinar, M Sinan; Arslan, Esra; Agar, Guleray; Yagci, Semra

    2015-12-01

    Arsenic is well known to be toxic to living organisms. However, limited efforts have been made to study its effects on DNA methylation, genomic instability, and long terminal repeat (LTR) retrotransposon polymorphism in different crops. In the present study, the effects of As2O3 (arsenic trioxide) on LTR retrotransposon polymorphism and DNA methylation, as well as DNA damage, in Zea mays seedlings were investigated. The results showed that all arsenic doses decreased genomic template stability (GTS) and increased changes in Random Amplified Polymorphic DNA (RAPD) profiles (DNA damage). In addition, increases in DNA methylation and LTR retrotransposon polymorphism were found, suggesting a model to explain the epigenetic changes in gene expression. The results of this experiment clearly show that arsenic has an epigenetic effect as well as a genotoxic effect. In particular, the increase in polymorphism of some LTR retrotransposons under arsenic stress may be part of the defense system against the stress.

  20. Redetermined structure of gossypol (P3 polymorph)

    Directory of Open Access Journals (Sweden)

    Muhabbat Honkeldieva

    2015-07-01

    Full Text Available An improved crystal structure of the title compound, C30H30O8 (systematic name: 1,1′,6,6′,7,7′-hexahydroxy-5,5′-diisopropyl-3,3′-dimethyl[2,2′-binaphthalene]-8,8′-dicarbaldehyde), was determined based on modern CCD data. Compared to the previous structure [Talipov et al. (1985). Khim. Prirod. Soedin. (Chem. Nat. Prod.), 6, 20–24], geometrical precision has been improved (typical C—C bond-distance s.u. = 0.002 Å in the present structure compared to 0.005 Å in the previous structure) and the locations of several H atoms have been corrected. The gossypol molecules are in the aldehyde tautomeric form and the dihedral angle between the naphthyl fragments is 80.42 (4)°. Four intramolecular O—H...O hydrogen bonds are formed. In the crystal, inversion dimers with graph-set motif R₂²(20) are formed by pairs of O—H...O hydrogen bonds; another pair of O—H...O hydrogen bonds with the same graph-set motif links the dimers into [001] chains. The packing of such chains in the crystal leads to the formation of channels (diameter = 5–8 Å) propagating in the [101] direction. The channels presumably contain highly disordered solvent molecules; their contribution to the scattering was removed with the SQUEEZE [Spek (2015). Acta Cryst. C71, 9–18] routine in PLATON and the stated molecular mass, density etc., do not take them into account.

  1. Sequence based polymorphic (SBP) marker technology for targeted genomic regions: its application in generating a molecular map of the Arabidopsis thaliana genome

    Directory of Open Access Journals (Sweden)

    Sahu Binod B

    2012-01-01

    Full Text Available Background: Molecular markers facilitate both genotype identification, essential for modern animal and plant breeding, and the isolation of genes based on their map positions. Advancements in sequencing technology have made possible the identification of single nucleotide polymorphisms (SNPs) for any genomic region. Here a sequence based polymorphic (SBP) marker technology for generating molecular markers for targeted genomic regions in Arabidopsis is described. Results: A ~3X genome coverage sequence of the Arabidopsis thaliana ecotype Niederzenz (Nd-0) was obtained by applying Illumina's sequencing by synthesis (Solexa) technology. Comparison of the Nd-0 genome sequence with the assembled Columbia-0 (Col-0) genome sequence identified putative single nucleotide polymorphisms (SNPs) throughout the entire genome. Multiple 75 base pair Nd-0 sequence reads containing SNPs and originating from individual genomic DNA molecules were the basis for developing co-dominant SBP markers. SNP-containing Col-0 sequences, supported by transcript sequences or sequences from multiple BAC clones, were compared to the respective Nd-0 sequences to identify possible restriction endonuclease site variations. Small amplicons, PCR amplified from both ecotypes, were digested with suitable restriction enzymes and resolved on a gel to reveal the sequence based polymorphisms. By applying this technology, 21 SBP markers for the marker-poor regions of the Arabidopsis map, representing polymorphisms between the Col-0 and Nd-0 ecotypes, were generated. Conclusions: The SBP marker technology described here allowed the development of molecular markers for targeted genomic regions of Arabidopsis. It should facilitate isolation of co-dominant molecular markers for targeted genomic regions of any animal or plant species whose genomic sequences have been assembled. This technology will particularly facilitate the development of high density molecular marker maps, essential for
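    The step of comparing SNP-containing sequences from two ecotypes to find restriction site variations can be sketched as follows. This is a hedged illustration in Python, not the authors' pipeline: the three enzymes are common examples with well-known palindromic recognition sites, while the function names and the toy amplicon sequences are hypothetical.

```python
# Recognition sequences of three common restriction enzymes (all palindromic,
# so scanning one strand is sufficient for these particular sites).
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}


def count_sites(seq, site):
    """Count occurrences of a recognition site in a sequence."""
    seq = seq.upper()
    return sum(seq.startswith(site, i) for i in range(len(seq) - len(site) + 1))


def differential_enzymes(allele_a, allele_b, sites=SITES):
    """Return {enzyme: (count_in_a, count_in_b)} for enzymes whose number of
    recognition sites differs between two amplicon alleles."""
    result = {}
    for name, site in sites.items():
        a, b = count_sites(allele_a, site), count_sites(allele_b, site)
        if a != b:
            result[name] = (a, b)
    return result


# Hypothetical amplicons differing by one SNP that abolishes an EcoRI site
col0 = "TTTGAATTCAAA"
nd0 = "TTTGAGTTCAAA"
print(differential_enzymes(col0, nd0))
```

    An enzyme reported here would cut the amplicon from one ecotype but not the other, giving the different band patterns on a gel that make such a marker co-dominant. Non-palindromic recognition sites would additionally require scanning the reverse complement.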

  2. Polymorphic integrations of an endogenous gammaretrovirus in the mule deer genome.

    Science.gov (United States)

    Elleder, Daniel; Kim, Oekyung; Padhi, Abinash; Bankert, Jason G; Simeonov, Ivan; Schuster, Stephan C; Wittekindt, Nicola E; Motameny, Susanne; Poss, Mary

    2012-03-01

    Endogenous retroviruses constitute a significant genomic fraction in all mammalian species. Typically they are evolutionarily old and fixed in the host species population. Here we report on a novel endogenous gammaretrovirus (CrERVγ; for cervid endogenous gammaretrovirus) in the mule deer (Odocoileus hemionus) that is insertionally polymorphic among individuals from the same geographical location, suggesting that it has a more recent evolutionary origin. Using PCR-based methods, we identified seven CrERVγ proviruses and demonstrated that they show various levels of insertional polymorphism in mule deer individuals. One CrERVγ provirus was detected in all mule deer sampled but was absent from white-tailed deer, indicating that this virus originally integrated after the split of the two species, which occurred approximately one million years ago. There are, on average, 100 CrERVγ copies in the mule deer genome based on quantitative PCR analysis. A CrERVγ provirus was sequenced and contained intact open reading frames (ORFs) for three virus genes. Transcripts were identified covering the entire provirus. CrERVγ forms a distinct branch of the gammaretrovirus phylogeny, with the closest relatives of CrERVγ being endogenous gammaretroviruses from sheep and pig. We demonstrated that white-tailed deer (Odocoileus virginianus) and elk (Cervus canadensis) DNA contain proviruses that are closely related to mule deer CrERVγ in a conserved region of pol; more distantly related sequences can be identified in the genome of another member of the Cervidae, the muntjac (Muntiacus muntjak). The discovery of a novel transcriptionally active and insertionally polymorphic retrovirus in mammals could provide a useful model system to study the dynamic interaction between the host genome and an invading retrovirus.

  3. Polymorphism of dislocation core structures at the atomic scale.

    Science.gov (United States)

    Wang, Zhongchang; Saito, Mitsuhiro; McKenna, Keith P; Ikuhara, Yuichi

    2014-01-01

    Dislocation defects together with their associated strain fields and segregated impurities are of considerable significance in many areas of materials science. However, their atomic-scale structures have remained extremely challenging to resolve, limiting our understanding of these ubiquitous defects. Here, by developing a complex modelling approach in combination with bicrystal experiments and systematic atomic-resolution imaging, we are now able to pinpoint individual dislocation cores at the atomic scale, leading to the discovery that even simple magnesium oxide can exhibit polymorphism of core structures for a given dislocation species. These polymorphic cores are associated with local variations in strain fields, segregation of defects, and electronic states, adding a new dimension to understanding the properties of dislocations in real materials. The findings advance our fundamental understanding of basic behaviours of dislocations and demonstrate that quantitative prediction and characterization of dislocations in real materials is possible.

  4. The complete chloroplast genome sequence of Podocarpus lambertii: genome structure, evolutionary aspects, gene content and SSR detection.

    Directory of Open Access Journals (Sweden)

    Leila do Nascimento Vieira

    Full Text Available BACKGROUND: Podocarpus lambertii (Podocarpaceae) is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp) genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. METHODOLOGY/PRINCIPAL FINDINGS: The P. lambertii cp genome is 133,734 bp in length and, similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR). It contains 118 unique genes and one duplicated tRNA (trnN-GUU), which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi) and Araucariaceae (Agathis dammara). Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. CONCLUSION: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparent loss of the rps16 gene in Podocarpaceae cp genomes. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of

  5. E Unibus Plurum: genomic analysis of an experimentally evolved polymorphism in Escherichia coli.

    Directory of Open Access Journals (Sweden)

    Margie A Kinnersley

    2009-11-01

    Full Text Available Microbial populations founded by a single clone and propagated under resource limitation can become polymorphic. We sought to elucidate genetic mechanisms whereby a polymorphism evolved in Escherichia coli under glucose limitation and persisted because of cross-feeding among multiple adaptive clones. Apart from a 29 kb deletion in the dominant clone, no large-scale genomic changes distinguished evolved clones from their common ancestor. Using transcriptional profiling on co-evolved clones cultured separately under glucose limitation, we identified 180 genes significantly altered in expression relative to the common ancestor grown under similar conditions. Ninety of these were similarly expressed in all clones, and many of the genes affected (e.g., mglBAC, mglD, and lamB) are in operons coordinately regulated by CRP and/or rpoS. While the remaining significant expression differences were clone-specific, 93% were exhibited by the majority clone, many of which are controlled by the global regulators CRP and CpxR. When transcriptional profiling was performed on adaptive clones cultured together, many expression differences that distinguished the majority clone cultured in isolation were absent, suggesting that CpxR may be activated by overflow metabolites removed by cross-feeding strains in co-culture. Relative to their common ancestor, shared expression differences among adaptive clones were partly attributable to early-arising shared mutations in the trans-acting global regulator, rpoS, and the cis-acting regulator, mglO. Gene expression differences that distinguished clones may in part be explained by mutations in trans-acting regulators malT and glpK, and in cis-acting sequences of acs. In the founder, a cis-regulatory mutation in acs (acetyl CoA synthetase) and a structural mutation in glpR (glycerol-3-phosphate repressor) likely favored evolution of specialists that thrive on overflow metabolites. Later-arising mutations that led to specialization

  6. Structures and energetics of Ga2O3 polymorphs

    International Nuclear Information System (INIS)

    Yoshioka, S; Hayashi, H; Kuwabara, A; Oba, F; Matsunaga, K; Tanaka, I

    2007-01-01

    First-principles calculations are made for five Ga2O3 polymorphs. The structure of ε-Ga2O3 with the space group Pna2₁ (No. 33, orthorhombic), which is sometimes called κ-Ga2O3 in the literature, is consistent with experimental reports. The structure of γ-Ga2O3 is optimized within 14 inequivalent configurations of defective spinel structures. Phonon dispersion curves of four polymorphs are obtained. The volume expansivity, bulk modulus, and specific heat at constant volume are computed as a function of temperature within the quasi-harmonic approximation. The Helmholtz free energies of the polymorphs are thus compared. The expansivity shows a relationship of β<ε<α<δ, while β<ε<δ<α for the bulk modulus. The formation free energies have the tendency β<ε<α<δ<γ at low temperatures. With the increase of temperature, the difference in free energy between the β phase and the ε phase becomes smaller. Eventually the ε phase becomes more stable above 1600 K.

  7. Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    Full Text Available The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  8. Prediction of maize phenotype based on whole-genome single nucleotide polymorphisms using deep belief networks

    Science.gov (United States)

    Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.

    2017-05-01

    Selection in plant breeding could be more effective and more efficient if it were based on genomic data. Genomic selection (GS) is a new approach for plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most GP models use linear methods that ignore interactions among genes and higher-order nonlinear effects. The deep belief network (DBN), one of the architectures used in deep learning, can model data at a high level of abstraction that captures nonlinear effects in the data. This study implemented a DBN to develop a GP model that uses whole-genome single nucleotide polymorphisms (SNPs) as training and testing data. The case study was a set of traits in maize. The maize dataset was acquired from the Global Maize program of CIMMYT (International Maize and Wheat Improvement Center). Based on Pearson correlation, the DBN outperforms the other methods tested, namely reproducing kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL), and best linear unbiased predictor (BLUP), for traits presumed to be non-additive. The DBN achieves a correlation of 0.579 (on a scale from -1 to 1).
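
    The record above scores genomic prediction models by the Pearson correlation between observed and predicted trait values. A minimal sketch of that metric follows; the function name and the sample values are illustrative assumptions and are not taken from the study.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted trait values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Covariance and variance sums (the 1/n factors cancel in the ratio).
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical phenotypes and model predictions, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r = pearson_r(observed, predicted)
```

    A value near 1 means the predictions rank and scale the lines much as the field data do; a value near 0 means the model carries little predictive signal for that trait.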

  9. Electronic structure and thermodynamics of V2O3 polymorphs.

    Science.gov (United States)

    Wessel, C; Reimann, C; Müller, A; Weber, D; Lerch, M; Ressler, T; Bredow, T; Dronskowski, R

    2012-10-05

    A metastable bixbyite-type polymorph of vanadium sesquioxide, V2O3, has recently been synthesized, and it transforms to the corundum-type phase at temperatures around 550 °C. The possibility of a paramagnetic to canted antiferromagnetic or even spin-glass-like transition has been discussed. Quantum-chemical calculations at the density-functional theory level, including explicit electronic correlation, confirm the metastability as well as the semiconducting behavior of the material and predict that the bixbyite-type structure is about 0.1 eV less stable than the well-known corundum-type phase. Nonetheless, quasiharmonic phonon calculations show that bixbyite-type vanadium sesquioxide is a dynamically stable compound. Other possible V2O3 polymorphs are shown to be even less suitable candidates for the composition V2O3. Copyright © 2012 Wiley Periodicals, Inc.

  10. Detailed analysis of inversions predicted between two human genomes: errors, real polymorphisms, and their origin and population distribution.

    Science.gov (United States)

    Vicente-Salvador, David; Puig, Marta; Gayà-Vidal, Magdalena; Pacheco, Sarai; Giner-Delgado, Carla; Noguera, Isaac; Izquierdo, David; Martínez-Fundichely, Alexander; Ruiz-Herrera, Aurora; Estivill, Xavier; Aguado, Cristina; Lucas-Lledó, José Ignacio; Cáceres, Mario

    2017-02-01

    The growing catalogue of structural variants in humans often overlooks inversions as one of the most difficult types of variation to study, even though they affect phenotypic traits in diverse organisms. Here, we have analysed in detail 90 inversions predicted from the comparison of two independently assembled human genomes: the reference genome (NCBI36/HG18) and HuRef. Surprisingly, we found that two thirds of these predictions (62) represent errors either in assembly comparison or in one of the assemblies, including 27 misassembled regions in HG18. Next, we validated 22 of the remaining 28 potential polymorphic inversions using different PCR techniques and characterized their breakpoints and ancestral state. In addition, we determined experimentally the derived allele frequency in Europeans for 17 inversions (DAF = 0.01-0.80), as well as the distribution in 14 worldwide populations for 12 of them based on the 1000 Genomes Project data. Among the validated inversions, nine have inverted repeats (IRs) at their breakpoints, and two show nucleotide variation patterns consistent with a recurrent origin. Conversely, inversions without IRs have a unique origin and almost all of them show deletions or insertions at the breakpoints in the derived allele mediated by microhomology sequences, which highlights the importance of mechanisms like FoSTeS/MMBIR in the generation of complex rearrangements in the human genome. Finally, we found several inversions located within genes and at least one candidate to be positively selected in Africa. Thus, our study emphasizes the importance of careful analysis and validation of large-scale genomic predictions to extract reliable biological conclusions. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
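
    The derived allele frequency (DAF) quoted in the record above can be computed from genotype counts once the ancestral state of each inversion is known. The sketch below assumes diploid genotypes and uses hypothetical counts; it does not reproduce the authors' genotyping pipeline.

```python
def derived_allele_frequency(n_anc_hom, n_het, n_der_hom):
    """DAF from diploid genotype counts.

    n_anc_hom: individuals homozygous for the ancestral orientation
    n_het:     heterozygous individuals
    n_der_hom: individuals homozygous for the derived orientation
    """
    n_individuals = n_anc_hom + n_het + n_der_hom
    if n_individuals == 0:
        raise ValueError("no genotyped individuals")
    # Each individual carries two chromosomes; heterozygotes carry one derived copy.
    derived_copies = 2 * n_der_hom + n_het
    return derived_copies / (2 * n_individuals)


# Hypothetical genotype counts for one inversion in 100 individuals.
daf = derived_allele_frequency(n_anc_hom=81, n_het=18, n_der_hom=1)
```

    With these made-up counts, 20 of the 200 chromosomes carry the derived orientation, which gives a DAF of 0.1.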

  11. High-resolution genomic fingerprinting of Campylobacter jejuni and Campylobacter coli by analysis of amplified fragment length polymorphisms

    DEFF Research Database (Denmark)

    Kokotovic, Branko; On, Stephen L.W.

    1999-01-01

    A method for high-resolution genomic fingerprinting of the enteric pathogens Campylobacter jejuni and Campylobacter coli, based on the determination of amplified fragment length polymorphism, is described. The potential of this method for molecular epidemiological studies of these species...... to available epidemiological data. We conclude that this amplified fragment length polymorphism fingerprinting method may be a highly effective tool for molecular epidemiological studies of Campylobacter spp....

  12. Using Genomics for Natural Product Structure Elucidation.

    Science.gov (United States)

    Tietz, Jonathan I; Mitchell, Douglas A

    2016-01-01

    Natural products (NPs) are the most historically bountiful source of chemical matter for drug development, especially for anti-infectives. With insights gleaned from genome mining, interest in natural product discovery has been reinvigorated. An essential stage in NP discovery is structural elucidation, which sheds light not only on the chemical composition of a molecule but also its novelty, properties, and derivatization potential. The history of structure elucidation is replete with technique-based revolutions: combustion analysis, crystallography, UV, IR, MS, and NMR have each provided game-changing advances; the latest such advance is genomics. All natural products have a genetic basis, and the ability to obtain and interpret genomic information for structure elucidation is increasingly available at low cost to non-specialists. In this review, we describe the value of genomics as a structural elucidation technique, especially from the perspective of the natural product chemist approaching an unknown metabolite. Herein we first introduce the databases and programs of interest to the natural products chemist, with an emphasis on those currently most suited for general usability. We describe strategies for linking observed natural product-linked phenotypes to their corresponding gene clusters. We then discuss techniques for extracting structural information from genes, illustrated with numerous case examples. We also provide an analysis of the biases and limitations of the field with recommendations for future development. Our overview is not only aimed at biologically-oriented researchers already at ease with bioinformatic techniques, but also, in particular, at natural product, organic, and/or medicinal chemists not previously familiar with genomic techniques.

  13. A data management system for structural genomics

    Directory of Open Access Journals (Sweden)

    O'Toole Nicholas

    2004-06-01

    Full Text Available Abstract Background: Structural genomics (SG) projects aim to determine thousands of protein structures by the development of high-throughput techniques for all steps of the experimental structure determination pipeline. Crucial to the success of such endeavours is the careful tracking and archiving of experimental and external data on protein targets. Results: We have developed a sophisticated data management system for structural genomics. Central to the system is an Oracle-based, SQL-interfaced database. The database schema deals with all facets of the structure determination process, from target selection to data deposition. Users access the database via any web browser. Experimental data is input by users with pre-defined web forms. Data can be displayed according to numerous criteria. A list of all current target proteins can be viewed, with links for each target to associated entries in external databases. To avoid unnecessary work on targets, our data management system matches protein sequences weekly using BLAST to entries in the Protein Data Bank and to targets of other SG centers worldwide. Conclusion: Our system is a working, effective and user-friendly data management tool for structural genomics projects. In this report we present a detailed summary of the various capabilities of the system, using real target data as examples, and indicate our plans for future enhancements.

  14. Interrogating the druggable genome with structural informatics.

    Science.gov (United States)

    Hambly, Kevin; Danzer, Joseph; Muskal, Steven; Debe, Derek A

    2006-08-01

    Structural genomics projects are producing protein structure data at an unprecedented rate. In this paper, we present the Target Informatics Platform (TIP), a novel structural informatics approach for amplifying the rapidly expanding body of experimental protein structure information to enhance the discovery and optimization of small molecule protein modulators on a genomic scale. In TIP, existing experimental structure information is augmented using a homology modeling approach, and binding sites across multiple target families are compared using a clique detection algorithm. We report here a detailed analysis of the structural coverage for the set of druggable human targets, highlighting drug target families where the level of structural knowledge is currently quite high, as well as those areas where structural knowledge is sparse. Furthermore, we demonstrate the utility of TIP's intra- and inter-family binding site similarity analysis using a series of retrospective case studies. Our analysis underscores the utility of a structural informatics infrastructure for extracting drug discovery-relevant information from structural data, aiding researchers in the identification of lead discovery and optimization opportunities as well as potential "off-target" liabilities.

  15. Rediscovery by Whole Genome Sequencing: Classical Mutations and Genome Polymorphisms in Neurospora crassa

    Energy Technology Data Exchange (ETDEWEB)

    McCluskey, Kevin; Wiest, Aric E.; Grigoriev, Igor V.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Baker, Scott E.

    2011-06-02

    Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. From thirteen to 137 nonsense mutations are present in each strain and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.

  16. Genome-wide development and deployment of informative intron-spanning and intron-length polymorphism markers for genomics-assisted breeding applications in chickpea.

    Science.gov (United States)

    Srivastava, Rishi; Bajaj, Deepak; Sayal, Yogesh K; Meher, Prabina K; Upadhyaya, Hari D; Kumar, Rajendra; Tripathi, Shailesh; Bharadwaj, Chellapilla; Rao, Atmakuri R; Parida, Swarup K

    2016-11-01

    The discovery and large-scale genotyping of informative gene-based markers is essential for rapid delineation of genes/QTLs governing stress tolerance and yield component traits in order to drive genetic enhancement in chickpea. Genome-wide, 119169 and 110491 ISM (intron-spanning markers) from 23129 desi and 20386 kabuli protein-coding genes, respectively, and 7454 in silico InDel (insertion-deletion) (1-45-bp)-based ILP (intron-length polymorphism) markers from 3283 genes were developed and were structurally and functionally annotated on eight chromosomes and unanchored scaffolds of chickpea. These markers showed a much higher amplification efficiency (83%) and intra-specific polymorphic potential (86%) than other sequence-based genetic markers among desi and kabuli chickpea accessions, even with a cost-effective agarose gel-based assay. The genome-wide physically mapped 1718 ILP markers assayed a wider level of functional genetic diversity (19-81%) and well-defined phylogenetics among domesticated chickpea accessions. The gene-derived 1424 ILP markers were anchored on a high-density (inter-marker distance: 0.65 cM) desi intra-specific genetic linkage map/functional transcript map (ICC 4958×ICC 2263) of chickpea. This reference genetic map identified six major genomic regions harbouring six robust QTLs mapped on five chromosomes, which explained 11-23% seed weight trait variation (7.6-10.5 LOD) in chickpea. The integration of high-resolution QTL mapping with differential expression profiling detected six genes, including one potential serine carboxypeptidase gene, with ILP markers (linked tightly to the major seed weight QTLs) exhibiting seed-specific expression as well as pronounced up-regulation, especially in seeds of the high (ICC 4958) as compared to the low (ICC 2263) seed weight mapping parental accession. The marker information generated in the present study was made publicly accessible through a user-friendly web-resource, "Chickpea ISM-ILP Marker Database

  17. Exploiting the Repetitive Fraction of the Wheat Genome for High-Throughput Single-Nucleotide Polymorphism Discovery and Genotyping

    Directory of Open Access Journals (Sweden)

    Nelly Cubizolles

    2016-03-01

    Full Text Available Transposable elements (TEs) account for more than 80% of the wheat genome. Although they represent a major obstacle for genomic studies, TEs are also a source of polymorphism and consequently of molecular markers such as insertion site-based polymorphism (ISBP) markers. Insertion site-based polymorphisms have been found to be a great source of genome-specific single-nucleotide polymorphisms (SNPs) in the hexaploid wheat (Triticum aestivum L.) genome. Here, we report on the development of a high-throughput SNP discovery approach based on sequence capture of ISBP markers. By applying this approach to the reference sequence of chromosome 3B from hexaploid wheat, we designed 39,077 SNPs that are evenly distributed along the chromosome. We demonstrate that these SNPs can be efficiently scored with the KASPar (Kompetitive allele-specific polymerase chain reaction) genotyping technology. Finally, through genetic diversity and genome-wide association studies, we also demonstrate that ISBP-derived SNPs can be used in marker-assisted breeding programs.

  18. High-throughput Crystallography for Structural Genomics

    Science.gov (United States)

    Joachimiak, Andrzej

    2009-01-01

    Protein X-ray crystallography recently celebrated its 50th anniversary. The structures of myoglobin and hemoglobin determined by Kendrew and Perutz provided the first glimpses into the complex protein architecture and chemistry. Since then, the field of structural molecular biology has experienced extraordinary progress and now over 53,000 protein structures have been deposited into the Protein Data Bank. In the past decade many advances in macromolecular crystallography have been driven by world-wide structural genomics efforts. This was made possible by third-generation synchrotron sources, structure phasing approaches using anomalous signal, and cryo-crystallography. Complementary progress in molecular biology, proteomics, hardware and software for crystallographic data collection, structure determination and refinement, computer science, databases, robotics and automation improved and accelerated many processes. These advancements provide a robust foundation for structural molecular biology and ensure a strong contribution to science in the future. In this report we focus mainly on reviewing structural genomics high-throughput X-ray crystallography technologies and their impact. PMID:19765976

  19. Genomic comparison of invasive and rare non-invasive strains reveals Porphyromonas gingivalis genetic polymorphisms

    Directory of Open Access Journals (Sweden)

    Svetlana Dolgilevich

    2011-03-01

    Full Text Available Porphyromonas gingivalis strains are shown to invade human cells in vitro with different invasion efficiencies, varying by up to three orders of magnitude. We tested the hypothesis that invasion-associated interstrain genomic polymorphisms are present in P. gingivalis and that putative invasion-associated genes can contribute to P. gingivalis invasion. Using an invasive (W83) and the only available non-invasive P. gingivalis strain (AJW4) and whole genome microarrays followed by two separate software tools, we carried out comparative genomic hybridization (CGH) analysis. We identified 68 annotated and 51 hypothetical open reading frames (ORFs) that are polymorphic between these strains. Among these are surface proteins, lipoproteins, capsular polysaccharide biosynthesis enzymes, regulatory and immunoreactive proteins, integrases, and transposases, often with abnormal GC content and clustered on the chromosome. Amplification of selected ORFs was used to validate the approach and the selection. Eleven clinical strains were investigated for the presence of selected ORFs. The putative invasion-associated ORFs were present in 10 of the isolates. The invasion ability of three isogenic mutants, carrying deletions in PG0185, PG0186, and PG0982, was tested. The PG0185 (ragA) and PG0186 (ragB) mutants had 5.1×10^3-fold and 3.6×10^3-fold decreased in vitro invasion ability, respectively. The annotation of divergent ORFs suggests deficiency in multiple genes as a basis for the P. gingivalis non-invasive phenotype.

  20. Comparative genomics of Bacillus anthracis from the wool industry highlights polymorphisms of lineage A.Br.Vollum.

    Science.gov (United States)

    Derzelle, Sylviane; Aguilar-Bultet, Lisandra; Frey, Joachim

    2016-12-01

    With the advent of affordable next-generation sequencing (NGS) technologies, major progress has been made in the understanding of the population structure and evolution of the B. anthracis species. Here we report the use of whole genome sequencing and computer-based comparative analyses to characterize six strains belonging to the A.Br.Vollum lineage. These strains were isolated in Switzerland, in 1981, during iterative cases of anthrax involving workers in a textile plant processing cashmere wool from the Indian subcontinent. We took advantage of the hundreds of currently available B. anthracis genomes in public databases to investigate the genetic diversity existing within the A.Br.Vollum lineage and to position the six Swiss isolates in the worldwide B. anthracis phylogeny. Thirty additional genomes related to the A.Br.Vollum group were identified by whole-genome single nucleotide polymorphism (SNP) analysis, including two strains forming a new evolutionary branch at the base of the A.Br.Vollum lineage. This new phylogenetic lineage (termed A.Br.H9401) splits off the branch leading to the A.Br.Vollum group soon after its divergence from the other lineages of the major A clade (i.e. 6 SNPs). The available dataset of A.Br.Vollum genomes was resolved into 2 distinct groups. Isolates from the Swiss wool processing facility clustered together with two strains from Pakistan and one strain of unknown origin isolated from yarn. They were clearly differentiated (69 SNPs) from the twenty-five other A.Br.Vollum strains located on the branch leading to the terminal reference strain A0488 of the lineage. Novel analytic assays specific to these new subgroups were developed for the purpose of rapid molecular epidemiology. Whole genome SNP surveys greatly expand our knowledge of the sub-structure of the A.Br.Vollum lineage. Possible origin and route of spread of this lineage worldwide are discussed. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights

  1. Gene Composer in a structural genomics environment

    International Nuclear Information System (INIS)

    Lorimer, Don; Raymond, Amy; Mixon, Mark; Burgin, Alex; Staker, Bart; Stewart, Lance

    2011-01-01

    For structural biology applications, protein-construct engineering is guided by comparative sequence analysis and structural information, which allow the researcher to better define domain boundaries for terminal deletions and nonconserved regions for surface mutants. A database software application called Gene Composer has been developed to facilitate construct design. The structural genomics effort at the Seattle Structural Genomics Center for Infectious Disease (SSGCID) requires the manipulation of large numbers of amino-acid sequences and the underlying DNA sequences which are to be cloned into expression vectors. To improve efficiency in high-throughput protein structure determination, a database software package, Gene Composer, has been developed which facilitates the information-rich design of protein constructs and their underlying gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bioinformatics steps used in modern structure-guided protein engineering and synthetic gene engineering. An example of the structure determination of H1N1 RNA-dependent RNA polymerase PB2 subunit is given

  2. Genome-based polymorphic microsatellite development and validation in the mosquito Aedes aegypti and application to population genetics in Haiti

    Directory of Open Access Journals (Sweden)

    Streit Thomas G

    2009-12-01

    Full Text Available Abstract Background: Microsatellite markers have proven useful in genetic studies in many organisms, yet microsatellite-based studies of the dengue and yellow fever vector mosquito Aedes aegypti have been limited by the number of assayable and polymorphic loci available, despite multiple independent efforts to identify them. Here we present strategies for efficient identification and development of useful microsatellites with broad coverage across the Aedes aegypti genome, development of multiplex-ready PCR groups of microsatellite loci, and validation of their utility for population analysis with field collections from Haiti. Results: From 79 putative microsatellite loci representing 31 motifs identified in 42 whole genome sequence supercontig assemblies in the Aedes aegypti genome, 33 microsatellites providing genome-wide coverage amplified as single copy sequences in four lab strains, with a range of 2-6 alleles per locus. The tri-nucleotide motifs represented the majority (51%) of the polymorphic single copy loci, and none of these was located within a putative open reading frame. Seven groups of 4-5 microsatellite loci each were developed for multiplex-ready PCR. Four multiplex-ready groups were used to investigate population genetics of Aedes aegypti populations sampled in Haiti. Of the 23 loci represented in these groups, 20 were polymorphic with a range of 3-24 alleles per locus (mean = 8.75). Allelic polymorphic information content varied from 0.171 to 0.867 (mean = 0.545). Most loci met Hardy-Weinberg expectations across populations, and pairwise FST comparisons identified significant genetic differentiation between some populations. No evidence for genetic isolation by distance was observed. Conclusion: Despite limited success in previous reports, we demonstrate that the Aedes aegypti genome is well-populated with single copy, polymorphic microsatellite loci that can be uncovered using the strategy developed here for rapid and efficient
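
    The record above reports polymorphic information content (PIC) per microsatellite locus. The abstract does not say which estimator was used; the sketch below assumes the commonly used formula of Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², and the allele counts are hypothetical.

```python
from itertools import combinations


def allele_frequencies(allele_counts):
    """Convert a mapping of allele -> observed count into allele -> frequency."""
    total = sum(allele_counts.values())
    if total <= 0:
        raise ValueError("no alleles observed")
    return {allele: count / total for allele, count in allele_counts.items()}


def polymorphic_information_content(frequencies):
    """PIC of one locus from its allele frequencies (Botstein-style formula)."""
    p = list(frequencies)
    homozygosity = sum(x * x for x in p)
    # Correction term summed over every unordered pair of distinct alleles.
    pair_term = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1.0 - homozygosity - pair_term


# Hypothetical allele counts (allele size in bp -> number of chromosomes).
counts = {"152": 10, "155": 10, "158": 10, "161": 10}
pic = polymorphic_information_content(allele_frequencies(counts).values())
```

    PIC is 0 for a monomorphic locus and rises as alleles become more numerous and more evenly distributed, which is why loci with many alleles sit toward the upper end of the reported range.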

  3. Polymorphic structures of Alzheimer's β-amyloid globulomers.

    Directory of Open Access Journals (Sweden)

    Xiang Yu

    Full Text Available BACKGROUND: Misfolding and self-assembly of Amyloid-β (Aβ) peptides into amyloid fibrils are pathologically linked to the development of Alzheimer's disease. Polymorphic structures, from monomers to intermediate oligomers, protofilaments, and mature fibrils, have often been observed in solution. Some aggregates are on-pathway species to amyloid fibrils, while others are off-pathway species that do not evolve into amyloid fibrils. Both on-pathway and off-pathway species could be biologically relevant. However, the lack of atomic-level structural information for these Aβ species makes it difficult to understand their biological roles in amyloid toxicity and amyloid formation. METHODS AND FINDINGS: Here, we model a series of molecular structures of Aβ globulomers assembled from monomer and dimer building blocks using our peptide-packing program and explicit-solvent molecular dynamics (MD) simulations. Structural and energetic analysis shows that although Aβ globulomers could adopt different energetically favorable but structurally heterogeneous conformations in a rugged energy landscape, they are still preferentially organized by dynamic dimeric subunits with a hydrophobic core formed by the C-terminal residues, independent of the initial peptide packing and organization. Such structural organizations offer high structural stability by maximizing peptide-peptide association and optimizing peptide-water solvation. Moreover, curved surface, compact size, and less populated β-structure in Aβ globulomers make them difficult to convert into other high-order Aβ aggregates and fibrils with dominant β-structure, suggesting that they are likely to be off-pathway species to amyloid fibrils. These Aβ globulomers are compatible with experimental data in overall size, subunit organization, and molecular weight from AFM images and H/D amide exchange NMR. CONCLUSIONS: Our computationally modeled Aβ globulomers provide useful

  4. Genome-wide generation and use of informative intron-spanning and intron-length polymorphism markers for high-throughput genetic analysis in rice

    Science.gov (United States)

    Badoni, Saurabh; Das, Sweta; Sayal, Yogesh K.; Gopalakrishnan, S.; Singh, Ashok K.; Rao, Atmakuri R.; Agarwal, Pinky; Parida, Swarup K.; Tyagi, Akhilesh K.

    2016-01-01

    We developed genome-wide 84634 ISM (intron-spanning marker) and 16510 InDel-fragment length polymorphism-based ILP (intron-length polymorphism) markers from genes physically mapped on 12 rice chromosomes. These genic markers revealed much higher amplification-efficiency (80%) and polymorphic-potential (66%) among rice accessions even by a cost-effective agarose gel-based assay. A wider level of functional molecular diversity (17–79%) and well-defined precise admixed genetic structure was assayed by 3052 genome-wide markers in a structured population of indica, japonica, aromatic and wild rice. Six major grain weight QTLs (11.9–21.6% phenotypic variation explained) were mapped on five rice chromosomes of a high-density (inter-marker distance: 0.98 cM) genetic linkage map (IR 64 x Sonasal) anchored with 2785 known/candidate gene-derived ISM and ILP markers. The designing of multiple ISM and ILP markers (2 to 4 markers/gene) in an individual gene will broaden the user-preference to select suitable primer combination for efficient assaying of functional allelic variation/diversity and realistic estimation of differential gene expression profiles among rice accessions. The genomic information generated in our study is made publicly accessible through a user-friendly web-resource, “Oryza ISM-ILP marker” database. The known/candidate gene-derived ISM and ILP markers can be enormously deployed to identify functionally relevant trait-associated molecular tags by optimal-resource expenses, leading towards genomics-assisted crop improvement in rice. PMID:27032371

  5. A barcode of organellar genome polymorphisms identifies the geographic origin of Plasmodium falciparum strains

    KAUST Repository

    Preston, Mark D.

    2014-06-13

    Malaria is a major public health problem that is actively being addressed in a global eradication campaign. Increased population mobility through international air travel has elevated the risk of re-introducing parasites to elimination areas and dispersing drug-resistant parasites to new regions. A simple genetic marker that quickly and accurately identifies the geographic origin of infections would be a valuable public health tool for locating the source of imported outbreaks. Here we analyse the mitochondrion and apicoplast genomes of 711 Plasmodium falciparum isolates from 14 countries, and find evidence that they are non-recombining and co-inherited. The high degree of linkage produces a panel of relatively few single-nucleotide polymorphisms (SNPs) that is geographically informative. We design a 23-SNP barcode that is highly predictive (~92%) and easily adapted to aid case management in the field and survey parasite migration worldwide. © 2014 Macmillan Publishers Limited. All rights reserved.

  6. Overlapping genomic sequences: a treasure trove of single-nucleotide polymorphisms.

    Science.gov (United States)

    Taillon-Miller, P; Gu, Z; Li, Q; Hillier, L; Kwok, P Y

    1998-07-01

    An efficient strategy to develop a dense set of single-nucleotide polymorphism (SNP) markers is to take advantage of the human genome sequencing effort currently under way. Our approach is based on the fact that bacterial artificial chromosomes (BACs) and P1-based artificial chromosomes (PACs) used in long-range sequencing projects come from diploid libraries. If the overlapping clones sequenced are from different lineages, one is comparing the sequences from 2 homologous chromosomes in the overlapping region. We have analyzed in detail every SNP identified while sequencing three sets of overlapping clones found on chromosome 5p15.2, 7q21-7q22, and 13q12-13q13. In the 200.6 kb of DNA sequence analyzed in these overlaps, 153 SNPs were identified. Computer analysis for repetitive elements and suitability for STS development yielded 44 STSs containing 68 SNPs for further study. All 68 SNPs were confirmed to be present in at least one of the three (Caucasian, African-American, Hispanic) populations studied. Furthermore, 42 of the SNPs tested (62%) were informative in at least one population, 32 (47%) were informative in two or more populations, and 23 (34%) were informative in all three populations. These results clearly indicate that developing SNP markers from overlapping genomic sequence is highly efficient and cost effective, requiring only the two simple steps of developing STSs around the known SNPs and characterizing them in the appropriate populations.
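
    The counts quoted in this abstract can be checked with a few lines of arithmetic. The Python sketch below is an illustration added for the reader, not part of the original record; the variable names are chosen here for convenience only.

```python
# Figures quoted in the abstract above.
overlap_kb = 200.6   # kb of overlapping sequence analyzed
snps_found = 153     # SNPs identified in the overlaps
snps_tested = 68     # SNPs carried forward in the 44 STSs

# Average spacing between SNPs in the overlapping sequence.
kb_per_snp = overlap_kb / snps_found

# SNPs informative in at least one, in two or more, and in all three populations.
informative = {"at least one": 42, "two or more": 32, "all three": 23}
percent = {label: round(100 * n / snps_tested) for label, n in informative.items()}

print(f"one SNP per {kb_per_snp:.2f} kb")
for label, pct in percent.items():
    print(f"{label}: {pct}%")
```

    The rounded percentages agree with those given in the abstract (62%, 47% and 34%), and the density works out to roughly one SNP per 1.3 kb of overlap.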

  7. Rapid high resolution single nucleotide polymorphism-comparative genome hybridization mapping in Caenorhabditis elegans.

    Science.gov (United States)

    Flibotte, Stephane; Edgley, Mark L; Maydan, Jason; Taylor, Jon; Zapf, Rick; Waterston, Robert; Moerman, Donald G

    2009-01-01

    We have developed a significantly improved and simplified method for high-resolution mapping of phenotypic traits in Caenorhabditis elegans using a combination of single nucleotide polymorphisms (SNPs) and oligo array comparative genome hybridization (array CGH). We designed a custom oligonucleotide array using a subset of confirmed SNPs between the canonical wild-type Bristol strain N2 and the Hawaiian isolate CB4856, populated with densely overlapping 50-mer probes corresponding to both N2 and CB4856 SNP sequences. Using this method a mutation can be mapped to a resolution of approximately 200 kb in a single genetic cross. Six mutations representing each of the C. elegans chromosomes were detected unambiguously and at high resolution using genomic DNA from populations derived from as few as 100 homozygous mutant segregants of mutant N2/CB4856 heterozygotes. Our method completely dispenses with the PCR, restriction digest, and gel analysis of standard SNP mapping and should be easy to extend to any organism with interbreeding strains. This method will be particularly powerful when applied to difficult or hard-to-map low-penetrance phenotypes. It should also be possible to map polygenic traits using this method.

  8. DELISHUS: an efficient and exact algorithm for genome-wide detection of deletion polymorphism in autism

    Science.gov (United States)

    Aguiar, Derek; Halldórsson, Bjarni V.; Morrow, Eric M.; Istrail, Sorin

    2012-01-01

    Motivation: The understanding of the genetic determinants of complex disease is undergoing a paradigm shift. Genetic heterogeneity of rare mutations with deleterious effects is more commonly being viewed as a major component of disease. Autism is an excellent example where research is active in identifying matches between the phenotypic and genomic heterogeneities. A considerable portion of autism appears to be correlated with copy number variation, which is not directly probed by single nucleotide polymorphism (SNP) array or sequencing technologies. Identifying the genetic heterogeneity of small deletions remains a major unresolved computational problem partly due to the inability of algorithms to detect them. Results: In this article, we present an algorithmic framework, which we term DELISHUS, that implements three exact algorithms for inferring regions of hemizygosity containing genomic deletions of all sizes and frequencies in SNP genotype data. We implement an efficient backtracking algorithm—that processes a 1 billion entry genome-wide association study SNP matrix in a few minutes—to compute all inherited deletions in a dataset. We further extend our model to give an efficient algorithm for detecting de novo deletions. Finally, given a set of called deletions, we also give a polynomial time algorithm for computing the critical regions of recurrent deletions. DELISHUS achieves significantly lower false-positive rates and higher power than previously published algorithms partly because it considers all individuals in the sample simultaneously. DELISHUS may be applied to SNP array or sequencing data to identify the deletion spectrum for family-based association studies. Availability: DELISHUS is available at http://www.brown.edu/Research/Istrail_Lab/. Contact: Eric_Morrow@brown.edu and Sorin_Istrail@brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22689755
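
    One kind of evidence that SNP genotype data can carry about an inherited deletion is a Mendelian inconsistency: if a parent is called homozygous for one allele and the child homozygous for the other, one explanation is that both carry a deletion on one chromosome (hemizygosity) and the genotype caller reported the single remaining allele as a homozygote. The Python sketch below only illustrates that idea for one parent-child pair. It is a toy under stated assumptions, not the DELISHUS algorithm; the function names and the genotype encoding ("AA", "AB", "BB") are hypothetical.

```python
from typing import List, Tuple

def deletion_evidence_sites(parent: List[str], child: List[str]) -> List[int]:
    """Indices of SNPs where parent and child are opposite homozygotes."""
    if len(parent) != len(child):
        raise ValueError("genotype vectors must have the same length")
    opposite = {("AA", "BB"), ("BB", "AA")}
    return [i for i, pair in enumerate(zip(parent, child)) if pair in opposite]

def candidate_intervals(parent: List[str], child: List[str]) -> List[Tuple[int, int]]:
    """Extend each flagged site until a heterozygous call in either individual.

    A heterozygous call shows two alleles are present, which rules out
    hemizygosity at that SNP, so each interval is an upper bound on the
    extent of a shared deletion (assuming error-free genotype calls).
    """
    n = len(parent)
    het = [parent[i] == "AB" or child[i] == "AB" for i in range(n)]
    intervals: List[Tuple[int, int]] = []
    for site in deletion_evidence_sites(parent, child):
        lo = site
        while lo > 0 and not het[lo - 1]:
            lo -= 1
        hi = site
        while hi < n - 1 and not het[hi + 1]:
            hi += 1
        if (lo, hi) not in intervals:
            intervals.append((lo, hi))
    return intervals

# Made-up genotypes for six consecutive SNPs (illustrative only).
parent = ["AB", "AA", "AA", "BB", "AB", "AA"]
child = ["AA", "AA", "BB", "BB", "AB", "AB"]
print(deletion_evidence_sites(parent, child))
print(candidate_intervals(parent, child))
```

    A real method has to handle genotyping error and consider many individuals at once, which, according to the abstract, is part of what DELISHUS does; none of that is modeled here.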

  9. Whole Genome Sequencing Reveals the Islands of Novel Polymorphisms in Two Native Aromatic Japonica Rice Landraces from Vietnam.

    Science.gov (United States)

    Trung, Khuat Huu; Nguyen, Truong Khoa; Khuat, Hoang Bao Truc; Nguyen, Thuy Diep; Khanh, Tran Dang; Xuan, Tran Dang; Nguyen, Xuan-Hung

    2017-06-01

    Elucidation of the rice genome will not only broaden our understanding of genetic characterization of the agronomic characteristics but also facilitate the rice genetic improvement through marker assisted breeding. However, the genome resources of aromatic rice varieties are largely unexploited. Therefore, the whole genome of two elite aromatic traditional japonica rice landraces in North Vietnam, Tam Xoan Bac Ninh (TXBN), and Tam Xoan Hai Hau (TXHH), was sequenced to identify their genome-wide polymorphisms. Overall, we identified over 40,000 novel polymorphisms in each aromatic rice landrace. Although a discontinuous 8-bp deletion and an A/T SNP just upstream the 5-bp deletion in exon 7 of BADH2 gene were present in both rice landraces, the number of SNP high resolution regions of TXBN was six times higher than that of TXHH. Furthermore, several hot spot regions of novel SNPs and indels were found in both genomes, providing their potential gene pools related to aroma formation. The genomic information of two aromatic rice landraces described in this study will facilitate the identification of fragrance-related genes and the genetic improvement of rice. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  10. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

    Science.gov (United States)

    Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

    2016-03-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.

  11. Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics

    Directory of Open Access Journals (Sweden)

    Brunham Robert C

    2004-07-01

    Full Text Available Abstract Background We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
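
    For reference, the two-parameter Weibull distribution used in reliability analysis has the cumulative distribution function F(x) = 1 - exp(-(x/η)^β), with shape parameter β and scale parameter η. The abstract does not state the exact parameterization or fitting procedure the authors used, so the Python sketch below is an illustration under that assumption: it estimates β and η by ordinary least squares from the standard linearization ln(-ln(1 - F)) = β ln x - β ln η, using synthetic points rather than protein fold occurrence data.

```python
import math

def weibull_cdf(x: float, beta: float, eta: float) -> float:
    """Two-parameter Weibull CDF with shape beta and scale eta."""
    return 1.0 - math.exp(-((x / eta) ** beta))

def fit_weibull(xs, fs):
    """Least-squares estimate of (beta, eta) from points (x, F(x)), 0 < F < 1."""
    u = [math.log(x) for x in xs]
    v = [math.log(-math.log(1.0 - f)) for f in fs]
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    slope = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / sum(
        (a - mean_u) ** 2 for a in u
    )
    intercept = mean_v - slope * mean_u   # equals -beta * ln(eta)
    beta = slope
    eta = math.exp(-intercept / beta)
    return beta, eta

# Synthetic check: points generated from known parameters should be recovered.
xs = [1.0, 2.0, 5.0, 10.0, 20.0]
fs = [weibull_cdf(x, beta=0.7, eta=8.0) for x in xs]
beta_hat, eta_hat = fit_weibull(xs, fs)
print(round(beta_hat, 3), round(eta_hat, 3))
```

    In this parameterization β is the quantity the abstract calls the Weibull reliability parameter; comparing fitted β values between genomes is the kind of analysis the record describes.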

  12. B genome specific polymorphism in the TdDRF1 gene is in relationship with grain yield.

    Science.gov (United States)

    Cantale, Cristina; Di Bianco, Domenico; Thiyagarajan, Karthikeyan; Ammar, Karim; Galeffi, Patrizia

    2018-02-01

    A and B genome copies of the DRF1 gene in durum wheat were isolated and sequenced using gene variability. In a RIL population, a B genome-specific polymorphism was associated with grain yield, mainly under drought conditions. Drought tolerance is one of the main components of yield potential and stability, and its improvement is a major challenge to breeders. Transcription factors are considered among the best candidate genes for developing functional markers, since they are components of the signal transduction pathways that coordinate the expression of several downstream genes. Polymorphisms of the Triticum durum dehydration responsive factor 1 (TdDRF1) gene, which belongs to the DREB2 transcription factor family, were identified and specifically assigned to the A or B genome. A panel of primers was derived to selectively isolate the corresponding gene copies. This molecular information was also used to develop a new molecular marker: an allele-specific PCR assay discriminating two genotypes (Mohawk and Cocorit) was developed and used for screening a durum wheat recombinant inbred line population (RIL-pop) derived from the above genotypes. Phenotypic data from the RIL-pop grown during two seasons, under different environmental conditions, adopting an α-lattice design with two repetitions, were collected, analyzed and correlated with molecular data from the PCR assay. A significant association between a specific polymorphism in the B genome copy of the TdDRF1 gene and grain yield in drought conditions was observed.

  13. High-resolution haplotype block structure in the cattle genome

    Directory of Open Access Journals (Sweden)

    Choi Jungwoo

    2009-04-01

    Full Text Available Abstract Background The Bovine HapMap Consortium has generated assay panels to genotype ~30,000 single nucleotide polymorphisms (SNPs) from 501 animals sampled from 19 worldwide taurine and indicine breeds, plus two outgroup species (Anoa and Water Buffalo). Within the larger set of SNPs we targeted 101 high density regions spanning up to 7.6 Mb with an average density of approximately one SNP per 4 kb, and characterized the linkage disequilibrium (LD) and haplotype block structure within individual breeds and groups of breeds in relation to their geographic origin and use. Results From the 101 targeted high-density regions on bovine chromosomes 6, 14, and 25, between 57 and 95% of the SNPs were informative in the individual breeds. The regions of high LD extend up to ~100 kb and the size of haplotype blocks ranges between 30 bases and 75 kb (10.3 kb average). On the scale from 1–100 kb the extent of LD and haplotype block structure in cattle has high similarity to humans. The estimation of effective population sizes over the previous 10,000 generations conforms to two main events in cattle history: the initiation of cattle domestication (~12,000 years ago), and the intensification of population isolation and current population bottleneck that breeds have experienced worldwide within the last ~700 years. Haplotype block density correlation, block boundary discordances, and haplotype sharing analyses were consistent in revealing unexpected similarities between some beef and dairy breeds, making them non-differentiable. Clustering techniques permitted grouping of breeds into different clades given their similarities and dissimilarities in genetic structure. Conclusion This work presents the first high-resolution analysis of haplotype block structure in worldwide cattle samples. Several novel results were obtained. First, cattle and human share a high similarity in LD and haplotype block structure on the scale of 1–100 kb. Second, unexpected

  14. Synthesis, structure and electronic structure of a new polymorph of CaGe2

    International Nuclear Information System (INIS)

    Tobash, Paul H.; Bobev, Svilen

    2007-01-01

    Reported are the flux synthesis, the crystal structure determination, the properties and the band structure calculations of a new polymorph of CaGe₂, which crystallizes in the hexagonal space group P6₃mc (No. 186) with cell parameters a = 3.9966(9) Å and c = 10.211(4) Å (Z = 2; Pearson's code hP6). The structure can be viewed as puckered layers of three-bonded germanium atoms, ²∞[Ge₂]²⁻, which are stacked along the direction of the c-axis in an ABAB fashion. The germanium polyanionic layers are separated by the Ca cations. As such, this structure is closely related to the structure of the other CaGe₂ polymorph, which crystallizes with the rhombohedral CaSi₂ type in the R-3m space group (No. 166), where the ²∞[Ge₂]²⁻ layers are arranged in an AA'BB'CC' fashion and are also interspaced by Ca²⁺ cations. LMTO calculations suggest that in spite of the formal closed-shell configuration for all atoms and the apparent adherence to the Zintl rules for electron counting (i.e., Ca²⁺[3b-Ge¹⁻]₂), the phase will be a poor metal due to a small Ca-3d/Ge-4p band overlap. Magnetic susceptibility measurements as a function of temperature indicate that the new CaGe₂ polymorph exhibits weak, temperature-independent Pauli paramagnetism.
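
    The reported hexagonal cell parameters fix the unit-cell volume through V = (√3/2)·a²·c, and with Z = 2 formula units per cell they also give a calculated density. The Python sketch below works through that arithmetic. The atomic masses (Ca 40.078, Ge 72.630 g/mol) and the Avogadro constant are standard values supplied here, not taken from the abstract, and the results are a consistency check rather than figures reported by the authors.

```python
import math

# Cell parameters quoted in the abstract (hexagonal, P6_3mc).
a = 3.9966   # Å
c = 10.211   # Å
Z = 2        # formula units of CaGe2 per cell

# Volume of a hexagonal cell: V = (sqrt(3) / 2) * a^2 * c
volume = math.sqrt(3) / 2 * a**2 * c   # Å^3

# Assumed standard constants (not from the abstract).
M_CA, M_GE = 40.078, 72.630            # g/mol
N_A = 6.02214076e23                    # 1/mol

molar_mass = M_CA + 2 * M_GE           # g/mol for CaGe2
# 1 Å^3 = 1e-24 cm^3
density = Z * molar_mass / (N_A * volume * 1e-24)   # g/cm^3

print(f"V = {volume:.2f} Å^3, density = {density:.2f} g/cm^3")
```

    With these inputs the volume comes out near 141 Å³ and the calculated density near 4.4 g/cm³.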

  15. Single nucleotide polymorphism discovery in cutthroat trout subspecies using genome reduction, barcoding, and 454 pyro-sequencing

    Directory of Open Access Journals (Sweden)

    Houston Derek D

    2012-12-01

    Full Text Available Abstract Background Salmonids are popular sport fishes, and as such have been subjected to widespread stocking throughout western North America. Historically, stocking was done with little regard for genetic variation among populations and has resulted in genetic mixing among species and subspecies in many areas, thus putting the genetic integrity of native salmonid populations at risk and creating a need to assess the genetic constitution of native salmonid populations. Cutthroat trout is a salmonid species with pronounced geographic structure (there are 10 extant subspecies) and a recent history of hybridization with introduced rainbow trout in many populations. Genetic admixture has also occurred among cutthroat trout subspecies in areas where introductions have brought two or more subspecies into contact. Consequently, management agencies have increased their efforts to evaluate the genetic composition of cutthroat trout populations to identify populations that remain uncompromised and manage them accordingly, but additional genetic markers are needed to do so effectively. Here we used genome reduction, MID-barcoding, and 454-pyrosequencing to discover single nucleotide polymorphisms that differentiate cutthroat trout subspecies and can be used as a rapid, cost-effective method to characterize the genetic composition of cutthroat trout populations. Results Thirty cutthroat and six rainbow trout individuals were subjected to genome reduction and next-generation sequencing. A total of 1,499,670 reads averaging 379 base pairs in length were generated by 454-pyrosequencing, resulting in 569,060,077 total base pairs sequenced. A total of 43,558 putative SNPs were identified, and of those, 125 SNP primers were developed that successfully amplified 96 cutthroat trout and rainbow trout individuals. These SNP loci were able to differentiate most cutthroat trout subspecies using distance methods and Structure analyses. Conclusions Genomic and

  16. Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains : a web-based resource

    Directory of Open Access Journals (Sweden)

    Vergnaud Gilles

    2004-01-01

    Full Text Available Abstract Background Polymorphic tandem repeat typing is a new generic technology which has been proved to be very efficient for bacterial pathogens such as B. anthracis, M. tuberculosis, P. aeruginosa, L. pneumophila, Y. pestis. The previously developed tandem repeats database takes advantage of the release of genome sequence data for a growing number of bacteria to facilitate the identification of tandem repeats. The development of an assay then requires the evaluation of tandem repeat polymorphism on well-selected sets of isolates. In the case of major human pathogens, such as S. aureus, more than one strain is being sequenced, so that tandem repeats most likely to be polymorphic can now be selected in silico based on genome sequence comparison. Results In addition to the previously described general Tandem Repeats Database, we have developed a tool to automatically identify tandem repeats of a different length in the genome sequence of two (or more) closely related bacterial strains. Genome comparisons are pre-computed. The results of the comparisons are parsed in a database, which can be conveniently queried over the internet according to criteria of practical value, including repeat unit length, predicted size difference, etc. Comparisons are available for 16 bacterial species, and the orthopox viruses, including the variola virus and three of its close neighbors. Conclusions We are presenting an internet-based resource to help develop and perform tandem repeats based bacterial strain typing. The tools accessible at http://minisatellites.u-psud.fr now comprise four parts. The Tandem Repeats Database enables the identification of tandem repeats across entire genomes. The Strain Comparison Page identifies tandem repeats differing between different genome sequences from the same species. The "Blast in the Tandem Repeats Database" facilitates the search for a known tandem repeat and the prediction of amplification product sizes.
The "Bacterial

  17. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  18. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Science.gov (United States)

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  19. EFIN: predicting the functional impact of nonsynonymous single nucleotide polymorphisms in human genome.

    Science.gov (United States)

    Zeng, Shuai; Yang, Jing; Chung, Brian Hon-Yin; Lau, Yu Lung; Yang, Wanling

    2014-06-10

    Predicting the functional impact of amino acid substitutions (AAS) caused by nonsynonymous single nucleotide polymorphisms (nsSNPs) is becoming increasingly important as more and more novel variants are being discovered. Bioinformatics analysis is essential to predict potentially causal or contributing AAS to human diseases for further analysis, as each genome contains thousands of rare or private AAS, only a very small number of which are related to an underlying disease. Existing algorithms in this field still have a high false prediction rate, and novel development is needed to take full advantage of the vast amount of genomic data. Here we report a novel algorithm that features two innovative changes: (1) making better use of sequence conservation information by grouping the homologous protein sequences into six blocks according to evolutionary distances to human and evaluating sequence conservation in each block independently, and (2) including as many such homologous sequences as possible in analyses. Random forests are used to evaluate sequence conservation in each block and to predict the potential impact of an AAS on protein function. Testing of this algorithm on a comprehensive dataset showed significant improvement in prediction accuracy over currently widely used programs. The algorithm and a web-based application tool implementing it, EFIN (Evaluation of Functional Impact of Nonsynonymous SNPs), were made freely available (http://paed.hku.hk/efin/) to the public. Grouping homologous sequences into different blocks according to the evolutionary distance of the species to human and evaluating sequence conservation in each group independently significantly improved prediction accuracy. This approach may help us better understand the roles of genetic variants in human disease and health.
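
    The idea of scoring conservation separately within distance-based blocks can be illustrated with a small Python sketch. Everything specific in it is an assumption made for illustration: the block boundaries, the conservation measure (the fraction of homologs in a block that share the human residue at the aligned position) and all names are hypothetical and are not taken from EFIN. The random forest step described in the abstract is not shown.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical upper distance bounds for blocks 0-4; larger distances fall in block 5.
BLOCK_BOUNDS = [0.1, 0.2, 0.4, 0.6, 0.8]

def block_index(distance: float) -> int:
    """Map an evolutionary distance to human onto one of six blocks (0-5)."""
    return bisect_right(BLOCK_BOUNDS, distance)

def blockwise_conservation(human_residue: str, homologs):
    """Per-block fraction of homologs matching the human residue.

    homologs: iterable of (distance_to_human, residue_at_aligned_position).
    Returns a list of six values; None marks a block with no homolog.
    """
    counts = defaultdict(lambda: [0, 0])   # block -> [matches, total]
    for distance, residue in homologs:
        b = block_index(distance)
        counts[b][1] += 1
        if residue == human_residue:
            counts[b][0] += 1
    n_blocks = len(BLOCK_BOUNDS) + 1
    return [counts[b][0] / counts[b][1] if b in counts else None for b in range(n_blocks)]

# Made-up example: residues aligned to a human arginine (R).
homologs = [(0.05, "R"), (0.08, "R"), (0.30, "R"), (0.35, "K"), (0.90, "Q")]
features = blockwise_conservation("R", homologs)
print(features)
```

    A per-block vector of this kind could serve as the input features for a classifier; how EFIN actually defines its blocks and features is described in the paper itself, not here.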

  20. Genetic analysis of glucosinolate variability in broccoli florets using genome-anchored single nucleotide polymorphisms.

    Science.gov (United States)

    Brown, Allan F; Yousef, Gad G; Reid, Robert W; Chebrolu, Kranthi K; Thomas, Aswathy; Krueger, Christopher; Jeffery, Elizabeth; Jackson, Eric; Juvik, John A

    2015-07-01

    The identification of genetic factors influencing the accumulation of individual glucosinolates in broccoli florets provides novel insight into the regulation of glucosinolate levels in Brassica vegetables and will accelerate the development of vegetables with glucosinolate profiles tailored to promote human health. Quantitative trait loci analysis of glucosinolate (GSL) variability was conducted with a B. oleracea (broccoli) mapping population, saturated with single nucleotide polymorphism markers from a high-density array designed for rapeseed (Brassica napus). In 4 years of analysis, 14 QTLs were associated with the accumulation of aliphatic, indolic, or aromatic GSLs in floret tissue. The accumulation of 3-carbon aliphatic GSLs (2-propenyl and 3-methylsulfinylpropyl) was primarily associated with a single QTL on C05, but common regulation of 4-carbon aliphatic GSLs was not observed. A single locus on C09, associated with up to 40 % of the phenotypic variability of 2-hydroxy-3-butenyl GSL over multiple years, was not associated with the variability of precursor compounds. Similarly, QTLs on C02, C04, and C09 were associated with 4-methylsulfinylbutyl GSL concentration over multiple years but were not significantly associated with downstream compounds. Genome-specific SNP markers were used to identify candidate genes that co-localized to marker intervals and previously sequenced Brassica oleracea BAC clones containing known GSL genes (GSL-ALK, GSL-PRO, and GSL-ELONG) were aligned to the genomic sequence, providing support that at least three of our 14 QTLs likely correspond to previously identified GSL loci. The results demonstrate that previously identified loci do not fully explain GSL variation in broccoli. The identification of additional genetic factors influencing the accumulation of GSL in broccoli florets provides novel insight into the regulation of GSL levels in Brassicaceae and will accelerate development of vegetables with modified or enhanced GSL

  1. Overlapping Genomic Sequences: A Treasure Trove of Single-Nucleotide Polymorphisms

    Science.gov (United States)

    Taillon-Miller, Patricia; Gu, Zhijie; Li, Qun; Hillier, LaDeana; Kwok, Pui-Yan

    1998-01-01

    An efficient strategy to develop a dense set of single-nucleotide polymorphism (SNP) markers is to take advantage of the human genome sequencing effort currently under way. Our approach is based on the fact that bacterial artificial chromosomes (BACs) and P1-based artificial chromosomes (PACs) used in long-range sequencing projects come from diploid libraries. If the overlapping clones sequenced are from different lineages, one is comparing the sequences from 2 homologous chromosomes in the overlapping region. We have analyzed in detail every SNP identified while sequencing three sets of overlapping clones found on chromosome 5p15.2, 7q21–7q22, and 13q12–13q13. In the 200.6 kb of DNA sequence analyzed in these overlaps, 153 SNPs were identified. Computer analysis for repetitive elements and suitability for STS development yielded 44 STSs containing 68 SNPs for further study. All 68 SNPs were confirmed to be present in at least one of the three (Caucasian, African-American, Hispanic) populations studied. Furthermore, 42 of the SNPs tested (62%) were informative in at least one population, 32 (47%) were informative in two or more populations, and 23 (34%) were informative in all three populations. These results clearly indicate that developing SNP markers from overlapping genomic sequence is highly efficient and cost effective, requiring only the two simple steps of developing STSs around the known SNPs and characterizing them in the appropriate populations. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AC003015 (for GS113423), AC002380 (GS330J10), AC000066 (RG293F11), AC003086 (RG104F04), AC002525 (257C22A), and U73331 (96A18A).] PMID:9685323
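The percentages quoted in this abstract follow directly from its counts, and the same counts give an average SNP spacing. The sketch below only redoes that arithmetic; the variable and function names are illustrative and do not come from the paper.

```python
# Counts quoted in the abstract above.
OVERLAP_KB = 200.6   # kb of overlapping sequence analyzed
SNPS_FOUND = 153     # SNPs identified in the overlaps
SNPS_TESTED = 68     # SNPs carried forward in 44 STSs

# Number of tested SNPs that were informative, by population coverage.
informative = {
    "at least one population": 42,
    "two or more populations": 32,
    "all three populations": 23,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


def kb_per_snp(total_kb, n_snps):
    """Average spacing between SNPs, in kb."""
    return total_kb / n_snps


shares = {label: percent(count, SNPS_TESTED) for label, count in informative.items()}

if __name__ == "__main__":
    for label, share in shares.items():
        print(f"informative in {label}: {share}%")
    print(f"about one SNP per {kb_per_snp(OVERLAP_KB, SNPS_FOUND):.2f} kb")
```

The rounded shares match the 62%, 47% and 34% reported in the abstract, and the spacing works out to roughly one SNP per 1.3 kb of overlap.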

  2. A Plasmodium falciparum FcB1-schizont-EST collection providing clues to schizont specific gene structure and polymorphism

    Directory of Open Access Journals (Sweden)

    Charneau Sébastien

    2009-05-01

    Full Text Available Abstract Background The Plasmodium falciparum genome (3D7 strain), published in 2002, revealed ~5,400 genes, mostly based on in silico predictions. Experimental data is therefore required for structural and functional assessments of P. falciparum genes and expression, and polymorphic data are further necessary to exploit genomic information to further qualify therapeutic target candidates. Here, we undertook a large scale analysis of a P. falciparum FcB1-schizont-EST library previously constructed by suppression subtractive hybridization (SSH) to study genes expressed during merozoite morphogenesis, with the aim of: 1) obtaining an exhaustive collection of schizont specific ESTs, 2) experimentally validating or correcting P. falciparum gene models and 3) pinpointing genes displaying protein polymorphism between the FcB1 and 3D7 strains. Results A total of 22,125 clones randomly picked from the SSH library were sequenced, yielding 21,805 usable ESTs that were then clustered on the P. falciparum genome. This allowed identification of 243 protein coding genes, including 121 previously annotated as hypothetical. Statistical analysis of GO terms, when available, indicated significant enrichment in genes involved in "entry into host-cells" and "actin cytoskeleton". Although most ESTs do not span full-length gene reading frames, detailed sequence comparison of FcB1-ESTs versus 3D7 genomic sequences allowed the confirmation of exon/intron boundaries in 29 genes, the detection of new boundaries in 14 genes and identification of protein polymorphism for 21 genes. In addition, a large number of non-protein coding ESTs were identified, mainly matching with the two A-type rRNA units (on chromosomes 5 and 7) and to a lower extent, two atypical rRNA loci (on chromosomes 1 and 8), TARE subtelomeric regions (several chromosomes) and the recently described telomerase RNA gene (chromosome 9). Conclusion This FcB1-schizont-EST analysis confirmed the actual expression of 243

  3. Genetic predisposition to neuroblastoma mediated by a LMO1 super-enhancer polymorphism

    Science.gov (United States)

    Neuroblastoma is a paediatric malignancy that typically arises in early childhood, and is derived from the developing sympathetic nervous system. Clinical phenotypes range from localized tumours with excellent outcomes to widely metastatic disease in which long-term survival is approximately 40% despite intensive therapy. A previous genome-wide association study identified common polymorphisms at the LMO1 gene locus that are highly associated with neuroblastoma susceptibility and oncogenic addiction to LMO1 in the tumour cells.

  4. Genetic structure of Balearic honeybee populations based on microsatellite polymorphism

    Directory of Open Access Journals (Sweden)

    Moritz Robin FA

    2003-05-01

    Full Text Available Abstract The genetic variation of honeybee colonies collected in 22 localities on the Balearic Islands (Spain) was analysed using eight polymorphic microsatellite loci. Previous studies have demonstrated that these colonies belong either to the African or west European evolutionary lineages. These populations display low variability estimated from both the number of alleles and heterozygosity values, as expected for the honeybee island populations. Although genetic differentiation within the islands is low, significant heterozygote deficiency is present, indicating a subpopulation genetic structure. According to the genetic differentiation test, the honeybee populations of the Balearic Islands cluster into two groups: Gimnesias (Mallorca and Menorca) and Pitiusas (Ibiza and Formentera), which agrees with the biogeography postulated for this archipelago. The phylogenetic analysis suggests an Iberian origin of the Balearic honeybees, thus confirming the postulated evolutionary scenario for Apis mellifera in the Mediterranean basin. The microsatellite data from Formentera, Ibiza and Menorca show that ancestral populations are threatened by queen importations, indicating that adequate conservation measures should be developed for protecting Balearic bees.
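The abstract reports a heterozygote deficiency but does not say how it was measured. One common textbook summary, used here purely as an assumption, is the inbreeding coefficient F_IS = 1 - Ho/He, where Ho is the observed share of heterozygous individuals and He = 1 - sum(p_i^2) is the heterozygosity expected from the allele frequencies. The sketch below computes it for one locus; the function names and the toy genotypes are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """He = 1 - sum(p_i^2), with p_i estimated from the pooled alleles."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    total = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def observed_heterozygosity(genotypes):
    """Ho = fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def f_is(genotypes):
    """F_IS = 1 - Ho/He; positive values indicate a heterozygote deficit.

    Raises ZeroDivisionError for a monomorphic locus (He == 0).
    """
    return 1.0 - observed_heterozygosity(genotypes) / expected_heterozygosity(genotypes)


# Toy diploid genotypes at one microsatellite locus (allele sizes in bp).
toy = [(100, 100)] * 4 + [(104, 104)] * 4 + [(100, 104)] * 2

if __name__ == "__main__":
    print(f"He = {expected_heterozygosity(toy):.2f}")
    print(f"Ho = {observed_heterozygosity(toy):.2f}")
    print(f"F_IS = {f_is(toy):.2f}")
```

In the toy data both alleles have frequency 0.5, so half the individuals would be expected to be heterozygous, yet only two of ten are. A positive F_IS of this kind is what pooling genetically distinct subpopulations tends to produce.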

  5. Frequency of TNFA, INFG, and IL10 Gene Polymorphisms and Their Association with Malaria Vivax and Genomic Ancestry.

    Science.gov (United States)

    Furini, Adriana Antônia da Cruz; Cassiano, Gustavo Capatti; Petrolini Capobianco, Marcela; Dos Santos, Sidney Emanuel Batista; Dantas Machado, Ricardo Luiz

    2016-01-01

    Polymorphisms in cytokine genes can alter the production of these proteins and consequently affect the immune response. The trihybrid heterogeneity of the Brazilian population is characterized as a condition for the use of ancestry informative markers. The objective of this study was to evaluate the frequency of -1031T>C, -308G>A and -238G>A TNFA, +874 A>T IFNG and -819C>T, and -592C>A IL10 gene polymorphisms and their association with malaria vivax and genomic ancestry. Samples from 90 vivax malaria-infected individuals and 51 noninfected individuals from northern Brazil were evaluated. Genotyping was carried out by using ASO-PCR or PCR/RFLP. The genomic ancestry of the individuals was classified using 48 insertion/deletion polymorphism biallelic markers. There were no differences in the proportions of African, European, and Native American ancestry between men and women. No significant association was observed for the allele and genotype frequencies of the 6 SNPs between malaria-infected and noninfected individuals. However, there was a trend toward decreasing the frequency of individuals carrying the TNF-308A allele with the increasing proportion of European ancestry. No ethnic-specific SNPs were identified, and there was no allelic or genotype association with susceptibility or resistance to vivax malaria. Understanding the genomic mechanisms by which ancestry influences this association is critical and requires further study.

  6. Frequency of TNFA, INFG, and IL10 Gene Polymorphisms and Their Association with Malaria Vivax and Genomic Ancestry

    Directory of Open Access Journals (Sweden)

    Adriana Antônia da Cruz Furini

    2016-01-01

    Full Text Available Polymorphisms in cytokine genes can alter the production of these proteins and consequently affect the immune response. The trihybrid heterogeneity of the Brazilian population is characterized as a condition for the use of ancestry informative markers. The objective of this study was to evaluate the frequency of -1031T>C, -308G>A and -238G>A TNFA, +874 A>T IFNG and -819C>T, and -592C>A IL10 gene polymorphisms and their association with malaria vivax and genomic ancestry. Samples from 90 vivax malaria-infected individuals and 51 noninfected individuals from northern Brazil were evaluated. Genotyping was carried out by using ASO-PCR or PCR/RFLP. The genomic ancestry of the individuals was classified using 48 insertion/deletion polymorphism biallelic markers. There were no differences in the proportions of African, European, and Native American ancestry between men and women. No significant association was observed for the allele and genotype frequencies of the 6 SNPs between malaria-infected and noninfected individuals. However, there was a trend toward decreasing the frequency of individuals carrying the TNF-308A allele with the increasing proportion of European ancestry. No ethnic-specific SNPs were identified, and there was no allelic or genotype association with susceptibility or resistance to vivax malaria. Understanding the genomic mechanisms by which ancestry influences this association is critical and requires further study.

  7. Whole-genome single-nucleotide polymorphism (SNP) marker discovery and association analysis with the eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content in Larimichthys crocea

    Directory of Open Access Journals (Sweden)

    Shijun Xiao

    2016-12-01

    Full Text Available Whole-genome single-nucleotide polymorphism (SNP) markers are valuable genetic resources for association and conservation studies. Genome-wide SNP development in many teleost species is still challenging because of genome complexity and the cost of re-sequencing. Genotyping-By-Sequencing (GBS) provides an efficient reduced-representation method to cut the cost of SNP detection; however, most recent GBS applications have been reported in plants. In this work, we applied an EcoRI-NlaIII based GBS protocol to the teleost large yellow croaker, an important commercial fish in China and East Asia, and report the first whole-genome SNP development for the species. A total of 69,845 high-quality SNP markers, evenly distributed along the genome, were detected in at least 80% of 500 individuals. Nearly 95% of randomly selected genotypes were successfully validated by Sequenom MassARRAY assay. Association studies with muscle eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content discovered 39 significant SNP markers, contributing up to ∼63% of the genetic variance explained by all markers. Functional genes involved in the fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, the PPT2 gene, previously identified in an association study of plasma n-3 and n-6 polyunsaturated fatty acid levels in humans, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS can produce quality SNP markers in a cost-efficient manner in a teleost genome. The developed SNP markers and the EPA- and DHA-associated SNP loci provide invaluable resources for studies of population structure, conservation genetics and genomic selection in large yellow croaker and other fish.
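The criterion "detected in at least 80% of 500 individuals" is a per-marker call-rate threshold. The sketch below shows one way such a filter could look; the data layout (a dict mapping a SNP identifier to a list of genotype calls, with None for a missing call) and all names are assumptions made for illustration, not the authors' pipeline.

```python
def call_rate(calls):
    """Fraction of individuals with a non-missing genotype call."""
    return sum(1 for call in calls if call is not None) / len(calls)


def filter_by_call_rate(genotype_table, min_rate=0.80):
    """Keep only the SNPs whose call rate is at least min_rate."""
    return {
        snp: calls
        for snp, calls in genotype_table.items()
        if call_rate(calls) >= min_rate
    }


# Toy table: five individuals, genotypes coded as allele pairs, None = missing.
toy_table = {
    "snp_a": ["AA", "AG", None, "GG", "AG"],   # 4/5 called, kept at 0.80
    "snp_b": ["CC", None, None, "CT", "TT"],   # 3/5 called, dropped
    "snp_c": ["TT", "TT", "TC", "CC", "TC"],   # 5/5 called, kept
}

kept = filter_by_call_rate(toy_table)

if __name__ == "__main__":
    print(sorted(kept))
```

With 500 individuals, the same threshold means a marker needs a genotype call in at least 400 of them to be retained.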

  8. Multi-generational imputation of single nucleotide polymorphism marker genotypes and accuracy of genomic selection.

    Science.gov (United States)

    Toghiani, S; Aggrey, S E; Rekaya, R

    2016-07-01

    Availability of high-density single nucleotide polymorphism (SNP) genotyping platforms provided unprecedented opportunities to enhance breeding programmes in livestock, poultry and plant species, and to better understand the genetic basis of complex traits. Using this genomic information, genomic breeding values (GEBVs), which are more accurate than conventional breeding values, can be estimated. The superiority of genomic selection is possible only when high-density SNP panels are used to track genes and QTLs affecting the trait. Unfortunately, even with the continuous decrease in genotyping costs, only a small fraction of the population has been genotyped with these high-density panels. It is often the case that a larger portion of the population is genotyped with low-density and low-cost SNP panels and then imputed to a higher density. Accuracy of SNP genotype imputation tends to be high when minimum requirements are met. Nevertheless, a certain rate of genotype imputation errors is unavoidable. Thus, it is reasonable to assume that the accuracy of GEBVs will be affected by imputation errors; especially, their cumulative effects over time. To evaluate the impact of multi-generational selection on the accuracy of SNP genotypes imputation and the reliability of resulting GEBVs, a simulation was carried out under varying updating of the reference population, distance between the reference and testing sets, and the approach used for the estimation of GEBVs. Using fixed reference populations, imputation accuracy decayed by about 0.5% per generation. In fact, after 25 generations, the accuracy was only 7% lower than the first generation. When the reference population was updated by either 1% or 5% of the top animals in the previous generations, decay of imputation accuracy was substantially reduced. These results indicate that low-density panels are useful, especially when the generational interval between reference and testing population is small. As the generational interval

  9. Child Development and Structural Variation in the Human Genome

    Science.gov (United States)

    Zhang, Ying; Haraksingh, Rajini; Grubert, Fabian; Abyzov, Alexej; Gerstein, Mark; Weissman, Sherman; Urban, Alexander E.

    2013-01-01

    Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects…

  10. Rapid Genome-wide Single Nucleotide Polymorphism Discovery in Soybean and Rice via Deep Resequencing of Reduced Representation Libraries with the Illumina Genome Analyzer

    Directory of Open Access Journals (Sweden)

    Stéphane Deschamps

    2010-07-01

    Full Text Available Massively parallel sequencing platforms have allowed for the rapid discovery of single nucleotide polymorphisms (SNPs) among related genotypes within a species. We describe the creation of reduced representation libraries (RRLs) using an initial digestion of nuclear genomic DNA with a methylation-sensitive restriction endonuclease followed by a secondary digestion with a 4-bp restriction endonuclease. This strategy allows for the enrichment of hypomethylated genomic DNA, which has been shown to be rich in genic sequences, and the secondary digestion serves to increase the number of common loci resequenced between individuals. Deep resequencing of these RRLs performed with the Illumina Genome Analyzer led to the identification of 2618 SNPs in rice and 1682 SNPs in soybean for two representative genotypes in each of the species. A subset of these SNPs was validated via Sanger sequencing, exhibiting validation rates of 96.4 and 97.0% in rice and soybean, respectively. Comparative analysis of the read distribution relative to annotated genes in the reference genome assemblies indicated that the RRL strategy was primarily sampling within genic regions for both species. The massively parallel sequencing of methylation-sensitive RRLs for genome-wide SNP discovery can be applied across a wide range of plant species having sufficient reference genomic sequence.

  11. Partial digestion with restriction enzymes of ultraviolet-irradiated human genomic DNA: a method for identifying restriction site polymorphisms

    International Nuclear Information System (INIS)

    Nobile, C.; Romeo, G.

    1988-01-01

    A method for partial digestion of total human DNA with restriction enzymes has been developed on the basis of a principle already utilized by P.A. Whittaker and E. Southern for the analysis of phage lambda recombinants. Total human DNA irradiated with UV light of 254 nm is partially digested by restriction enzymes that recognize sequences containing adjacent thymidines because of TT dimer formation. The products resulting from partial digestion of specific genomic regions are detected in Southern blots by genomic-unique DNA probes with high reproducibility. This procedure is rapid and simple to perform because the same conditions of UV irradiation are used for different enzymes and probes. It is shown that restriction site polymorphisms occurring in the genomic regions analyzed are recognized by the allelic partial digest patterns they determine.

  12. Development of an ultra-dense genetic map of the sunflower genome based on single-feature polymorphisms.

    Directory of Open Access Journals (Sweden)

    John E Bowers

    Full Text Available The development of ultra-dense genetic maps has the potential to facilitate detailed comparative genomic analyses and whole genome sequence assemblies. Here we describe the use of a custom Affymetrix GeneChip containing nearly 2.4 million features (25 bp sequences) targeting 86,023 unigenes from sunflower (Helianthus annuus L.) and related species to test for single-feature polymorphisms (SFPs) in a recombinant inbred line (RIL) mapping population derived from a cross between confectionery and oilseed sunflower lines (RHA280×RHA801). We then employed an existing genetic map derived from this same population to rigorously filter out low quality data and place 67,486 features corresponding to 22,481 unigenes on the sunflower genetic map. The resulting map contains a substantial fraction of all sunflower genes and will thus facilitate a number of downstream applications, including genome assembly and the identification of candidate genes underlying QTL or traits of interest.

  13. Genome-wide linkage analysis of 972 bipolar pedigrees using single-nucleotide polymorphisms.

    Science.gov (United States)

    Badner, J A; Koller, D; Foroud, T; Edenberg, H; Nurnberger, J I; Zandi, P P; Willour, V L; McMahon, F J; Potash, J B; Hamshere, M; Grozeva, D; Green, E; Kirov, G; Jones, I; Jones, L; Craddock, N; Morris, D; Segurado, R; Gill, M; Sadovnick, D; Remick, R; Keck, P; Kelsoe, J; Ayub, M; MacLean, A; Blackwood, D; Liu, C-Y; Gershon, E S; McMahon, W; Lyon, G J; Robinson, R; Ross, J; Byerley, W

    2012-07-01

    Because of the high costs associated with ascertainment of families, most linkage studies of Bipolar I disorder (BPI) have used relatively small samples. Moreover, the genetic information content reported in most studies has been less than 0.6. Although microsatellite markers spaced every 10 cM typically extract most of the genetic information content for larger multiplex families, they can be less informative for smaller pedigrees especially for affected sib pair kindreds. For these reasons we collaborated to pool family resources and carried out higher density genotyping. Approximately 1100 pedigrees of European ancestry were initially selected for study and were genotyped by the Center for Inherited Disease Research using the Illumina Linkage Panel 12 set of 6090 single-nucleotide polymorphisms. Of the ~1100 families, 972 were informative for further analyses, and mean information content was 0.86 after pruning for linkage disequilibrium. The 972 kindreds include 2284 cases of BPI disorder, 498 individuals with bipolar II disorder (BPII) and 702 subjects with recurrent major depression. Three affection status models (ASMs) were considered: ASM1 (BPI and schizoaffective disorder, BP cases (SABP) only), ASM2 (ASM1 cases plus BPII) and ASM3 (ASM2 cases plus recurrent major depression). Both parametric and non-parametric linkage methods were carried out. The strongest findings occurred at 6q21 (non-parametric pairs LOD 3.4 for rs1046943 at 119 cM) and 9q21 (non-parametric pairs logarithm of odds (LOD) 3.4 for rs722642 at 78 cM) using only BPI and schizoaffective (SA), BP cases. Both results met genome-wide significant criteria, although neither was significant after correction for multiple analyses. We also inspected parametric scores for the larger multiplex families to identify possible rare susceptibility loci. In this analysis, we observed 59 parametric LODs of 2 or greater, many of which are likely to be close to maximum possible scores. 
Although some linkage

  14. Whole Genome Association Study to Detect Single Nucleotide Polymorphisms for Behavior in Sapsaree Dog

    Directory of Open Access Journals (Sweden)

    J. H. Ha

    2015-07-01

    Full Text Available The purpose of this study was to characterize genetic architecture of behavior patterns in Sapsaree dogs. The breed population (n = 8,256) has been constructed since 1990 over 12 generations and managed at the Sapsaree Breeding Research Institute, Gyeongsan, Korea. Seven behavioral traits were investigated for 882 individuals. The traits were classified as a quantitative or a categorical group, and heritabilities (h2) and variance components were estimated under the Animal model using ASREML 2.0 software program. In general, the h2 estimates of the traits ranged between 0.00 and 0.16. Strong genetic (rG) and phenotypic (rP) correlations were observed between nerve stability, affability and adaptability, i.e. 0.9 to 0.94 and 0.46 to 0.68, respectively. To detect significant single nucleotide polymorphism (SNP) for the behavioral traits, a total of 134 and 60 samples were genotyped using the Illumina 22K CanineSNP20 and 170K CanineHD bead chips, respectively. Two datasets comprising 60 (Sap60) and 183 (Sap183) samples were analyzed, respectively, of which the latter was based on the SNPs that were embedded on both the 22K and 170K chips. To perform genome-wide association analysis, each SNP was considered with the residuals of each phenotype that were adjusted for sex and year of birth as fixed effects. A least squares based single marker regression analysis was followed by a stepwise regression procedure for the significant SNPs (p<0.01), to determine a best set of SNPs for each trait. A total of 41 SNPs were detected with the Sap183 samples for the behavior traits. The significant SNPs need to be verified using other samples, so as to be utilized to improve behavior traits via marker-assisted selection in the Sapsaree population.
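The single-marker step described above regresses the adjusted phenotype residuals on one SNP at a time. A minimal least-squares sketch of that idea follows, assuming genotypes are coded as allele dosages (0, 1 or 2 copies of one allele). The coding, the function name and the toy data are assumptions for illustration; the significance test and the later stepwise selection used in the study are not reproduced.

```python
def single_marker_regression(dosages, residuals):
    """Ordinary least squares fit of residual = intercept + slope * dosage.

    dosages:   allele counts per individual (0, 1 or 2), an assumed coding
    residuals: phenotype residuals already adjusted for fixed effects
    Returns (slope, intercept); slope is the estimated allele effect.
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("marker is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, residuals))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data: each extra copy of the allele adds one unit to the residual.
toy_dosages = [0, 1, 2, 0, 1, 2]
toy_residuals = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]

slope, intercept = single_marker_regression(toy_dosages, toy_residuals)

if __name__ == "__main__":
    print(f"allele effect = {slope:.2f}, intercept = {intercept:.2f}")
```

In a genome-wide scan this fit would be repeated for every SNP, with a test statistic on the slope deciding which markers pass a threshold such as the p<0.01 used above.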

  15. Determining the association between methylenetetrahydrofolate reductase (MTHFR) gene polymorphisms and genomic DNA methylation level: A meta-analysis.

    Science.gov (United States)

    Wang, Li; Shangguan, Shaofang; Chang, Shaoyan; Yu, Xin; Wang, Zhen; Lu, Xiaolin; Wu, Lihua; Zhang, Ting

    2016-08-01

    The methylenetetrahydrofolate reductase (MTHFR) polymorphism is a risk factor for neural tube defects. C677T and A1298C MTHFR polymorphisms produce an enzyme with reduced folate-related one carbon metabolism, and this has been associated with aberrant methylation modifications in DNA and protein. A meta-analysis was conducted to assess the association between MTHFR C677T/A1298C genotypes and global genomic methylation. Eleven studies met the inclusion criteria. Of these, 10 were performed on C677T MTHFR genotypes and 6 were performed on A1298C MTHFR genotypes. Our results did not indicate any correlation between global methylation and MTHFR A1298C, C677T polymorphisms. The results of our study provide evidence to assess the global methylation modification alterations of MTHFR polymorphisms among individuals. However, our data did not find any conceivable proof supporting the hypothesis that common variants of MTHFR A1298C, C677T contribute to methylation modification. Birth Defects Research (Part A) 106:667-674, 2016. © 2016 Wiley Periodicals, Inc.

  16. Development and Validation of 697 Novel Polymorphic Genomic and EST-SSR Markers in the American Cranberry (Vaccinium macrocarpon Ait.)

    Directory of Open Access Journals (Sweden)

    Brandon Schlautman

    2015-01-01

    Full Text Available The American cranberry, Vaccinium macrocarpon Ait., is an economically important North American fruit crop that is consumed because of its unique flavor and potential health benefits. However, a lack of abundant, genome-wide molecular markers has limited the adoption of modern molecular assisted selection approaches in cranberry breeding programs. To increase the number of available markers in the species, this study identified, tested, and validated microsatellite markers from existing nuclear and transcriptome sequencing data. In total, new primers were designed, synthesized, and tested for 979 SSR loci; 697 of the markers amplified allele patterns consistent with single locus segregation in a diploid organism and were considered polymorphic. Of the 697 polymorphic loci, 507 were selected for additional genetic diversity and segregation analyses in 29 cranberry genotypes. More than 95% of the 507 loci did not display segregation distortion at the p < 0.05 level, and contained moderate to high levels of polymorphism with a polymorphic information content >0.25. This comprehensive collection of developed and validated microsatellite loci represents a substantial addition to the molecular tools available for geneticists, genomicists, and breeders in cranberry and Vaccinium.
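The abstract uses polymorphic information content (PIC) above 0.25 as its measure of marker informativeness but does not state which estimator was used. A widely used definition, assumed here, is that of Botstein et al. (1980): PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i are the allele frequencies at a locus. The sketch below implements that formula; the function name and example frequencies are illustrative.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphic information content under the Botstein et al. formula.

    allele_freqs: allele frequencies at one locus; they must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(
        2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    print(f"two equal alleles:   {pic([0.5, 0.5]):.4f}")
    print(f"skewed (0.9 / 0.1):  {pic([0.9, 0.1]):.4f}")
    print(f"four equal alleles:  {pic([0.25] * 4):.4f}")
```

Under this formula a biallelic locus peaks at 0.375 when both alleles are equally common, so a locus with a 0.9/0.1 split would fall below a 0.25 cut-off, while loci with several balanced alleles clear it easily.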

  17. A resource of genome-wide single-nucleotide polymorphisms generated by RAD tag sequencing in the critically endangered European eel

    DEFF Research Database (Denmark)

    Pujolar, J.M.; Jacobsen, M.W.; Frydenberg, J.

    2013-01-01

    Reduced representation genome sequencing such as restriction-site-associated DNA (RAD) sequencing is finding increased use to identify and genotype large numbers of single-nucleotide polymorphisms (SNPs) in model and nonmodel species. We generated a unique resource of novel SNP markers for the European eel using the RAD sequencing approach that was simultaneously identified and scored in a genome-wide scan of 30 individuals. Whereas genomic resources are increasingly becoming available for this species, including the recent release of a draft genome, no genome-wide set of SNP markers...

  18. Application of Genomic SSR Locus Polymorphisms on the Identification and Classification of Chrysanthemum Cultivars in China

    Science.gov (United States)

    Zhang, Yuan; Dai, Silan; Hong, Yan; Song, Xuebin

    2014-01-01

    The Chinese traditional chrysanthemum is a notable group of chrysanthemums (Chrysanthemum×morifolium Ramat.) in which the phenotypic characteristics richly vary. At present, there is a serious controversy regarding homonyms and synonyms within this group. Moreover, the current international chrysanthemum classification systems are not comprehensive enough to be used on Chinese traditional chrysanthemums. Thus, we first identified a broad collection of 480 Chinese traditional chrysanthemum cultivars using the unique DNA fingerprints and molecular identities that were established by 20 simple sequence repeat markers. Five loci, which distinguished all of the selected cultivars, were identified as the core loci to establish unique fingerprints and molecular identities with 19 denary digits for each cultivar. A cluster analysis based on Nei's genetic distance indicated that the selected cultivars were clustered according to their horticultural classification. Population structure analysis was subsequently performed with K values ranging from 2 to 14, and the most likely estimate for the population structure was ten subpopulations, which was nearly consistent with the clustering result. Principal component analysis was further performed to verify the classification results. On the basis of the Q-matrices of K = 10, a total of 19 traits were found to be associated with 42 markers. Taken together, these results can serve as starting points for the identification and classification of chrysanthemums based on the polymorphism of microsatellite markers, which is beneficial to promote the marker-assisted breeding and international communication of this marvelous crop. PMID:25148046

  19. Application of genomic SSR locus polymorphisms on the identification and classification of chrysanthemum cultivars in China.

    Directory of Open Access Journals (Sweden)

    Yuan Zhang

    Full Text Available The Chinese traditional chrysanthemum is a notable group of chrysanthemums (Chrysanthemum×morifolium Ramat.) in which the phenotypic characteristics richly vary. At present, there is a serious controversy regarding homonyms and synonyms within this group. Moreover, the current international chrysanthemum classification systems are not comprehensive enough to be used on Chinese traditional chrysanthemums. Thus, we first identified a broad collection of 480 Chinese traditional chrysanthemum cultivars using the unique DNA fingerprints and molecular identities that were established by 20 simple sequence repeat markers. Five loci, which distinguished all of the selected cultivars, were identified as the core loci to establish unique fingerprints and molecular identities with 19 denary digits for each cultivar. A cluster analysis based on Nei's genetic distance indicated that the selected cultivars were clustered according to their horticultural classification. Population structure analysis was subsequently performed with K values ranging from 2 to 14, and the most likely estimate for the population structure was ten subpopulations, which was nearly consistent with the clustering result. Principal component analysis was further performed to verify the classification results. On the basis of the Q-matrices of K = 10, a total of 19 traits were found to be associated with 42 markers. Taken together, these results can serve as starting points for the identification and classification of chrysanthemums based on the polymorphism of microsatellite markers, which is beneficial to promote the marker-assisted breeding and international communication of this marvelous crop.

  20. Characterization of the complete mitochondrial genome and a set of polymorphic microsatellite markers through next-generation sequencing for the brown brocket deer Mazama gouazoubira

    OpenAIRE

    Renato Caparroz; Aline M.B. Mantellatto; David J. Bertioli; Marina G. Figueiredo; José Maurício B. Duarte

    2015-01-01

    The complete mitochondrial genome of the brown brocket deer Mazama gouazoubira and a set of polymorphic microsatellite markers were identified by 454-pyrosequencing. De novo genome assembly recovered 98% of the mitochondrial genome with a mean coverage of 9-fold. The mitogenome consisted of 16,356 base pairs that included 13 protein-coding genes, two ribosomal subunit genes, 22 transfer RNAs and the control region, as found in other deer. The genetic divergence between the mitogenome describe...

  1. Structural biology at York Structural Biology Laboratory; laboratory information management systems for structural genomics

    Czech Academy of Sciences Publication Activity Database

    Dohnálek, Jan

    2005-01-01

    Vol. 12, No. 1 (2005), p. 3. ISSN 1211-5894. [Meeting of Structural Biologists /4./. 10.03.2005-12.03.2005, Nové Hrady] R&D Projects: GA MŠk(CZ) 1K05008 Keywords: structural biology * LIMS * structural genomics Subject RIV: CD - Macromolecular Chemistry

  2. Detection and validation of single feature polymorphisms in cowpea (Vigna unguiculata L. Walp) using a soybean genome array

    Directory of Open Access Journals (Sweden)

    Wanamaker Steve

    2008-02-01

    Full Text Available Abstract Background Cowpea (Vigna unguiculata L. Walp) is an important food and fodder legume of the semiarid tropics and subtropics worldwide, especially in sub-Saharan Africa. High density genetic linkage maps are needed for marker assisted breeding but are not available for cowpea. A single feature polymorphism (SFP) is a microarray-based marker which can be used for high throughput genotyping and high density mapping. Results Here we report detection and validation of SFPs in cowpea using a readily available soybean (Glycine max) genome array. Robustified projection pursuit (RPP) was used for statistical analysis using RNA as a surrogate for DNA. Using a 15% outlying score cut-off, 1058 potential SFPs were enumerated between two parents of a recombinant inbred line (RIL) population segregating for several important traits including drought tolerance, Fusarium and brown blotch resistance, grain size and photoperiod sensitivity. Sequencing of 25 putative polymorphism-containing amplicons yielded a SFP probe set validation rate of 68%. Conclusion We conclude that the Affymetrix soybean genome array is a satisfactory platform for identification of some 1000's of SFPs for cowpea. This study provides an example of extension of genomic resources from a well supported species to an orphan crop. Presumably, other legume systems are similarly tractable to SFP marker development using existing legume array resources.

  3. Genome structure analysis of molluscs revealed whole genome duplication and lineage specific repeat variation.

    Science.gov (United States)

    Yoshida, Masa-aki; Ishikura, Yukiko; Moritaki, Takeya; Shoguchi, Eiichi; Shimizu, Kentaro K; Sese, Jun; Ogura, Atsushi

    2011-09-01

    Comparative genome structure analysis allows us to identify novel genes, repetitive sequences and gene duplications. To explore lineage-specific genomic changes in molluscs, which are a good model for the development of the nervous system in invertebrates, we conducted comparative genome structure analyses of three molluscs, pygmy squid, nautilus and scallop, using partial genome shotgun sequencing. The elements with the greatest effect on genome structural change are repetitive elements (REs), which cause expansion of genome size, and whole genome duplication, which produces a large number of novel functional genes. Therefore, we investigated the variation and proportion of REs and whole genome duplication. We first identified variations of REs in the three molluscan genomes by homology-based and de novo RE detection. The proportions of REs were 9.2%, 4.0%, and 3.8% in the pygmy squid, nautilus and scallop, respectively. We then estimated the genome sizes of the species as 2.1, 4.2 and 1.8 Gb, respectively, with 2× coverage frequency and DNA sequencing theory. We also performed a gene duplication assay based on coding genes, and found that large-scale duplication events occurred after divergence from the limpet Lottia, an out-group of the three molluscan species. Comparison of all the results suggested that RE expansion did not relate to the increase in genome size of nautilus. Despite its close relationship to nautilus, the squid has the largest proportion of REs and a smaller genome size than nautilus. We also identified lineage-specific RE and gene-family expansions, possibly related to acquisition of the most complicated eye and brain systems in the three species. Copyright © 2011 Elsevier B.V. All rights reserved.

  4. Genome-wide analysis of intraspecific DNA polymorphism in 'Micro-Tom', a model cultivar of tomato (Solanum lycopersicum).

    Science.gov (United States)

    Kobayashi, Masaaki; Nagasaki, Hideki; Garcia, Virginie; Just, Daniel; Bres, Cécile; Mauxion, Jean-Philippe; Le Paslier, Marie-Christine; Brunel, Dominique; Suda, Kunihiro; Minakuchi, Yohei; Toyoda, Atsushi; Fujiyama, Asao; Toyoshima, Hiromi; Suzuki, Takayuki; Igarashi, Kaori; Rothan, Christophe; Kaminuma, Eli; Nakamura, Yasukazu; Yano, Kentaro; Aoki, Koh

    2014-02-01

    Tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. The genome sequencing of the tomato cultivar 'Heinz 1706' was recently completed. To accelerate the progress of tomato genomics studies, systematic bioresources, such as mutagenized lines and full-length cDNA libraries, have been established for the cultivar 'Micro-Tom'. However, these resources cannot be utilized to their full potential without the completion of the genome sequencing of 'Micro-Tom'. We undertook the genome sequencing of 'Micro-Tom' and here report the identification of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) between 'Micro-Tom' and 'Heinz 1706'. The analysis demonstrated the presence of 1.23 million SNPs and 0.19 million indels between the two cultivars. The density of SNPs and indels was high in chromosomes 2, 5 and 11, but was low in chromosomes 6, 8 and 10. Three known mutations of 'Micro-Tom' were localized on chromosomal regions where the density of SNPs and indels was low, which was consistent with the fact that these mutations were relatively new and introgressed into 'Micro-Tom' during the breeding of this cultivar. We also report SNP analysis for two 'Micro-Tom' varieties that have been maintained independently in Japan and France, both of which have served as standard lines for 'Micro-Tom' mutant collections. Approximately 28,000 SNPs were identified between these two 'Micro-Tom' lines. These results provide high-resolution DNA polymorphic information on 'Micro-Tom' and represent a valuable contribution to the 'Micro-Tom'-based genomics resources.

  5. Polymorphism and Genetic Structure of LGB Gene (RsaI) in Valachian Sheep Population

    Directory of Open Access Journals (Sweden)

    Martina Miluchová

    2011-05-01

    Full Text Available The work was oriented to identification of beta-lactoglobulin gene (LGB) polymorphism and analysis of genotype structure in a population of Valachian sheep. LGB is the major milk whey protein in ruminants. AA and AB genotypes are associated with protein and casein content and curd yield. BB LGB genotype sheep were characterized by a significantly higher protein content of milk than sheep with the other two LGB genotypes, AA and AB. LGB variant AB is associated with higher body weight, while genotype AA could be linked with sheep wool density. The material involved 34 Valachian sheep. Ovine genomic DNA was isolated by the salting-out method and used to determine LGB genotypes by the PCR-RFLP method. The PCR products were digested with RsaI restriction enzyme. In the population included in the study, we detected all genotypes: homozygote genotype AA (6 animals), heterozygote genotype AB (12 animals) and homozygote genotype BB (16 animals). In this population, BB homozygotes (0.4706) were the most frequent, while AA homozygotes (0.1765) were the least frequent. This suggests a slight superiority of allele B (0.6471).
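
    The frequencies quoted in this abstract follow directly from the genotype counts, and the arithmetic can be checked with a short sketch. The function name and dictionary layout below are illustrative assumptions, not part of the cited study; only the counts (6 AA, 12 AB, 16 BB) come from the abstract.

```python
def genotype_and_allele_frequencies(counts):
    """Return (genotype frequencies, allele frequencies) for a biallelic locus.

    counts: dict with keys "AA", "AB", "BB" giving the number of animals.
    """
    n = sum(counts.values())
    genotype_freq = {g: c / n for g, c in counts.items()}
    # Each animal carries two alleles; heterozygotes contribute one of each.
    freq_a = (2 * counts["AA"] + counts["AB"]) / (2 * n)
    freq_b = (2 * counts["BB"] + counts["AB"]) / (2 * n)
    return genotype_freq, {"A": freq_a, "B": freq_b}


# Genotype counts reported in the abstract (34 Valachian sheep).
counts = {"AA": 6, "AB": 12, "BB": 16}
genotype_freq, allele_freq = genotype_and_allele_frequencies(counts)
print({g: round(f, 4) for g, f in genotype_freq.items()})
print({a: round(f, 4) for a, f in allele_freq.items()})
```

    With these counts the sketch reproduces the reported values: 6/34 and 16/34 for the two homozygotes, and (2 × 16 + 12)/68 for allele B.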

  6. Genome Structure of the Genus Azospirillum

    Science.gov (United States)

    Martin-Didonet, Claudia C. G.; Chubatsu, Leda S.; Souza, Emanuel M.; Kleina, Margareth; Rego, Fabiane G. M.; Rigo, Liu U.; Yates, M. Geoffrey; Pedrosa, Fabio O.

    2000-01-01

    Azospirillum species are plant-associated diazotrophs of the alpha subclass of Proteobacteria. The genomes of five of the six Azospirillum species were analyzed by pulsed-field gel electrophoresis. All strains possessed several megareplicons, some probably linear, and 16S ribosomal DNA hybridization indicated multiple chromosomes in genomes ranging in size from 4.8 to 9.7 Mbp. The nifHDK operon was identified in the largest replicon. PMID:10869094

  7. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Directory of Open Access Journals (Sweden)

    Huaiyong Luo

    Full Text Available The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers was established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.
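
    As a rough illustration of how di- and tri-nucleotide SSR loci can be located in a sequence, the sketch below uses a regular-expression scan. It is not the pipeline used in the cited study: the function name, the minimum of five repeat units and the toy sequence are all assumptions made for the example.

```python
import re


def find_ssrs(seq, min_repeats=5):
    """Find perfect di- and tri-nucleotide repeats.

    Returns a sorted list of (start, motif, repeat_count) tuples.
    min_repeats is an assumed threshold, not a value from the abstract.
    """
    seq = seq.upper()
    hits = []
    for motif_len in (2, 3):
        # A motif of motif_len bases followed by at least min_repeats - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Hypothetical toy sequence: an (AT)6 repeat and a (CAG)5 repeat.
toy = "GGC" + "AT" * 6 + "CCT" + "CAG" * 5 + "TT"
ssrs = find_ssrs(toy)
print(ssrs)
```

    A marker density such as the one reported above would then be the total sequence length divided by the number of loci found.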

  8. Comparative genomics of the relationship between gene structure and expression

    NARCIS (Netherlands)

    Ren, X.

    2006-01-01

    The relationship between the structure of genes and their expression is a relatively new aspect of genome organization and regulation. With more genome sequences and expression data becoming available, bioinformatics approaches can help the further elucidation of the relationships between gene

  9. Genome-wide data-mining of candidate human splice translational efficiency polymorphisms (STEPs) and an online database.

    Directory of Open Access Journals (Sweden)

    Christopher A Raistrick

    2010-10-01

    Full Text Available Variation in pre-mRNA splicing is common and in some cases caused by genetic variants in intronic splicing motifs. Recent studies into the insulin gene (INS) discovered a polymorphism in a 5' non-coding intron that influences the likelihood of intron retention in the final mRNA, extending the 5' untranslated region and maintaining protein quality. Retention was also associated with increased insulin levels, suggesting that such variants--splice translational efficiency polymorphisms (STEPs)--may relate to disease phenotypes through differential protein expression. We set out to explore the prevalence of STEPs in the human genome and validate this new category of protein quantitative trait loci (pQTL) using publicly available data. Gene transcript and variant data were collected and mined for candidate STEPs in motif regions. Sequences from transcripts containing potential STEPs were analysed for evidence of splice site recognition and an effect in expressed sequence tags (ESTs). 16 publicly released genome-wide association data sets of common diseases were searched for association to candidate polymorphisms with HapMap frequency data. Our study found 3324 candidate STEPs lying in motif sequences of 5' non-coding introns and further mining revealed 170 with transcript evidence of intron retention. 21 potential STEPs had EST evidence of intron retention or exon extension, as well as population frequency data for comparison. Results suggest that the insulin STEP was not a unique example and that many STEPs may occur genome-wide with potentially causal effects in complex disease. An online database of STEPs is freely accessible at http://dbstep.genes.org.uk/.

  10. Polymorphism identification and improved genome annotation of Brassica rapa through Deep RNA sequencing.

    Science.gov (United States)

    Devisetty, Upendra Kumar; Covington, Michael F; Tat, An V; Lekkala, Saradadevi; Maloof, Julin N

    2014-08-12

    The mapping and functional analysis of quantitative traits in Brassica rapa can be greatly improved with the availability of physically positioned, gene-based genetic markers and accurate genome annotation. In this study, deep transcriptome RNA sequencing (RNA-Seq) of Brassica rapa was undertaken with two objectives: SNP detection and improved transcriptome annotation. We performed SNP detection on two varieties that are parents of a mapping population to aid in development of a marker system for this population and subsequent development of high-resolution genetic map. An improved Brassica rapa transcriptome was constructed to detect novel transcripts and to improve the current genome annotation. This is useful for accurate mRNA abundance and detection of expression QTL (eQTLs) in mapping populations. Deep RNA-Seq of two Brassica rapa genotypes-R500 (var. trilocularis, Yellow Sarson) and IMB211 (a rapid cycling variety)-using eight different tissues (root, internode, leaf, petiole, apical meristem, floral meristem, silique, and seedling) grown across three different environments (growth chamber, greenhouse and field) and under two different treatments (simulated sun and simulated shade) generated 2.3 billion high-quality Illumina reads. A total of 330,995 SNPs were identified in transcribed regions between the two genotypes with an average frequency of one SNP in every 200 bases. The deep RNA-Seq reassembled Brassica rapa transcriptome identified 44,239 protein-coding genes. Compared with current gene models of B. rapa, we detected 3537 novel transcripts, 23,754 gene models had structural modifications, and 3655 annotated proteins changed. Gaps in the current genome assembly of B. rapa are highlighted by our identification of 780 unmapped transcripts. All the SNPs, annotations, and predicted transcripts can be viewed at http://phytonetworks.ucdavis.edu/. Copyright © 2014 Devisetty et al.

  11. Alu polymorphic insertions reveal genetic structure of north Indian populations.

    Science.gov (United States)

    Tripathi, Manorama; Tripathi, Piyush; Chauhan, Ugam Kumari; Herrera, Rene J; Agrawal, Suraksha

    2008-10-01

    The Indian subcontinent is characterized by the ancestral and cultural diversity of its people. Genetic input from several unique source populations and from the unique social architecture provided by the caste system has shaped the current genetic landscape of India. In the present study 200 individuals each from three upper-caste and four middle-caste Hindu groups and from two Muslim populations in North India were examined for 10 polymorphic Alu insertions (PAIs). The investigated PAIs exhibit high levels of polymorphism and average heterozygosity. Limited interpopulation variance and genetic flow in the present study suggest admixture. The results of this study demonstrate that, contrary to common belief, the caste system has not provided an impermeable barrier to genetic exchange among Indian groups.

  12. Genome Editing of Structural Variations: Modeling and Gene Correction.

    Science.gov (United States)

    Park, Chul-Yong; Sung, Jin Jea; Kim, Dong-Wook

    2016-07-01

    The analysis of chromosomal structural variations (SVs), such as inversions and translocations, was made possible by the completion of the human genome project and the development of genome-wide sequencing technologies. SVs contribute to genetic diversity and evolution, although some SVs can cause diseases such as hemophilia A in humans. Genome engineering technology using programmable nucleases (e.g., ZFNs, TALENs, and CRISPR/Cas9) has been rapidly developed, enabling precise and efficient genome editing for SV research. Here, we review advances in modeling and gene correction of SVs, focusing on inversion, translocation, and nucleotide repeat expansion. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Genome analysis of yellow fever virus of the ongoing outbreak in Brazil reveals polymorphisms

    Directory of Open Access Journals (Sweden)

    Myrna C Bonaldo

    Full Text Available The current yellow fever outbreak in Brazil is the most severe one in the country in recent times. It has rapidly spread to areas where YF virus (YFV) activity has not been observed for more than 70 years and vaccine coverage is almost null. Here, we sequenced the whole YFV genome of two naturally infected howler-monkeys (Alouatta clamitans) obtained from the Municipality of Domingos Martins, state of Espírito Santo, Brazil. These two ongoing-outbreak genome sequences are identical. They clustered in the 1E sub-clade (South America genotype I) along with the Brazilian and Venezuelan strains recently characterised from infections in humans and non-human primates that have been described in the last 20 years. However, we detected eight unique amino acid changes in the viral proteins, including the structural capsid protein (one change), and the components of the viral replicase complex, the NS3 (two changes) and NS5 (five changes) proteins, that could impact the capacity of viral infection in vertebrate and/or invertebrate hosts and spreading of the ongoing outbreak.

  14. Allele polymorphism of Nad1 gene of the Serbian spruce mitochondrial genome

    Directory of Open Access Journals (Sweden)

    Milovanović Jelena

    2007-01-01

    Full Text Available Serbian spruce (Picea omorika /Panč./ Purkyne), a Balkan Peninsula endemic and Tertiary relic, is a species whose survival is threatened by the constant restriction of its range caused by global changes in environmental conditions and adverse human impacts. The Serbian spruce seedling seed orchard at Godovik represents the base for improving the production of selected seeds of this species, which can be used as the initial material for the extension of its range. The allele polymorphism of the mitochondrial nad1 gene was analyzed in the five different Serbian spruce phenogroups from which the orchard is established. The obtained results are a contribution to a closer study of the causes of the postglacial intraspecific differentiation of Serbian spruce and the creation of the above phenogroups. The study results are significant for further breeding of this species based on better knowledge of the genetic structure of the species, its directed utilisation and the widening of its range.

  15. Genomic DNA Enrichment Using Sequence Capture Microarrays: a Novel Approach to Discover Sequence Nucleotide Polymorphisms (SNP) in Brassica napus L

    Science.gov (United States)

    Clarke, Wayne E.; Parkin, Isobel A.; Gajardo, Humberto A.; Gerhardt, Daniel J.; Higgins, Erin; Sidebottom, Christine; Sharpe, Andrew G.; Snowdon, Rod J.; Federico, Maria L.; Iniguez-Luy, Federico L.

    2013-01-01

    Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci –QTL– analysis) to traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome generating a novel data resource for research and breeding this crop species. PMID:24312619
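
    The transition/transversion split reported in this abstract rests on a simple classification rule: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any other substitution is a transversion. The sketch below illustrates that rule only; the function names and the five example SNPs are hypothetical and are not data from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    bases = {ref, alt}
    if ref == alt or not bases <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    same_class = bases <= PURINES or bases <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_fractions(snps):
    """Return the fraction of transitions and transversions in a list of (ref, alt)."""
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in snps:
        counts[classify_snp(ref, alt)] += 1
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}


# Hypothetical SNP calls, chosen only to exercise both classes.
example_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "T"), ("G", "C")]
fractions = ts_tv_fractions(example_snps)
print(fractions)
```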

  16. High density LD-based structural variations analysis in cattle genome.

    Directory of Open Access Journals (Sweden)

    Ricardo Salomon-Torres

    Full Text Available Genomic structural variations represent an important source of genetic variation in mammal genomes, thus, they are commonly related to phenotypic expressions. In this work, ∼ 770,000 single nucleotide polymorphism genotypes from 506 animals from 19 cattle breeds were analyzed. A simple LD-based structural variation was defined, and a genome-wide analysis was performed. After applying some quality control filters, for each breed and each chromosome we calculated the linkage disequilibrium (r2) of short range (≤ 100 Kb). We sorted SNP pairs by distance and obtained a set of LD means (called the expected means) using bins of 5 Kb. We identified 15,246 segments of at least 1 Kb, among the 19 breeds, consisting of sets of at least 3 adjacent SNPs so that, for each SNP, the r2 values with its neighbors in a 100 Kb range, to the right side of that SNP, were all bigger than, or all smaller than, the corresponding expected mean, and their P-values were significant after a Benjamini-Hochberg multiple testing correction. In addition, to account just for homogeneously distributed regions we considered only SNPs having at least 15 SNP neighbors within 100 Kb. We defined such segments as structural variations. By grouping all variations across all animals in the sample we defined 9,146 regions, involving a total of 53,137 SNPs, representing 6.40% (160.98 Mb) of the bovine genome. The identified structural variations covered 3,109 genes. Clustering analysis showed the relatedness of breeds given the geographic region in which they are evolving. In summary, we present an analysis of structural variations based on the deviation of the expected short range LD between SNPs in the bovine genome. With an intuitive and simple definition based only on SNPs data it was possible to discern closeness of breeds due to grouping by geographic region in which they are evolving.
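
    Three building blocks of the procedure described above can be sketched in a few lines: a pairwise r2, the per-bin "expected means", and the Benjamini-Hochberg step. This is a minimal sketch under stated assumptions, not the authors' implementation: r2 is approximated here as the squared Pearson correlation of unphased genotype dosages (0/1/2), the 0.05 false discovery rate is an assumed default, and all function names and inputs are hypothetical.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation of two genotype dosage vectors (0/1/2)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: r2 is undefined, treated as 0 here
    return sxy * sxy / (sxx * syy)


def expected_means(pairs, bin_size=5_000, max_dist=100_000):
    """Mean r2 per distance bin; pairs is an iterable of (distance_bp, r2)."""
    bins = {}
    for dist, r2 in pairs:
        if dist <= max_dist:
            bins.setdefault(dist // bin_size, []).append(r2)
    return {b: mean(values) for b, values in bins.items()}


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans, True where the null is rejected at FDR alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank  # largest rank passing the step-up criterion
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        rejected[i] = rank <= cutoff
    return rejected


# Hypothetical inputs, for illustration only.
print(r_squared([0, 1, 2, 1], [0, 1, 2, 1]))
print(expected_means([(1_000, 0.5), (4_000, 0.7), (6_000, 0.2), (150_000, 0.9)]))
print(benjamini_hochberg([0.01, 0.02, 0.03, 0.5]))
```

    A segment in the sense of the abstract would then be a run of adjacent SNPs whose r2 values all lie on the same side of the matching bin mean and whose P-values survive the correction.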

  17. Structural Genomics of Minimal Organisms: Pipeline and Results

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong-Hae; Kim, Rosalind; Adams, Paul; Chandonia, John-Marc

    2007-09-14

    The initial objective of the Berkeley Structural Genomics Center was to obtain near-complete three-dimensional (3D) structural information for all soluble proteins of two minimal organisms, the closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter has fewer than 700 genes. A semiautomated structural genomics pipeline was set up covering target selection, cloning, expression, purification, and ultimately structural determination. At the time of this writing, structural information for more than 93 percent of all soluble proteins of M. genitalium is available. This chapter summarizes the approaches taken by the authors' center.

  18. Visualization of RNA structure models within the Integrative Genomics Viewer.

    Science.gov (United States)

    Busan, Steven; Weeks, Kevin M

    2017-07-01

    Analyses of the interrelationships between RNA structure and function are increasingly important components of genomic studies. The SHAPE-MaP strategy enables accurate RNA structure probing and realistic structure modeling of kilobase-length noncoding RNAs and mRNAs. Existing tools for visualizing RNA structure models are not suitable for efficient analysis of long, structurally heterogeneous RNAs. In addition, structure models are often advantageously interpreted in the context of other experimental data and gene annotation information, for which few tools currently exist. We have developed a module within the widely used and well supported open-source Integrative Genomics Viewer (IGV) that allows visualization of SHAPE and other chemical probing data, including raw reactivities, data-driven structural entropies, and data-constrained base-pair secondary structure models, in context with linear genomic data tracks. We illustrate the usefulness of visualizing RNA structure in the IGV by exploring structure models for a large viral RNA genome, comparing bacterial mRNA structure in cells with its structure under cell- and protein-free conditions, and comparing a noncoding RNA structure modeled using SHAPE data with a base-pairing model inferred through sequence covariation analysis. © 2017 Busan and Weeks; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  19. 3D genome structure modeling by Lorentzian objective function.

    Science.gov (United States)

    Trieu, Tuan; Cheng, Jianlin

    2017-02-17

    The 3D structure of the genome plays a vital role in biological processes such as gene interaction, gene regulation, DNA replication and genome methylation. Advanced chromosomal conformation capture techniques, such as Hi-C and tethered conformation capture, can generate chromosomal contact data that can be used to computationally reconstruct 3D structures of the genome. We developed a novel restraint-based method that is capable of reconstructing 3D genome structures utilizing both intra- and inter-chromosomal contact data. Our method was robust to noise and performed well in comparison with a panel of existing methods on a controlled simulated data set. On a real Hi-C data set of the human genome, our method produced chromosome and genome structures that are consistent with 3D FISH data and with existing knowledge about the human chromosome and genome, such as chromosome territories and the clustering of small chromosomes in the nucleus center, with the exception of chromosome 18. The tool and experimental data are available at https://missouri.box.com/v/LorDG.

  20. The effect of single nucleotide polymorphisms from genome wide association studies in multiple sclerosis on gene expression.

    Directory of Open Access Journals (Sweden)

    Adam E Handel

    2010-04-01

    Full Text Available Multiple sclerosis (MS) is a complex neurological disorder. Its aetiology involves both environmental and genetic factors. Recent genome-wide association studies have identified a number of single nucleotide polymorphisms (SNPs) associated with susceptibility to MS. We investigated whether these genetic variations were associated with alteration in gene expression. We used a database of mRNA expression and genetic variation derived from immortalised peripheral lymphocytes to investigate polymorphisms associated with MS for correlation with gene expression. Several SNPs were found to be associated with changes in expression: in particular two with HLA-DQA1, HLA-DQA2, HLA-DQB1, HLA-DRB1, HLA-DRB4 and HLA-DRB5, one with ZFP57, one with CD58, two with IL7 and FAM164A, and one with FAM119B, TSFM and KUB3. We found minimal cross-over with a recent whole genome expression study in MS patients. We have shown that many susceptibility loci in MS are associated with changes in gene expression using an unbiased expression database. Several of these findings suggest novel gene candidates underlying the effects of MS-associated genetic variation.

  1. Polymorphism in Elemental Silicon: Probabilistic Interpretation of the Realizability of Metastable Structures

    Energy Technology Data Exchange (ETDEWEB)

    Stevanovic, Vladan [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Jones, Eric [National Renewable Energy Laboratory (NREL), Golden, CO (United States)

    2017-11-03

    With few systems of technological interest having been studied as extensively as elemental silicon, there currently exists a wide disparity between the number of predicted low-energy silicon polymorphs and those that have been experimentally realized as metastable at ambient conditions. We put forward an explanation for this disparity wherein the likelihood of formation of a given polymorph under near-equilibrium conditions can be estimated on the basis of mean-field isothermal-isobaric (N,p,T) ensemble statistics. The probability that a polymorph will be experimentally realized is shown to depend upon both the hypervolume of that structure's potential energy basin of attraction and a Boltzmann factor weight containing the polymorph's potential enthalpy per particle. Both attributes are calculated using density functional theory relaxations of randomly generated initial structures. We find that the metastable polymorphism displayed by silicon can be accounted for using this framework to the exclusion of a very large number of other low-energy structures.

  2. Crystalline structure of the marketed form of Rifampicin: a case of conformational and charge transfer polymorphism

    Science.gov (United States)

    de Pinho Pessoa Nogueira, Luciana; de Oliveira, Yara S.; de C. Fonseca, Jéssica; Costa, Wendell S.; Raffin, Fernanda N.; Ellena, Javier; Ayala, Alejandro Pedro

    2018-03-01

    Rifampicin is a semi-synthetic drug derived from rifamycin B, and is currently part of the fixed dose combination tablet formulations used in the treatment of tuberculosis. It is also used in leprosy polychemotherapy and prophylaxis. Both diseases are classified as neglected by the World Health Organization. Rifampicin is a polymorphic drug and its desirable polymorphic form is labeled as II; the main goal of this study is the elucidation of its crystalline structure. Polymorph II is characterized by two molecules with different conformations in the asymmetric unit and the following lattice parameters: a = 14.0760 (10) Å, b = 17.5450 (10) Å, c = 17.5270 (10) Å, β = 92.15°. In contrast to the previously reported structures, a charge transfer from the hydroxyl group of the naphthoquinone of one conformer to the nitrogen of the piperazine group of the second conformer was observed. The relevance of this crystalline structure, which is the preferred polymorph for pharmaceutical formulations, was evidenced by analyzing raw materials with polymorphic mixtures. Thus, the results presented in this contribution close an old information gap, allowing the complete solid-state characterization of rifampicin.

  3. Genome Plasticity and Polymorphisms in Critical Genes Correlate with Increased Virulence of Dutch Outbreak-Related Coxiella burnetii Strains

    Directory of Open Access Journals (Sweden)

    Runa Kuley

    2017-08-01

    Full Text Available Coxiella burnetii is an obligate intracellular bacterium and the etiological agent of Q fever. During 2007–2010 the largest Q fever outbreak ever reported occurred in The Netherlands. It is anticipated that strains from this outbreak demonstrated an increased zoonotic potential as more than 40,000 individuals were assumed to be infected. The acquisition of novel genetic factors by these C. burnetii outbreak strains, such as virulence-related genes, has frequently been proposed and discussed, but has not yet been proven. In the present study, the whole genome sequence of several Dutch strains (CbNL01 and CbNL12 genotypes), a few additionally selected strains from different geographical locations and publicly available genome sequences were used for a comparative bioinformatics approach. The study focuses on the identification of specific genetic differences in the outbreak related CbNL01 strains compared to other C. burnetii strains. In this approach we investigated the phylogenetic relationship and genomic aspects of virulence and host-specificity. Phylogenetic clustering of whole genome sequences showed a genotype-specific clustering that correlated with the clustering observed using Multiple Locus Variable-number Tandem Repeat Analysis (MLVA). Ortholog analysis on predicted genes and single nucleotide polymorphism (SNP) analysis of complete genome sequences demonstrated the presence of genotype-specific gene contents and SNP variations in C. burnetii strains. It also demonstrated that the currently used MLVA genotyping methods are highly discriminatory for the investigated outbreak strains. In the fully reconstructed genome sequence of the Dutch outbreak NL3262 strain of the CbNL01 genotype, a relatively large number of transposon-linked genes were identified as compared to the other published complete genome sequences of C. burnetii. Additionally, large numbers of SNPs in its membrane proteins and predicted virulence-associated genes were identified

  4. The BDNF Val66Met Polymorphism Affects the Vulnerability of the Brain Structural Network

    Directory of Open Access Journals (Sweden)

    Chang-hyun Park

    2017-08-01

    Full Text Available Val66Met, a naturally occurring polymorphism in the human brain-derived neurotrophic factor (BDNF) gene resulting in a valine (Val) to methionine (Met) substitution at codon 66, plays an important role in neuroplasticity. While the effect of the BDNF Val66Met polymorphism on local brain structures has previously been examined, its impact on the configuration of the graph-based white matter structural networks is yet to be investigated. In the current study, we assessed the effect of the BDNF polymorphism on the network properties and robustness of the graph-based white matter structural networks. Graph theory was employed to investigate the structural connectivity derived from white matter tractography in two groups, Val homozygotes (n = 18) and Met-allele carriers (n = 55). Although there were no differences in the global network measures, including global efficiency, local efficiency, and modularity, between the two genotype groups, we found an effect of the BDNF Val66Met polymorphism on the robustness properties of the white matter structural networks. Specifically, the white matter structural networks of the Met-allele carrier group showed higher vulnerability to targeted removal of central nodes as compared with those of the Val homozygote group. These findings suggest that the central role of the BDNF Val66Met polymorphism with regard to neuroplasticity may be associated with inherent differences in the robustness of the white matter structural network according to the genetic variants. Furthermore, greater susceptibility to brain disorders in Met-allele carriers may be understood as being due to their limited stability in white matter structural connectivity.
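
    The idea of "vulnerability to targeted removal of central nodes" in the abstract above can be illustrated with a small sketch. It assumes node degree as the centrality measure and global efficiency as the outcome; the abstract does not say which measures the study used, so both are choices made for illustration. The toy graph and the function names are hypothetical.

```python
from collections import deque


def global_efficiency(adj):
    """Mean inverse shortest-path length over ordered node pairs.

    adj maps each node to the set of its neighbours (unweighted,
    undirected). Unreachable pairs contribute zero.
    """
    nodes = list(adj)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (n * (n - 1))


def remove_node(adj, node):
    """Return a copy of the graph without `node` and its edges."""
    return {u: nbrs - {node} for u, nbrs in adj.items() if u != node}


def targeted_attack(adj, steps):
    """Remove the highest-degree node `steps` times; track efficiency."""
    history = [global_efficiency(adj)]
    for _ in range(steps):
        hub = max(adj, key=lambda u: len(adj[u]))
        adj = remove_node(adj, hub)
        history.append(global_efficiency(adj))
    return history


# Toy star network (hypothetical): node 0 is the hub linked to nodes 1..4.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
history = targeted_attack(star, 1)
print(history)
```

    A network whose efficiency falls faster along such a removal sequence would be called more vulnerable in this sense. A star is the extreme case, since removing the hub disconnects every remaining node.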

  5. Structural dynamics of retroviral genome and the packaging.

    Science.gov (United States)

    Miyazaki, Yasuyuki; Miyake, Ariko; Nomaguchi, Masako; Adachi, Akio

    2011-01-01

    Retroviruses can cause diseases such as AIDS, leukemia, and tumors, but are also used as vectors for human gene therapy. All retroviruses, except foamy viruses, package two copies of unspliced genomic RNA into their progeny viruses. Understanding the molecular mechanisms of retroviral genome packaging will aid the design of new anti-retroviral drugs targeting the packaging process and improve the efficacy of retroviral vectors. Retroviral genomes have to be specifically recognized by the cognate nucleocapsid domain of the Gag polyprotein from among an excess of cellular and spliced viral mRNA. Extensive virological and structural studies have revealed how retroviral genomic RNA is selectively packaged into the viral particles. The genomic area responsible for the packaging is generally located in the 5' untranslated region (5' UTR), and contains dimerization site(s). Recent studies have shown that retroviral genome packaging is modulated by structural changes of RNA at the 5' UTR accompanied by the dimerization. In this review, we focus on three representative retroviruses, Moloney murine leukemia virus, human immunodeficiency virus type 1 and 2, and describe the molecular mechanism of retroviral genome packaging.

  6. Structural dynamics of retroviral genome and the packaging

    Directory of Open Access Journals (Sweden)

    Yasuyuki Miyazaki

    2011-12-01

    Full Text Available Retroviruses can cause diseases such as AIDS, leukemia and tumors, but are also used as vectors for human gene therapy. All retroviruses, except foamy viruses, package two copies of unspliced genomic RNA into their progeny viruses. Understanding the molecular mechanisms of retroviral genome packaging will aid the design of new anti-retroviral drugs targeting the packaging process and improve the efficacy of retroviral vectors. Retroviral genomes have to be specifically recognized by the cognate nucleocapsid (NC) domain of the Gag polyprotein from among an excess of cellular and spliced viral mRNA. Extensive virological and structural studies have revealed how retroviral genomic RNA is selectively packaged into the viral particles. The genomic area responsible for the packaging is generally located in the 5’ untranslated region (5’ UTR), and contains dimerization site(s). Recent studies have shown that retroviral genome packaging is modulated by structural changes of RNA at the 5’ UTR accompanied by the dimerization. In this review, we focus on three representative retroviruses, Moloney murine leukemia virus (MoMLV), human immunodeficiency virus type 1 (HIV-1) and 2 (HIV-2), and describe the molecular mechanism of retroviral genome packaging.

  7. Structural variation in the chicken genome identified by paired-end next-generation DNA sequencing of reduced representation libraries

    Directory of Open Access Journals (Sweden)

    Okimoto Ron

    2011-02-01

    Full Text Available Abstract Background Variation within individual genomes ranges from single nucleotide polymorphisms (SNPs) to kilobase, and even megabase, sized structural variants (SVs), such as deletions, insertions, inversions, and more complex rearrangements. Although much is known about the extent of SVs in humans and mice, species in which they exert significant effects on phenotypes, very little is known about the extent of SVs in the 2.5-times smaller and less repetitive genome of the chicken. Results We identified hundreds of shared and divergent SVs in four commercial chicken lines relative to the reference chicken genome. The majority of SVs were found in intronic and intergenic regions, and we also found SVs in the coding regions. To identify the SVs, we combined high-throughput short read paired-end sequencing of genomic reduced representation libraries (RRLs) of pooled samples from 25 individuals and computational mapping of DNA sequences from a reference genome. Conclusion We provide a first glimpse of the high abundance of small structural genomic variations in the chicken. Extrapolating our results, we estimate that there are thousands of rearrangements in the chicken genome, the majority of which are located in non-coding regions. We observed that structural variation contributes to genetic differentiation among current domesticated chicken breeds and the Red Jungle Fowl. We expect that, because of their high abundance, SVs might explain phenotypic differences and play a role in the evolution of the chicken genome. Finally, our study exemplifies an efficient and cost-effective approach for identifying structural variation in sequenced genomes.

  8. Crystal and molecular structure of a second, high-density polymorph of silver malonate

    Science.gov (United States)

    Prakasha Reddy, J.; Foxman, Bruce M.

    2008-11-01

    A new orthorhombic polymorph of silver malonate (II) has been synthesized and structurally characterized. Compound II crystallizes in space group Pnma, with a = 12.8180(11), b = 9.2479(8), c = 4.0219(3) Å; V = 476.75(7) Å³; Z = 4; ρcalc = 4.427 g cm⁻³. Full-matrix least-squares refinement converged to R1 = 0.0099 (I > 2σ(I), 765 data) and wR2 = 0.0264 (F², 785 data, 51 parameters). The familiar eight-membered Ag2(RCO2)2 ring, characteristic of most silver(I) carboxylate complexes, has a Ag-Ag distance of 2.977(1) Å. Puckered sheets in the crystal ab plane are connected along c to form a three-dimensional coordination polymer. The density of II is >30% higher than that of the monoclinic polymorph (I), first characterized in 1981. A survey of the density ratios of polymorphs in the Cambridge Structural Database indicates that virtually all of the pairs of entries having a density ratio >1.2 involve at least one form with significant void volume, as estimated using PLATON. However, neither I nor II has significant void volume, which suggests that the silver malonate polymorphs represent one of the largest density differences observed to date for a polymorphic pair. To date, it has not been possible to prepare I in our laboratories.
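
    The reported cell volume and calculated density can be checked from the cell parameters with a few lines of arithmetic. The sketch below assumes the formula unit Ag2C3H2O4 (disilver malonate), which the abstract does not state explicitly, and uses rounded standard atomic weights; the function names are illustrative only.

```python
# Avogadro constant (exact SI value), in mol^-1.
AVOGADRO = 6.02214076e23

# Standard atomic weights in g/mol (rounded; an assumption of this sketch).
ATOMIC_WEIGHT = {"Ag": 107.868, "C": 12.011, "H": 1.008, "O": 15.999}


def orthorhombic_volume(a, b, c):
    """Cell volume in cubic angstroms; all cell angles are 90 degrees."""
    return a * b * c


def calculated_density(formula, z, volume_a3):
    """Density in g/cm^3 from formula-unit composition, Z and cell volume."""
    molar_mass = sum(ATOMIC_WEIGHT[el] * n for el, n in formula.items())
    volume_cm3 = volume_a3 * 1e-24  # 1 cubic angstrom = 1e-24 cm^3
    return z * molar_mass / (AVOGADRO * volume_cm3)


# Cell parameters reported for polymorph II (space group Pnma).
volume = orthorhombic_volume(12.8180, 9.2479, 4.0219)
# Assumed formula unit Ag2C3H2O4 with Z = 4, as reported.
density = calculated_density({"Ag": 2, "C": 3, "H": 2, "O": 4}, 4, volume)
print(f"V = {volume:.2f} A^3, density = {density:.3f} g/cm^3")
```

    Under these assumptions the result should land close to the reported V = 476.75 Å³ and ρcalc = 4.427 g cm⁻³, which also serves as a sanity check on the assumed formula unit.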

  9. Whole Genome and Core Genome Multilocus Sequence Typing and Single Nucleotide Polymorphism Analyses of Listeria monocytogenes Isolates Associated with an Outbreak Linked to Cheese, United States, 2013

    Science.gov (United States)

    Luo, Yan; Carleton, Heather; Timme, Ruth; Melka, David; Muruvanda, Tim; Wang, Charles; Kastanis, George; Katz, Lee S.; Turner, Lauren; Fritzinger, Angela; Moore, Terence; Stones, Robert; Blankenship, Joseph; Salter, Monique; Parish, Mickey; Hammack, Thomas S.; Evans, Peter S.; Tarr, Cheryl L.; Allard, Marc W.; Strain, Errol A.; Brown, Eric W.

    2017-01-01

    ABSTRACT Epidemiological findings of a listeriosis outbreak in 2013 implicated Hispanic-style cheese produced by company A, and pulsed-field gel electrophoresis (PFGE) and whole genome sequencing (WGS) were performed on clinical isolates and representative isolates collected from company A cheese and environmental samples during the investigation. The results strengthened the evidence for cheese as the vehicle. Surveillance sampling and WGS 3 months later revealed that the equipment purchased by company B from company A yielded an environmental isolate highly similar to all outbreak isolates. The whole genome and core genome multilocus sequence typing and single nucleotide polymorphism (SNP) analyses results were compared to demonstrate the maximum discriminatory power obtained by using multiple analyses, which were needed to differentiate outbreak-associated isolates from a PFGE-indistinguishable isolate collected in a nonimplicated food source in 2012. This unrelated isolate differed from the outbreak isolates by only 7 to 14 SNPs, and as a result, the minimum spanning tree from the whole genome analyses and certain variant calling approach and phylogenetic algorithm for core genome-based analyses could not provide differentiation between unrelated isolates. Our data also suggest that SNP/allele counts should always be combined with WGS clustering analysis generated by phylogenetically meaningful algorithms on a sufficient number of isolates, and the SNP/allele threshold alone does not provide sufficient evidence to delineate an outbreak. The putative prophages were conserved across all the outbreak isolates. All outbreak isolates belonged to clonal complex 5 and serotype 1/2b and had an identical inlA sequence which did not have premature stop codons. IMPORTANCE In this outbreak, multiple analytical approaches were used for maximum discriminatory power. A PFGE-matched, epidemiologically unrelated isolate had high genetic similarity to the outbreak
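
    The SNP counts between isolates discussed in this record are, at their simplest, pairwise counts of differing positions in an alignment. The sketch below shows only that counting step on a toy alignment. The isolate names and sequences are made up, and skipping gaps and ambiguous bases is one common convention, not necessarily the one used in the study.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ.

    Positions with a gap or ambiguous base ('-' or 'N') in either
    sequence are skipped, which is one common convention.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    return sum(
        1
        for x, y in zip(seq_a, seq_b)
        if x != y and x not in skip and y not in skip
    )


def pairwise_distances(alignment):
    """Return {(name_a, name_b): SNP count} for every pair of isolates."""
    return {
        (a, b): snp_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Toy alignment with made-up isolate names and sequences.
alignment = {
    "outbreak_1": "ACGTACGTAC",
    "outbreak_2": "ACGTACGTAC",
    "unrelated": "ACGAACGTNC",
}
distances = pairwise_distances(alignment)
for pair, count in distances.items():
    print(pair, count)
```

    As the abstract argues, such counts alone do not delineate an outbreak; they would normally be read alongside a phylogenetic clustering of a sufficient number of isolates.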

  10. Structural genomics of infectious disease drug targets: the SSGCID

    International Nuclear Information System (INIS)

    Stacy, Robin; Begley, Darren W.; Phan, Isabelle; Staker, Bart L.; Van Voorhis, Wesley C.; Varani, Gabriele; Buchko, Garry W.; Stewart, Lance J.; Myler, Peter J.

    2011-01-01

    An introduction and overview of the focus, goals and overall mission of the Seattle Structural Genomics Center for Infectious Disease (SSGCID) is given. The Seattle Structural Genomics Center for Infectious Disease (SSGCID) is a consortium of researchers at Seattle BioMed, Emerald BioStructures, the University of Washington and Pacific Northwest National Laboratory that was established to apply structural genomics approaches to drug targets from infectious disease organisms. The SSGCID is currently funded over a five-year period by the National Institute of Allergy and Infectious Diseases (NIAID) to determine the three-dimensional structures of 400 proteins from a variety of Category A, B and C pathogens. Target selection engages the infectious disease research and drug-therapy communities to identify drug targets, essential enzymes, virulence factors and vaccine candidates of biomedical relevance to combat infectious diseases. The protein-expression systems, purified proteins, ligand screens and three-dimensional structures produced by SSGCID constitute a valuable resource for drug-discovery research, all of which is made freely available to the greater scientific community. This issue of Acta Crystallographica Section F, entirely devoted to the work of the SSGCID, covers the details of the high-throughput pipeline and presents a series of structures from a broad array of pathogenic organisms. Here, a background is provided on the structural genomics of infectious disease, the essential components of the SSGCID pipeline are discussed and a survey of progress to date is presented

  11. Genome-wide association study using high-density single nucleotide polymorphism arrays and whole-genome sequences for clinical mastitis traits in dairy cattle.

    Science.gov (United States)

    Sahana, G; Guldbrandtsen, B; Thomsen, B; Holm, L-E; Panitz, F; Brøndum, R F; Bendixen, C; Lund, M S

    2014-11-01

    Mastitis is a mammary disease that frequently affects dairy cattle. Despite considerable research on the development of effective prevention and treatment strategies, mastitis continues to be a significant issue in bovine veterinary medicine. To identify major genes that affect mastitis in dairy cattle, 6 chromosomal regions on Bos taurus autosome (BTA) 6, 13, 16, 19, and 20 were selected from a genome scan for 9 mastitis phenotypes using imputed high-density single nucleotide polymorphism arrays. Association analyses using sequence-level variants for the 6 targeted regions were carried out to map causal variants using whole-genome sequence data from 3 breeds. The quantitative trait loci (QTL) discovery population comprised 4,992 progeny-tested Holstein bulls, and QTL were confirmed in 4,442 Nordic Red and 1,126 Jersey cattle. The targeted regions were imputed to the sequence level. The highest association signal for clinical mastitis was observed on BTA 6 at 88.97 Mb in Holstein cattle and was confirmed in Nordic Red cattle. The peak association region on BTA 6 contained 2 genes: vitamin D-binding protein precursor (GC) and neuropeptide FF receptor 2 (NPFFR2), which, based on known biological functions, are good candidates for affecting mastitis. However, strong linkage disequilibrium in this region prevented conclusive determination of the causal gene. A different QTL on BTA 6 located at 88.32 Mb in Holstein cattle affected mastitis. In addition, QTL on BTA 13 and 19 were confirmed to segregate in Nordic Red cattle and QTL on BTA 16 and 20 were confirmed in Jersey cattle. Although several candidate genes were identified in these targeted regions, it was not possible to identify a gene or polymorphism as the causal factor for any of these regions. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  12. Structural Genomics and Drug Discovery for Infectious Diseases

    International Nuclear Information System (INIS)

    Anderson, W.F.

    2009-01-01

    The application of structural genomics methods and approaches to proteins from organisms causing infectious diseases is making available the three dimensional structures of many proteins that are potential drug targets and laying the groundwork for structure aided drug discovery efforts. There are a number of structural genomics projects with a focus on pathogens that have been initiated worldwide. The Center for Structural Genomics of Infectious Diseases (CSGID) was recently established to apply state-of-the-art high throughput structural biology technologies to the characterization of proteins from the National Institute for Allergy and Infectious Diseases (NIAID) category A-C pathogens and organisms causing emerging, or re-emerging infectious diseases. The target selection process emphasizes potential biomedical benefits. Selected proteins include known drug targets and their homologs, essential enzymes, virulence factors and vaccine candidates. The Center also provides a structure determination service for the infectious disease scientific community. The ultimate goal is to generate a library of structures that are available to the scientific community and can serve as a starting point for further research and structure aided drug discovery for infectious diseases. To achieve this goal, the CSGID will determine protein crystal structures of 400 proteins and protein-ligand complexes using proven, rapid, highly integrated, and cost-effective methods for such determination, primarily by X-ray crystallography. High throughput crystallographic structure determination is greatly aided by frequent, convenient access to high-performance beamlines at third-generation synchrotron X-ray sources.

  13. Structural Genomics and Drug Discovery for Infectious Diseases

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, W.F.

    2010-09-03

    The application of structural genomics methods and approaches to proteins from organisms causing infectious diseases is making available the three dimensional structures of many proteins that are potential drug targets and laying the groundwork for structure aided drug discovery efforts. There are a number of structural genomics projects with a focus on pathogens that have been initiated worldwide. The Center for Structural Genomics of Infectious Diseases (CSGID) was recently established to apply state-of-the-art high throughput structural biology technologies to the characterization of proteins from the National Institute for Allergy and Infectious Diseases (NIAID) category A-C pathogens and organisms causing emerging, or re-emerging infectious diseases. The target selection process emphasizes potential biomedical benefits. Selected proteins include known drug targets and their homologs, essential enzymes, virulence factors and vaccine candidates. The Center also provides a structure determination service for the infectious disease scientific community. The ultimate goal is to generate a library of structures that are available to the scientific community and can serve as a starting point for further research and structure aided drug discovery for infectious diseases. To achieve this goal, the CSGID will determine protein crystal structures of 400 proteins and protein-ligand complexes using proven, rapid, highly integrated, and cost-effective methods for such determination, primarily by X-ray crystallography. High throughput crystallographic structure determination is greatly aided by frequent, convenient access to high-performance beamlines at third-generation synchrotron X-ray sources.

  14. Polymorphisms in AHI1 are not associated with type 2 diabetes or related phenotypes in Danes: non-replication of a genome-wide association result

    DEFF Research Database (Denmark)

    Holmkvist, J; Anthonsen, S; Wegner, L

    2008-01-01

    AIMS/HYPOTHESIS: A genome-wide association study recently identified an association between common variants, rs1535435 and rs9494266, in the AHI1 gene and type 2 diabetes. The aim of the present study was to investigate the putative association between these polymorphisms and type 2 diabetes or t...

  15. Identification of a novel FGFRL1 MicroRNA target site polymorphism for bone mineral density in meta-analyses of genome-wide association studies

    NARCIS (Netherlands)

    T. Niu (Tianhua); N. Liu (Ning); M. Zhao (Ming); G. Xie (Guie); L. Zhang (Lei); J. Li (Jian); Y.-F. Pei (Yu-Fang); H. Shen (Hui); X. Fu (Xiaoying); H. He (Hao); S. Lu (Shan); X. Chen (Xiangding); L. Tan (Lijun); T.-L. Yang (Tie-Lin); Y. Guo (Yan); P.J. Leo (Paul); E.L. Duncan (Emma); J. Shen (Jie); Y.-F. Guo (Yan-fang); G.C. Nicholson (Geoffrey); R.L. Prince (Richard L.); J.A. Eisman (John); G. Jones (Graeme); P.N. Sambrook (Philip); X. Hu (Xiang); P.M. Das (Partha M.); Q. Tian (Qing); X.-Z. Zhu (Xue-Zhen); C.J. Papasian (Christopher J.); M.A. Brown (Matthew); A.G. Uitterlinden (André); Y.-P. Wang (Yu-Ping); S. Xiang (Shuanglin); H.-W. Deng

    2015-01-01

    MicroRNAs (miRNAs) are critical post-transcriptional regulators. Based on a previous genome-wide association (GWA) scan, we conducted a polymorphism in microRNAs' Target Sites (poly-miRTS)-centric multistage meta-analysis for lumbar spine (LS)-, total hip (HIP)-, and femoral neck

  16. Structural determinants and mechanism of HIV-1 genome packaging.

    Science.gov (United States)

    Lu, Kun; Heng, Xiao; Summers, Michael F

    2011-07-22

    Like all retroviruses, the human immunodeficiency virus selectively packages two copies of its unspliced RNA genome, both of which are utilized for strand-transfer-mediated recombination during reverse transcription, a process that enables rapid evolution under environmental and chemotherapeutic pressures. The viral RNA appears to be selected for packaging as a dimer, and there is evidence that dimerization and packaging are mechanistically coupled. Both processes are mediated by interactions between the nucleocapsid domains of a small number of assembling viral Gag polyproteins and RNA elements within the 5'-untranslated region of the genome. A number of secondary structures have been predicted for regions of the genome that are responsible for packaging, and high-resolution structures have been determined for a few small RNA fragments and protein-RNA complexes. However, major questions regarding the RNA structures (and potentially the structural changes) that are responsible for dimeric genome selection remain unanswered. Here, we review efforts that have been made to identify the molecular determinants and mechanism of human immunodeficiency virus type 1 genome packaging. Copyright © 2011 Elsevier Ltd. All rights reserved.

  17. The Impact of Structural Genomics: Expectations and Outcomes

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2005-12-21

    Structural Genomics (SG) projects aim to expand our structural knowledge of biological macromolecules, while lowering the average costs of structure determination. We quantitatively analyzed the novelty, cost, and impact of structures solved by SG centers, and contrasted these results with traditional structural biology. The first structure from a protein family is particularly important to reveal the fold and ancient relationships to other proteins. In the last year, approximately half of such structures were solved at a SG center rather than in a traditional laboratory. Furthermore, the cost of solving a structure at the most efficient U.S. center has now dropped to one-quarter the estimated cost of solving a structure by traditional methods. However, top structural biology laboratories are much more efficient than the average, and comparable to SG centers despite working on very challenging structures. Moreover, traditional structural biology papers are cited significantly more often, suggesting greater current impact.

  18. Polymorphism, Intermolecular Interactions, and Spectroscopic Properties in Crystal Structures of Sulfonamides.

    Science.gov (United States)

    Sainz-Díaz, C Ignacio; Francisco-Márquez, Misaela; Soriano-Correa, Catalina

    2018-01-01

    The sulfonamide family of antibiotics has been used intensively worldwide in human therapeutics and farm livestock for decades. Intermolecular interactions of these sulfonamides are important to understand their bioactivity and biodegradation. These interactions are also responsible for their supramolecular structures. The intermolecular interactions in the crystal polymorphs of the sulfonamides sulfamethoxypyridazine and sulfamethoxydiazine, as models of sulfonamides, have been studied by using quantum mechanical calculations. Different conformations of the sulfonamide molecules have been detected in the crystal polymorphs. Several intermolecular patterns have been studied to understand the molecular packing behavior in these antibiotics. Strong intermolecular hydrogen bonds and π-π interactions are the main driving forces for crystal packing in these sulfonamides. Differences in stability between polymorphs can explain the experimental behavior of these crystal forms. The calculated infrared spectroscopy frequencies explain the main intermolecular interactions in these crystals. Copyright © 2018 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.

  19. Multi-scale structural community organisation of the human genome.

    Science.gov (United States)

    Boulos, Rasha E; Tremblay, Nicolas; Arneodo, Alain; Borgnat, Pierre; Audit, Benjamin

    2017-04-11

    Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This raises a new methodological challenge for computational biology: objectively extracting from these data the structural motifs characteristic of genome organisation. We deployed the fast multi-scale community mining algorithm based on spectral graph wavelets to characterise the networks of intra-chromosomal interactions in human cell lines. We observed that there exist structural domains of all sizes up to chromosome length and demonstrated that the set of structural communities forms a hierarchy of chromosome segments. Hence, at all scales, chromosome folding predominantly involves interactions between neighbouring sites rather than the formation of links between distant loci. Multi-scale structural decomposition of human chromosomes provides an original framework to question structural organisation and its relationship to functional regulation across the scales. By construction the proposed methodology is independent of the precise assembly of the reference genome and is thus directly applicable to genomes whose assembly is not fully determined.

  20. Development and validation of 697 novel polymorphic genomic and EST-SSR markers in the American cranberry (Vaccinium macrocarpon Ait.).

    Science.gov (United States)

    Schlautman, Brandon; Fajardo, Diego; Bougie, Tierney; Wiesman, Eric; Polashock, James; Vorsa, Nicholi; Steffan, Shawn; Zalapa, Juan

    2015-01-27

    The American cranberry, Vaccinium macrocarpon Ait., is an economically important North American fruit crop that is consumed because of its unique flavor and potential health benefits. However, a lack of abundant, genome-wide molecular markers has limited the adoption of modern molecular assisted selection approaches in cranberry breeding programs. To increase the number of available markers in the species, this study identified, tested, and validated microsatellite markers from existing nuclear and transcriptome sequencing data. In total, new primers were designed, synthesized, and tested for 979 SSR loci; 697 of the markers amplified allele patterns consistent with single locus segregation in a diploid organism and were considered polymorphic. Of the 697 polymorphic loci, 507 were selected for additional genetic diversity and segregation analyses in 29 cranberry genotypes. More than 95% of the 507 loci did not display segregation distortion at the p 0.25. This comprehensive collection of developed and validated microsatellite loci represents a substantial addition to the molecular tools available for geneticists, genomicists, and breeders in cranberry and Vaccinium.

  1. Genomic variations of Mycoplasma capricolum subsp capripneumoniae detected by amplified fragment length polymorphism (AFLP) analysis

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Bolske, G.; Ahrens, Peter

    2000-01-01

    The genetic diversity of Mycoplasma capricolum subsp. capripneumoniae strains based on determination of amplified fragment length polymorphisms (AFLP) is described. AFLP fingerprints of 38 strains derived from different countries in Africa and the Middle East consisted of over 100 bands in the size...... found by 16S rDNA analysis. The present data support previous observations regarding genetic homogeneity of M. capricolum subsp. capripneumoniae, and confirm the two evolutionary lines of descent found by analysis of 16S rRNA genes....

  2. Host genome polymorphisms and tuberculosis infection: What we have to say?

    Directory of Open Access Journals (Sweden)

    Said Alfin Khalilullah

    2014-01-01

    Full Text Available Several epidemiology studies suggest that host genetic factors play important roles in susceptibility, protection and progression of tuberculosis infection. Here we have reviewed the implications of some genetic polymorphisms in pathways related to tuberculosis susceptibility, severity and development. Large case-control studies examining single-nucleotide polymorphisms (SNPs) in genes have been performed in tuberculosis patients in some countries. Polymorphisms in natural resistance-associated macrophage protein 1 (NRAMP1), toll-like receptor 2 (TLR2), interleukin-6 (IL-6), tumor necrosis factor alpha (TNF-α), interleukin-1 receptor antagonist (IL-1RA), IL-10, vitamin D receptor (VDR), dendritic cell-specific ICAM-3-grabbing non-integrin (DC-SIGN), monocyte chemoattractant protein-1 (MCP-1), nucleotide oligomerization binding domain 2 (NOD2), interferon-gamma (IFN-γ), inducible nitric oxide synthase (iNOS), mannose-binding lectin (MBL) and surfactant protein A (SP-A) have been reviewed. These genes have been variably associated with tuberculosis infection and there is strong evidence indicating that host genetic factors play critical roles in tuberculosis susceptibility, severity and development.

  3. Evolutionary genomics and population structure of Entamoeba histolytica

    Directory of Open Access Journals (Sweden)

    Koushik Das

    2014-11-01

    Full Text Available Amoebiasis caused by the gastrointestinal parasite Entamoeba histolytica has diverse disease outcomes. Study of genome and evolution of this fascinating parasite will help us to understand the basis of its virulence and explain why, when and how it causes diseases. In this review, we have summarized current knowledge regarding evolutionary genomics of E. histolytica and discussed their association with parasite phenotypes and its differential pathogenic behavior. How genetic diversity reveals parasite population structure has also been discussed. Queries concerning their evolution and population structure which were required to be addressed have also been highlighted. This significantly large amount of genomic data will improve our knowledge about this pathogenic species of Entamoeba.

  4. Structural Genomics of Bacterial Virulence Factors

    Science.gov (United States)

    2006-05-01

    membrane-inserted PA pore. The model is based on the pre-pore PA63 crystal structure, channel conductance studies, and the crystal structure of α... Cyanobacteria BXA0032 and BXA0033 (pXO1-22), if fused, would belong to the COG0175 family, members of the 3’- phosphoadenosine 5’-phosphosulfate...and thiol sulfur atom directed toward the zinc. For the LF(E687C)–GM6001–Zn2+ complex (Fig. 2c–e), where LF(E687C) represents the LF E687C mutant, the

  5. Genome-wide single nucleotide polymorphisms (SNPs) for a model invasive ascidian Botryllus schlosseri.

    Science.gov (United States)

    Gao, Yangchun; Li, Shiguo; Zhan, Aibin

    2018-04-01

    Invasive species cause substantial ecological, environmental and economic damage globally. A comprehensive understanding of invasion mechanisms, particularly the genetic bases of micro-evolutionary processes responsible for invasion success, is essential for reducing potential damage caused by invasive species. The golden star tunicate, Botryllus schlosseri, has become a model species in invasion biology, mainly owing to its high invasiveness and small, well-sequenced genome. However, genome-wide genetic markers have not been well developed in this highly invasive species, thus limiting the comprehensive understanding of genetic mechanisms of invasion success. Using restriction site-associated DNA (RAD) tag sequencing, here we developed a high-quality resource of 14,119 out of 158,821 SNPs for B. schlosseri. These SNPs were relatively evenly distributed across the chromosomes. SNP annotations showed that the majority of SNPs (63.20%) were located in intergenic regions, and 21.51% and 14.58% were located in introns and exons, respectively. In addition, the potential use of the developed SNPs for population genomics studies was primarily assessed, such as the estimate of observed heterozygosity (H_O), expected heterozygosity (H_E), nucleotide diversity (π), Wright's inbreeding coefficient (F_IS) and effective population size (Ne). Our developed SNP resource would provide future studies with genome-wide genetic markers for genetic and genomic investigations, such as the genetic bases of micro-evolutionary processes responsible for invasion success.
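
    Three of the summary statistics named above have simple textbook definitions at a single biallelic SNP: observed heterozygosity is the fraction of heterozygous individuals, expected heterozygosity is one minus the sum of squared allele frequencies, and Wright's inbreeding coefficient is one minus their ratio. The sketch below applies those definitions to hypothetical genotypes. It omits small-sample corrections and says nothing about which software or estimators the study itself used.

```python
def observed_heterozygosity(genotypes):
    """Fraction of individuals heterozygous at one biallelic locus.

    genotypes: list of (allele, allele) tuples, one per individual.
    """
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes):
    """Hardy-Weinberg expectation 1 - sum(p_i^2) from allele frequencies."""
    alleles = [allele for pair in genotypes for allele in pair]
    freqs = [alleles.count(x) / len(alleles) for x in set(alleles)]
    return 1.0 - sum(p * p for p in freqs)


def inbreeding_coefficient(h_obs, h_exp):
    """Wright's inbreeding coefficient 1 - H_O / H_E (needs H_E > 0)."""
    return 1.0 - h_obs / h_exp


# Hypothetical genotypes for one SNP in ten individuals.
sample = [("A", "A")] * 4 + [("A", "G")] * 2 + [("G", "G")] * 4
h_o = observed_heterozygosity(sample)
h_e = expected_heterozygosity(sample)
f_is = inbreeding_coefficient(h_o, h_e)
print(h_o, h_e, f_is)
```

    In this toy sample the heterozygote deficit relative to Hardy-Weinberg expectation gives a positive inbreeding coefficient; in practice the per-locus values would be averaged over many SNPs.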

  6. GW-SEM: A Statistical Package to Conduct Genome-Wide Structural Equation Modeling.

    Science.gov (United States)

    Verhulst, Brad; Maes, Hermine H; Neale, Michael C

    2017-05-01

    Improving the accuracy of phenotyping through the use of advanced psychometric tools will increase the power to find significant associations with genetic variants and expand the range of possible hypotheses that can be tested on a genome-wide scale. Multivariate methods, such as structural equation modeling (SEM), are valuable in the phenotypic analysis of psychiatric and substance use phenotypes, but these methods have not been integrated into standard genome-wide association analyses because fitting a SEM at each single nucleotide polymorphism (SNP) along the genome was hitherto considered to be too computationally demanding. By developing a method that can efficiently fit SEMs, it is possible to expand the set of models that can be tested. This is particularly necessary in psychiatric and behavioral genetics, where the statistical methods are often handicapped by phenotypes with large components of stochastic variance. Due to the enormous amount of data that genome-wide scans produce, the statistical methods used to analyze the data are relatively elementary, do not directly correspond with the rich theoretical development, and lack the potential to test more complex hypotheses about the measurement of, and interaction between, comorbid traits. In this paper, we present a method to test the association of a SNP with multiple phenotypes or a latent construct on a genome-wide basis using a diagonally weighted least squares (DWLS) estimator for four common SEMs: a one-factor model, a one-factor residuals model, a two-factor model, and a latent growth model. We demonstrate that the DWLS parameters and p-values strongly correspond with the more traditional full information maximum likelihood parameters and p-values. We also present the timing of simulations and power analyses and a comparison with an existing multivariate GWAS software package.

  7. Decoding the fine-scale structure of a breast cancer genome and transcriptome

    OpenAIRE

    Volik, Stanislav; Raphael, Benjamin J.; Huang, Guiqing; Stratton, Michael R.; Bignel, Graham; Murnane, John; Brebner, John H.; Bajsarowicz, Krystyna; Paris, Pamela L.; Tao, Quanzhou; Kowbel, David; Lapuk, Anna; Shagin, Dmitri A.; Shagina, Irina A.; Gray, Joe W.

    2006-01-01

    A comprehensive understanding of cancer is predicated upon knowledge of the structure of malignant genomes underlying its many variant forms and the molecular mechanisms giving rise to them. It is well established that solid tumor genomes accumulate a large number of genome rearrangements during tumorigenesis. End Sequence Profiling (ESP) maps and clones genome breakpoints associated with all types of genome rearrangements elucidating the structural organization of tumor genomes. Here we exte...

  8. Genome-wide association study for rotator cuff tears identifies two significant single-nucleotide polymorphisms.

    Science.gov (United States)

    Tashjian, Robert Z; Granger, Erin K; Farnham, James M; Cannon-Albright, Lisa A; Teerlink, Craig C

    2016-02-01

    The precise etiology of rotator cuff disease is unknown, but prior evidence suggests a role for genetic factors. Limited data exist identifying specific genes associated with rotator cuff tearing. The purpose of this study was to identify specific genes or genetic variants associated with rotator cuff tearing by a genome-wide association study with an independent set of rotator cuff tear cases. A set of 311 full-thickness rotator cuff tear cases genotyped on the Illumina 5M single-nucleotide polymorphism (SNP) platform were used in a genome-wide association study with 2641 genetically matched white population controls available from the Illumina iControls database. Tests of association were performed with GEMMA software at 257,558 SNPs that compose the intersection of Illumina SNP platforms and that passed general quality control metrics. SNPs were considered significant if P ... development of rotator cuff tearing. Copyright © 2016 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.

  9. Structured RNAs and synteny regions in the pig genome

    DEFF Research Database (Denmark)

    Anthon, Christian; Tafer, Hakim; Havgaard, Jakob Hull

    2014-01-01

    for Laurasiatheria (pig, cow, dolphin, horse, cat, dog, hedgehog). CONCLUSIONS: We have obtained one of the most comprehensive annotations for structured ncRNAs of a mammalian genome, which is likely to play central roles in both health modelling and production. The core annotation is available in Ensembl 70...

  10. Structure and sequence motifs in the HIV-1 RNA genome

    NARCIS (Netherlands)

    van Bel, N.

    2015-01-01

    The untranslated leader of the HIV-1 RNA genome contains some 350 nucleotides and is highly conserved among virus isolates. Several characteristic hairpin structures that regulate important virus replication steps, such as dimerization and packaging in virion particles, are clustered in this leader.

  11. cDNA structure, genomic organization and expression patterns of ...

    African Journals Online (AJOL)

    Visfatin was a newly identified adipocytokine, which was involved in various physiologic and pathologic processes of organisms. The cDNA structure, genomic organization and expression patterns of silver Prussian carp visfatin were described in this report. The silver Prussian carp visfatin cDNA cloned from the liver was ...

  12. cDNA structure, genomic organization and expression patterns of ...

    African Journals Online (AJOL)

    2011-11-23

    Visfatin was a newly identified adipocytokine, which was involved in various physiologic and pathologic processes of organisms. The cDNA structure, genomic organization and expression patterns of silver Prussian carp visfatin were described in this report. The silver Prussian carp visfatin cDNA cloned ...

  13. Structured RNAs and synteny regions in the pig genome

    DEFF Research Database (Denmark)

    Anthon, Christian; Tafer, Hakim; Havgaard, Jakob H

    2014-01-01

    BACKGROUND: Annotating mammalian genomes for noncoding RNAs (ncRNAs) is nontrivial since far from all ncRNAs are known and the computational models are resource demanding. Currently, the human genome holds the best mammalian ncRNA annotation, a result of numerous efforts by several groups. However......, a more direct strategy is desired for the increasing number of sequenced mammalian genomes of which some, such as the pig, are relevant as disease models and production animals. RESULTS: We present a comprehensive annotation of structured RNAs in the pig genome. Combining sequence and structure...... lncRNA loci, 11 conflicts of annotation, and 3,183 ncRNA genes. The ncRNA genes comprise 359 miRNAs, 8 ribozymes, 185 rRNAs, 638 snoRNAs, 1,030 snRNAs, 810 tRNAs and 153 ncRNA genes not belonging to the aforementioned classes. When running the pipeline on a local shuffled version of the genome...

  14. Amyloid structure exhibits polymorphism on multiple length scales in human brain tissue

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Jiliang; Costantino, Isabel; Venugopalan, Nagarajan; Fischetti, Robert F.; Hyman, Bradley; Frosch, Matthew; Gomez-Isla, Teresa; Makowski, Lee

    2016-09-15

    Although aggregation of Aβ amyloid fibrils into plaques in the brain is a hallmark of Alzheimer's Disease (AD), the correlation between amyloid burden and severity of symptoms is weak. One possible reason is that amyloid fibrils are structurally polymorphic and different polymorphs may contribute differentially to disease. However, the occurrence and distribution of amyloid polymorphisms in human brain are poorly documented. Here we seek to fill this knowledge gap by using X-ray microdiffraction of histological sections of human tissue to map the abundance, orientation and structural heterogeneities of amyloid within individual plaques, among proximal plaques and in subjects with distinct clinical histories. A 5 µm X-ray beam was used to generate diffraction data, with each pattern arising from a scattering volume of only ~450 µm³, making possible the collection of dozens to hundreds of diffraction patterns from a single amyloid plaque. X-ray scattering from these samples exhibited all the properties expected for scattering from amyloid. Amyloid distribution was mapped using the intensity of its signature 4.7 Å reflection, which also provided information on the orientation of amyloid fibrils across plaques. Margins of plaques exhibited a greater degree of orientation than cores, and orientation around blood vessels frequently appeared tangential. Variation in the structure of Aβ fibrils is reflected in the shape of the 4.7 Å peak, which usually appears as a doublet. Variations in this peak correspond to differences between the structure of amyloid within cores of plaques and at their periphery. Examination of tissue from a mismatch case - an individual with high plaque burden but no overt signs of dementia at time of death - revealed a diversity of structure and spatial distribution of amyloid that is distinct from typical AD cases. We demonstrate the existence of structural polymorphisms among amyloid within and among plaques of a single individual and suggest

  15. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  16. Evolution of human IgH3'EC duplicated structures: both enhancers HS1,2 are polymorphic with variation of transcription factor's consensus sites.

    Science.gov (United States)

    Giambra, Vincenzo; Fruscalzo, Alberto; Giufre', Maria; Martinez-Labarga, Cristina; Favaro, Marco; Rocchi, Mariano; Frezza, Domenico

    2005-02-14

    The enhancer complex regulatory region at the 3' of the immunoglobulin heavy cluster (IgH3'EC) is duplicated in apes along with four constant genes, and the region is highly conserved throughout humans. Both human IgH3'ECs consist of three loci hypersensitive (HS) to DNase I with enhancer activity. It is thus possible that the structural divergences between the two IgH3'ECs and their relative polymorphisms correspond to functional regulatory changes. To analyse the polymorphisms of these almost identical regions, it was necessary to identify divergent sequences in order to select distinctive primers for specific PCR genomic amplifications. To this aim, we first compared the two entire IgH3'ECs in silico, utilising the updated GenBank (GB) contigs, then we analysed the two IgH3'ECs by cloning and sequencing amplicons from independent genomes. In silico analysis showed that several inversions, deletions and short insertions had occurred after the duplication. We analysed in detail, by sequencing specific regions, the polymorphisms occurring in enhancer HS1,2-A (which lies in IgH3'EC-1, 3' to the Calpha-1 gene) and in enhancer HS1,2-B (which lies in IgH3'EC-2, 3' to Calpha-2). Polymorphisms are due to the repetition (occurring one to four times) of a 38-bp sequence present at the 3' of the core of enhancers HS1,2. The structure of both human HS1,2 enhancers has revealed previously undescribed polymorphic features due to the presence of variable spacer elements separating the 38-bp repetitions and to variable external elements bordering the repetition cluster. We found that one of the external elements gave rise to a divergent allele 3 in the two clusters. The frequency of the different alleles of the two loci varies in the Italian population, and allele 3 of both loci is very rare. The analysis of the Callicebus moloch, Gorilla gorilla and Pan troglodytes HS1,2 enhancers showed the transformation from the ancestral structure with the 31- to

  17. Use of microsatellite markers derived from whole genome sequence data for identifying polymorphism in Phytophthora ramorum

    Science.gov (United States)

    Kelly Ivors; Matteo Garbelotto; Ineke De Vries; Peter Bonants

    2006-01-01

    Investigating the population genetics of Phytophthora ramorum, the causal agent of sudden oak death (SOD), is critical to understanding the biology and epidemiology of this important phytopathogen. Raw sequence data (445,000 reads) of P. ramorum was provided by the Joint Genome Institute. Our objective was to develop and utilize...

  18. Genome-wide population structure and evolutionary history of the Frizarta dairy sheep.

    Science.gov (United States)

    Kominakis, A; Hager-Theodorides, A L; Saridaki, A; Antonakos, G; Tsiamis, G

    2017-10-01

    In the present study, we used genomic data, generated with a medium density single nucleotide polymorphism (SNP) array, to acquire more information on the population structure and evolutionary history of the synthetic Frizarta dairy sheep. First, two typical measures of linkage disequilibrium (LD) were estimated at various physical distances that were then used to make inferences on the effective population size at key past time points. Population structure was also assessed by both multidimensional scaling analysis and k-means clustering on the distance matrix obtained from the animals' genomic relationships. Wright's fixation index F_ST was also employed to assess herds' genetic homogeneity and to indirectly estimate past migration rates. Wright's fixation index F_IS and genomic inbreeding coefficients based on the genomic relationship matrix as well as on runs of homozygosity were also estimated. The Frizarta breed displays relatively low LD levels with r² and |D'| equal to 0.18 and 0.50, respectively, at an average inter-marker distance of 31 kb. Linkage disequilibrium decayed rapidly by distance and persisted over just a few thousand base pairs. Rate of LD decay (β) varied widely among the 26 autosomes with larger values estimated for shorter chromosomes (e.g. β=0.057, for OAR6) and smaller values for longer ones (e.g. β=0.022, for OAR2). The inferred effective population size at the beginning of the breed's formation was as high as 549, was then reduced to 463 in 1981 (end of the breed's formation) and further declined to 187, one generation ago. Multidimensional scaling analysis and k-means clustering suggested a genetically homogenous population, F_ST estimates indicated relatively low genetic differentiation between herds, whereas a heat map of the animals' genomic kinship relationships revealed a stratified population, at a herd level. Estimates of genomic inbreeding coefficients suggested that most recent parental relatedness may have been a
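
    The two LD measures reported in this abstract, r² and |D'|, can be illustrated with a minimal sketch under stated assumptions: two biallelic loci, known haplotype and allele frequencies, and both loci polymorphic. The function name and the example frequencies are hypothetical; studies of this kind usually estimate these quantities from unphased genotypes with dedicated software.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (r2, |D'|) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of alleles A and B (both assumed strictly between 0 and 1)
    """
    # Coefficient of linkage disequilibrium.
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies (depends on the sign of D).
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # Squared correlation between alleles at the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return r2, d_prime

# Hypothetical frequencies: both alleles at 0.5, AB haplotype at 0.4.
r2, d_prime = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
print(r2, d_prime)
```

    With these inputs D = 0.15 and the maximum attainable D is 0.25, so |D'| exceeds r², which matches the usual pattern that |D'| is the larger of the two measures.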

  19. Development and Integration of Genome-Wide Polymorphic Microsatellite Markers onto a Reference Linkage Map for Constructing a High-Density Genetic Map of Chickpea.

    Directory of Open Access Journals (Sweden)

    Yash Paul Khajuria

    Full Text Available The identification of informative in silico polymorphic genomic and genic microsatellite markers by comparing the genome and transcriptome sequences of crop genotypes is a rapid, cost-effective and non-laborious approach for large-scale marker validation and genotyping applications, including construction of high-density genetic maps. We designed 1494 markers, including 1016 genomic and 478 transcript-derived microsatellite markers showing in silico fragment length polymorphism between two parental genotypes (Cicer arietinum ICC4958 and C. reticulatum PI489777) of an inter-specific reference mapping population. High amplification efficiency (87%), experimental validation success rate (81%) and polymorphic potential (55%) of these microsatellite markers suggest their effective use in various applications of chickpea genetics and breeding. Intra-specific polymorphic potential (48%) detected by microsatellite markers in 22 desi and kabuli chickpea genotypes was lower than inter-specific polymorphic potential (59%). An advanced, high-density, integrated and inter-specific chickpea genetic map (ICC4958 x PI489777) having 1697 map positions spanning 1061.16 cM with an average inter-marker distance of 0.625 cM was constructed by assigning 634 novel informative transcript-derived and genomic microsatellite markers on eight linkage groups (LGs) of our prior documented, 1063 marker-based genetic map. The constructed genome map identified 88, including four major (7-23 cM) longest high-resolution genomic regions on LGs 3, 5 and 8, where the maximum number of novel genomic and genic microsatellite markers were specifically clustered within 1 cM genetic distance. It was for the first time in chickpea that in silico FLP analysis at genome-wide level was carried out and such a large number of microsatellite markers were identified, experimentally validated and further used in genetic mapping. To the best of our knowledge, in the presently constructed genetic map, we mapped

  20. Development and Integration of Genome-Wide Polymorphic Microsatellite Markers onto a Reference Linkage Map for Constructing a High-Density Genetic Map of Chickpea.

    Science.gov (United States)

    Khajuria, Yash Paul; Saxena, Maneesha S; Gaur, Rashmi; Chattopadhyay, Debasis; Jain, Mukesh; Parida, Swarup K; Bhatia, Sabhyata

    2015-01-01

    The identification of informative in silico polymorphic genomic and genic microsatellite markers by comparing the genome and transcriptome sequences of crop genotypes is a rapid, cost-effective and non-laborious approach for large-scale marker validation and genotyping applications, including construction of high-density genetic maps. We designed 1494 markers, including 1016 genomic and 478 transcript-derived microsatellite markers showing in silico fragment length polymorphism between two parental genotypes (Cicer arietinum ICC4958 and C. reticulatum PI489777) of an inter-specific reference mapping population. High amplification efficiency (87%), experimental validation success rate (81%) and polymorphic potential (55%) of these microsatellite markers suggest their effective use in various applications of chickpea genetics and breeding. Intra-specific polymorphic potential (48%) detected by microsatellite markers in 22 desi and kabuli chickpea genotypes was lower than inter-specific polymorphic potential (59%). An advanced, high-density, integrated and inter-specific chickpea genetic map (ICC4958 x PI489777) having 1697 map positions spanning 1061.16 cM with an average inter-marker distance of 0.625 cM was constructed by assigning 634 novel informative transcript-derived and genomic microsatellite markers on eight linkage groups (LGs) of our prior documented, 1063 marker-based genetic map. The constructed genome map identified 88, including four major (7-23 cM) longest high-resolution genomic regions on LGs 3, 5 and 8, where the maximum number of novel genomic and genic microsatellite markers were specifically clustered within 1 cM genetic distance. It was for the first time in chickpea that in silico FLP analysis at genome-wide level was carried out and such a large number of microsatellite markers were identified, experimentally validated and further used in genetic mapping. To the best of our knowledge, in the presently constructed genetic map, we mapped highest

  1. Structural study of piracetam polymorphs and cocrystals: crystallography redetermination and quantum mechanics calculations.

    Science.gov (United States)

    Tilborg, Anaëlle; Jacquemin, Denis; Norberg, Bernadette; Perpète, Eric; Michaux, Catherine; Wouters, Johan

    2011-12-01

    Pharmaceutical compounds are mostly developed as solid dosage forms containing a single-crystal form. It means that the selection of a particular crystal state for a given molecule is an important step for further clinical outlooks. In this context, piracetam, a pharmaceutical molecule known since the sixties for its nootropic properties, is considered in the present work. This molecule is analyzed using several experimental and theoretical approaches. First, the conformational space of the molecule has been systematically explored by performing a quantum mechanics scan of the two most relevant dihedral angles of the lateral chain. The predicted stable conformations have been compared to all the reported experimental geometries retrieved from the Cambridge Structural Database (CSD) covering polymorphs and cocrystals structures. In parallel, different batches of powders have been recrystallized. Under specific conditions, single crystals of polymorph (III) of piracetam have been obtained, an outcome confirmed by crystallographic analysis. © 2011 International Union of Crystallography. Printed in Singapore – all rights reserved.

  2. Crystal structure of a new polymorph of di(thiophen-3-yl) ketone

    Directory of Open Access Journals (Sweden)

    Jörg Hübscher

    2017-10-01

    Full Text Available The crystal structure of the title compound, C9H6OS2, represents a new polymorph. The crystal structure was solved in the orthorhombic space group Pbcn with one half of the molecule in the asymmetric unit. The thiophene rings are perfectly planar and twisted with respect to each other, showing the molecule to be in an S,O-trans/S,O-trans conformation. In the crystal, C—H...O hydrogen bonds connect the molecules into layers extending parallel to the ab plane. The crystal structure also features π–π interactions.

  3. Crystal structure of a new polymorph of di(thiophen-3-yl) ketone.

    Science.gov (United States)

    Hübscher, Jörg; Augustin, André U; Seichter, Wilhelm; Weber, Edwin

    2017-10-01

    The crystal structure of the title compound, C9H6OS2, represents a new polymorph. The crystal structure was solved in the orthorhombic space group Pbcn with one half of the molecule in the asymmetric unit. The thiophene rings are perfectly planar and twisted with respect to each other, showing the molecule to be in an S,O-trans/S,O-trans conformation. In the crystal, C-H⋯O hydrogen bonds connect the molecules into layers extending parallel to the ab plane. The crystal structure also features π-π interactions.

  4. Structure-Based Alignment and Consensus Secondary Structures for Three HIV-Related RNA Genomes.

    Science.gov (United States)

    Lavender, Christopher A; Gorelick, Robert J; Weeks, Kevin M

    2015-05-01

    HIV and related primate lentiviruses possess single-stranded RNA genomes. Multiple regions of these genomes participate in critical steps in the viral replication cycle, and the functions of many RNA elements are dependent on the formation of defined structures. The structures of these elements are still not fully understood, and additional functional elements likely exist that have not been identified. In this work, we compared three full-length HIV-related viral genomes: HIV-1NL4-3, SIVcpz, and SIVmac (the latter two strains are progenitors for all HIV-1 and HIV-2 strains, respectively). Model-free RNA structure comparisons were performed using whole-genome structure information experimentally derived from nucleotide-resolution SHAPE reactivities. Consensus secondary structures were constructed for strongly correlated regions by taking into account both SHAPE probing structural data and nucleotide covariation information from structure-based alignments. In these consensus models, all known functional RNA elements were recapitulated with high accuracy. In addition, we identified multiple previously unannotated structural elements in the HIV-1 genome likely to function in translation, splicing and other replication cycle processes; these are compelling targets for future functional analyses. The structure-informed alignment strategy developed here will be broadly useful for efficient RNA motif discovery.

  5. Genetic linkage map of a wild genome: genomic structure, recombination and sexual dimorphism in bighorn sheep

    Science.gov (United States)

    2010-01-01

    Background The construction of genetic linkage maps in free-living populations is a promising tool for the study of evolution. However, such maps are rare because it is difficult to develop both wild pedigrees and corresponding sets of molecular markers that are sufficiently large. We took advantage of two long-term field studies of pedigreed individuals and genomic resources originally developed for domestic sheep (Ovis aries) to construct a linkage map for bighorn sheep, Ovis canadensis. We then assessed variability in genomic structure and recombination rates between bighorn sheep populations and sheep species. Results Bighorn sheep population-specific maps differed slightly in contiguity but were otherwise very similar in terms of genomic structure and recombination rates. The joint analysis of the two pedigrees resulted in a highly contiguous map composed of 247 microsatellite markers distributed along all 26 autosomes and the X chromosome. The map is estimated to cover about 84% of the bighorn sheep genome and contains 240 unique positions spanning a sex-averaged distance of 3051 cM with an average inter-marker distance of 14.3 cM. Marker synteny, order, sex-averaged interval lengths and sex-averaged total map lengths were all very similar between sheep species. However, in contrast to domestic sheep, but consistent with the usual pattern for a placental mammal, recombination rates in bighorn sheep were significantly greater in females than in males (~12% difference), resulting in an autosomal female map of 3166 cM and an autosomal male map of 2831 cM. Despite differing genome-wide patterns of heterochiasmy between the sheep species, sexual dimorphism in recombination rates was correlated between orthologous intervals. Conclusions We have developed a first-generation bighorn sheep linkage map that will facilitate future studies of the genetic architecture of trait variation in this species. While domestication has been hypothesized to be responsible for the
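
    The map figures quoted in this abstract can be cross-checked with a short calculation. This is a sketch under an assumption the abstract does not state: that the average inter-marker distance is the total map length divided by the number of intervals, taken as unique positions minus linkage groups (26 autosomes plus the X chromosome). The variable names are illustrative only.

```python
# Figures quoted in the abstract.
total_cm = 3051.0          # sex-averaged map length
unique_positions = 240     # unique map positions
linkage_groups = 26 + 1    # 26 autosomes plus the X chromosome

# Assumption: each linkage group with k positions contributes k - 1 intervals.
intervals = unique_positions - linkage_groups
avg_spacing = total_cm / intervals

# Relative excess of the female autosomal map over the male autosomal map.
female_cm, male_cm = 3166.0, 2831.0
rel_diff = (female_cm - male_cm) / male_cm

print(round(avg_spacing, 1), round(rel_diff * 100, 1))
```

    Under this assumption the spacing rounds to the reported 14.3 cM, and the female-to-male excess is close to the reported ~12% difference.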

  6. Chromatin structure and evolution in the human genome

    Directory of Open Access Journals (Sweden)

    Dunlop Malcolm G

    2007-05-01

    Full Text Available Abstract Background Evolutionary rates are not constant across the human genome but genes in close proximity have been shown to experience similar levels of divergence and selection. The higher-order organisation of chromosomes has often been invoked to explain such phenomena but previously there has been insufficient data on chromosome structure to investigate this rigorously. Using the results of a recent genome-wide analysis of open and closed human chromatin structures we have investigated the global association between divergence, selection and chromatin structure for the first time. Results In this study we have shown that, paradoxically, synonymous site divergence (dS) at non-CpG sites is highest in regions of open chromatin, primarily as a result of an increased number of transitions, while the rates of other traditional measures of mutation (intergenic, intronic and ancient repeat divergence as well as SNP density) are highest in closed regions of the genome. Analysis of human-chimpanzee divergence across intron-exon boundaries indicates that although genes in relatively open chromatin generally display little selection at their synonymous sites, those in closed regions show markedly lower divergence at their fourfold degenerate sites than in neighbouring introns and intergenic regions. Exclusion of known Exonic Splice Enhancer hexamers has little effect on the divergence observed at fourfold degenerate sites across chromatin categories; however, we show that closed chromatin is enriched with certain classes of ncRNA genes whose RNA secondary structure may be particularly important. Conclusion We conclude that, overall, non-CpG mutation rates are lowest in open regions of the genome and that regions of the genome with a closed chromatin structure have the highest background mutation rate. This might reflect lower rates of DNA damage or enhanced DNA repair processes in regions of open chromatin. Our results also indicate that dS is a poor

  7. Genomic Fingerprinting of the Vaccine Strain of Clostridium Tetani by Restriction Fragment Length Polymorphism Technique

    Directory of Open Access Journals (Sweden)

    Naser Harzandi

    2013-05-01

    Full Text Available Background: Clostridium tetani, or Nicolaier's bacillus, is an obligately anaerobic, Gram-positive, motile bacterium with a terminal or subterminal spore. The chromosome of C. tetani contains 2,799,250 bp with a G+C content of 28.6%. The aim of this study was the identification and genomic fingerprinting of the vaccine strain of C. tetani. Materials and Methods: The vaccine strain of C. tetani was provided by Razi Vaccine and Serum Research Institute. The seeds were inoculated into Columbia blood agar, grown for 72 h and transferred to thioglycolate broth medium for a further 36 h of culturing. The cultures were incubated at 35 °C in anaerobic conditions. DNA extraction was performed with the phenol/chloroform method. After extraction, the consistency of the DNA was assayed. Next, the vaccine strain DNA was digested using the PvuII enzyme and incubated at 37 °C overnight. The digested DNA was gel-electrophoresed on 1% agarose for a short time. Then, the gel was studied with a Gel Doc system and transferred to a Hybond N+ membrane using standard DNA blotting techniques. Results: The genome of the vaccine strain of C. tetani was fingerprinted by the RFLP technique. Our preliminary results showed that no divergence exists in the vaccine strain used for the production of tetanus toxoid during the period 1990-2011. Conclusion: This observation suggests that there is a lack of significant changes in the RFLP genomic fingerprinting profile of the vaccine strain. Therefore, this strain did not lose its efficiency in tetanus vaccine production. RFLP analysis is worthwhile in investigating the nature of the vaccine strain of C. tetani.
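
    The idea behind an RFLP fingerprint can be illustrated in silico with a minimal sketch. It assumes a linear sequence, the PvuII recognition site CAGCTG with a blunt cut after the third base, and a made-up toy sequence; the function names are hypothetical, and the study itself compared banding patterns on gels and blots rather than computed fragments.

```python
def digest(seq, site="CAGCTG", cut_offset=3):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases of the site left on the upstream fragment
    (3 for the blunt PvuII cut CAG^CTG).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]

def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def same_fingerprint(seq_a, seq_b):
    """Two sequences share a fingerprint if their sorted fragment sizes match."""
    return sorted(digest(seq_a)) == sorted(digest(seq_b))

# Toy sequence (not C. tetani data) containing two PvuII sites.
toy = "AAAA" + "CAGCTG" + "TTTTTT" + "CAGCTG" + "GG"
fragments = digest(toy)
print(fragments, round(gc_content(toy), 3))
```

    A point mutation that destroys one recognition site merges two fragments into one, which is the kind of change an RFLP comparison between production lots would detect.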

  8. Supplementary Material for: Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    KAUST Repository

    Phelan, Jody

    2016-01-01

    Abstract Background Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure and interaction modelling are used to understand the functional effects of putative mutations and provide insight into the molecular mechanisms leading to resistance. Methods To investigate the potential utility of these approaches, we analysed the genomes of 144 Mycobacterium tuberculosis clinical isolates from The Special Programme for Research and Training in Tropical Diseases (TDR) collection sourced from 20 countries in four continents. A genome-wide approach was applied to 127 isolates to identify polymorphisms associated with minimum inhibitory concentrations for first-line anti-tuberculosis drugs. In addition, the effect of identified candidate mutations on protein stability and interactions was assessed quantitatively with well-established computational methods. Results The analysis revealed that mutations in the genes rpoB (rifampicin), katG (isoniazid), inhA-promoter (isoniazid), rpsL (streptomycin) and embB (ethambutol) were responsible for the majority of resistance observed. A subset of the mutations identified in rpoB and katG were predicted to affect protein stability. Further, a strong direct correlation was observed between the minimum inhibitory concentration values and the distance of the mutated residues in the three-dimensional structures of rpoB and katG to their respective drugs binding sites. Conclusions Using the TDR resource, we demonstrate the usefulness of whole genome association and convergent evolution approaches to detect known and potentially novel mutations associated with drug resistance. Further, protein structural modelling could provide a means of predicting the impact of polymorphisms on drug efficacy in the absence of phenotypic data. These approaches could ultimately lead to novel

  9. Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    KAUST Repository

    Phelan, Jody

    2016-03-23

    Background Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure and interaction modelling are used to understand the functional effects of putative mutations and provide insight into the molecular mechanisms leading to resistance. Methods To investigate the potential utility of these approaches, we analysed the genomes of 144 Mycobacterium tuberculosis clinical isolates from The Special Programme for Research and Training in Tropical Diseases (TDR) collection sourced from 20 countries in four continents. A genome-wide approach was applied to 127 isolates to identify polymorphisms associated with minimum inhibitory concentrations for first-line anti-tuberculosis drugs. In addition, the effect of identified candidate mutations on protein stability and interactions was assessed quantitatively with well-established computational methods. Results The analysis revealed that mutations in the genes rpoB (rifampicin), katG (isoniazid), inhA-promoter (isoniazid), rpsL (streptomycin) and embB (ethambutol) were responsible for the majority of resistance observed. A subset of the mutations identified in rpoB and katG were predicted to affect protein stability. Further, a strong direct correlation was observed between the minimum inhibitory concentration values and the distance of the mutated residues in the three-dimensional structures of rpoB and katG to their respective drugs binding sites. Conclusions Using the TDR resource, we demonstrate the usefulness of whole genome association and convergent evolution approaches to detect known and potentially novel mutations associated with drug resistance. Further, protein structural modelling could provide a means of predicting the impact of polymorphisms on drug efficacy in the absence of phenotypic data. These approaches could ultimately lead to novel resistance

  10. Genome-wide analysis of synonymous single nucleotide polymorphisms in Mycobacterium tuberculosis complex organisms: resolution of genetic relationships among closely related microbial strains.

    Science.gov (United States)

    Gutacker, Michaela M; Smoot, James C; Migliaccio, Cristi A Lux; Ricklefs, Stacy M; Hua, Su; Cousins, Debby V; Graviss, Edward A; Shashkina, Elena; Kreiswirth, Barry N; Musser, James M

    2002-12-01

    Several human pathogens (e.g., Bacillus anthracis, Yersinia pestis, Bordetella pertussis, Plasmodium falciparum, and Mycobacterium tuberculosis) have very restricted unselected allelic variation in structural genes, which hinders study of the genetic relationships among strains and strain-trait correlations. To address this problem in a representative pathogen, 432 M. tuberculosis complex strains from global sources were genotyped on the basis of 230 synonymous (silent) single nucleotide polymorphisms (sSNPs) identified by comparison of four genome sequences. Eight major clusters of related genotypes were identified in M. tuberculosis sensu stricto, including a single cluster representing organisms responsible for several large outbreaks in the United States and Asia. All M. tuberculosis sensu stricto isolates of previously unknown phylogenetic position could be rapidly and unambiguously assigned to one of the eight major clusters, thus providing a facile strategy for identifying organisms that are clonally related by descent. Common clones of M. tuberculosis sensu stricto and M. bovis are distinct, deeply branching genotypic complexes whose extant members did not emerge directly from one another in the recent past. sSNP genotyping rapidly delineates relationships among closely related strains of pathogenic microbes and allows construction of genetic frameworks for examining the distribution of biomedically relevant traits such as virulence, transmissibility, and host range.

  11. Elucidation of Operon Structures across Closely Related Bacterial Genomes

    Science.gov (United States)

    Li, Guojun

    2014-01-01

    About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topology, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components. PMID:24959722

  12. Elucidation of operon structures across closely related bacterial genomes.

    Science.gov (United States)

    Zhou, Chuan; Ma, Qin; Li, Guojun

    2014-01-01

    About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topology, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components.
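
    The graph construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the gene identifiers and the toy operons are hypothetical, and the path/star classification rule is a simplifying assumption. Nodes are orthologous gene groups (OGGs), an edge joins two OGGs whenever genes from them are immediate neighbors in the same operon in any genome, and connected components are then found by a depth-first search.

```python
from collections import defaultdict


def build_ogg_graph(operons, gene_to_ogg):
    """Build an undirected OGG graph from operons (lists of gene ids in genomic order)."""
    graph = defaultdict(set)
    for operon in operons:
        oggs = [gene_to_ogg[gene] for gene in operon]
        for ogg in oggs:
            graph[ogg]  # register the node even if it ends up isolated
        for a, b in zip(oggs, oggs[1:]):  # immediate neighbors only
            if a != b:
                graph[a].add(b)
                graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Return the connected components of the graph as a list of node sets."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return components


def classify(component, graph):
    """Label a connected component as 'path', 'star', 'tree' or 'other' (assumed rule)."""
    n = len(component)
    degrees = [len(graph[node]) for node in component]
    if sum(degrees) // 2 != n - 1:
        return "other"  # a connected component with a cycle is not a tree
    if max(degrees) <= 2:
        return "path"
    if max(degrees) == n - 1:
        return "star"
    return "tree"


# Hypothetical toy data: one ribosomal operon and a permease seen next to
# three different substrate-binding proteins in three genomes.
gene_to_ogg = {
    "rplA_g1": "OGG_rplA", "rplB_g1": "OGG_rplB", "rplC_g1": "OGG_rplC",
    "perm_g1": "OGG_perm", "perm_g2": "OGG_perm", "perm_g3": "OGG_perm",
    "sbpX_g1": "OGG_sbpX", "sbpY_g2": "OGG_sbpY", "sbpZ_g3": "OGG_sbpZ",
}
operons = [
    ["rplA_g1", "rplB_g1", "rplC_g1"],
    ["perm_g1", "sbpX_g1"],
    ["sbpY_g2", "perm_g2"],
    ["perm_g3", "sbpZ_g3"],
]

graph = build_ogg_graph(operons, gene_to_ogg)
labels = sorted(classify(c, graph) for c in connected_components(graph))
print(labels)
```

    In this toy input the ribosomal genes form a path-structure component and the permease becomes the central node of a star-structure component, mirroring the two topologies named in the abstract.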

  13. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2004-07-14

    The structural genomics project is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy which is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the Pfam5000 strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at EBI. Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68 percent of all prokaryotic proteins (covering 59 percent of residues) and 61 percent of eukaryotic proteins (40 percent of residues). More fine-grained coverage which would allow accurate modeling of these proteins would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. In contrast, focusing structural genomics on a single tractable genome would have only a limited impact in structural knowledge of other proteomes: a significant fraction (about 30-40 percent of the proteins, and 40-60 percent of the residues) of each proteome is classified in small

  14. Photochemical water oxidation by crystalline polymorphs of manganese oxides: structural requirements for catalysis.

    Science.gov (United States)

    Robinson, David M; Go, Yong Bok; Mui, Michelle; Gardner, Graeme; Zhang, Zhijuan; Mastrogiovanni, Daniel; Garfunkel, Eric; Li, Jing; Greenblatt, Martha; Dismukes, G Charles

    2013-03-06

    Manganese oxides occur naturally as minerals in at least 30 different crystal structures, providing a rigorous test system to explore the significance of atomic positions on the catalytic efficiency of water oxidation. In this study, we chose to systematically compare eight synthetic oxide structures containing Mn(III) and Mn(IV) only, with particular emphasis on the five known structural polymorphs of MnO2. We have adapted literature synthesis methods to obtain pure polymorphs and validated their homogeneity and crystallinity by powder X-ray diffraction and both transmission and scanning electron microscopies. Measurement of water oxidation rate by oxygen evolution in aqueous solution was conducted with dispersed nanoparticulate manganese oxides and a standard ruthenium dye photo-oxidant system. No Ru was absorbed on the catalyst surface as observed by XPS and EDX. The post reaction atomic structure was completely preserved with no amorphization, as observed by HRTEM. Catalytic activities, normalized to surface area (BET), decrease in the series Mn2O3 > Mn3O4 ≫ λ-MnO2, where the latter is derived from spinel LiMn2O4 following partial Li(+) removal. No catalytic activity is observed from LiMn2O4 and four of the MnO2 polymorphs, in contrast to some literature reports with polydispersed manganese oxides and electro-deposited films. Catalytic activity within the eight examined Mn oxides was found exclusively for (distorted) cubic phases, Mn2O3 (bixbyite), Mn3O4 (hausmannite), and λ-MnO2 (spinel), all containing Mn(III) possessing longer Mn-O bonds between edge-sharing MnO6 octahedra. Electronically degenerate Mn(III) has antibonding electronic configuration e(g)(1) which imparts lattice distortions due to the Jahn-Teller effect that are hypothesized to contribute to structural flexibility important for catalytic turnover in water oxidation at the surface.

  15. Challenging the knowledge bio-based fisheries of tropical tuna stocks: assessing genomic population structure in yellowfin (Thunnus albacares)

    Directory of Open Access Journals (Sweden)

    Carlo Pecoraro

    2014-06-01

    The yellowfin tuna (YFT) genetic population structure will be investigated at global scale (between and within oceans) using next-generation sequencing (the 2b-RAD method for genotyping by sequencing) through examination of single nucleotide polymorphisms (SNPs). This approach can represent a major advancement over the classical techniques used until now (i.e. those based on allozymes, DNA microsatellites and mitochondrial DNA) in order to reveal the YFT stock structure between and within each ocean. The novel genomic data that will be generated can potentially reveal YFT population structure at a level not possible through those classical approaches, with significant implications for YFT stock assessment and management. In fact, neglecting the true genetic structure might lead to the over-exploitation and depletion of some populations, with dramatic consequences for the long-term conservation and sustainable use of YFT stocks.

  16. The mitochondrial genome of soybean reveals complex genome structures and gene evolution at intercellular and phylogenetic levels.

    Directory of Open Access Journals (Sweden)

    Shengxin Chang

    Full Text Available Determining mitochondrial genomes is important for elucidating vital activities of seed plants. Mitochondrial genomes are specific to each plant species because of their variable size, complex structures and patterns of gene losses and gains during evolution. This complexity has made research on the soybean mitochondrial genome difficult compared with its nuclear and chloroplast genomes. The present study helps to solve a 30-year mystery regarding the most complex mitochondrial genome structure, showing that pairwise rearrangements among the many large repeats may produce an enriched molecular pool of 760 circles in seed plants. The soybean mitochondrial genome harbors 58 genes of known function in addition to 52 predicted open reading frames of unknown function. The genome contains sequences of multiple identifiable origins, including 6.8 kb and 7.1 kb DNA fragments that have been transferred from the nuclear and chloroplast genomes, respectively, and some horizontal DNA transfers. The soybean mitochondrial genome has lost 16 genes, including nine protein-coding genes and seven tRNA genes; however, it has acquired five chloroplast-derived genes during evolution. Four tRNA genes, common among the three genomes, are derived from the chloroplast. Sizeable DNA transfers to the nucleus, with pericentromeric regions as hotspots, are observed, including DNA transfers of 125.0 kb and 151.6 kb identified unambiguously from the soybean mitochondrial and chloroplast genomes, respectively. The soybean nuclear genome has acquired five genes from its mitochondrial genome. These results provide biological insights into the mitochondrial genome of seed plants, and are especially helpful for deciphering vital activities in soybean.

  17. Genome Structural Diversity among 31 Bordetella pertussis Isolates from Two Recent U.S. Whooping Cough Statewide Epidemics.

    Science.gov (United States)

    Bowden, Katherine E; Weigand, Michael R; Peng, Yanhui; Cassiday, Pamela K; Sammons, Scott; Knipe, Kristen; Rowe, Lori A; Loparev, Vladimir; Sheth, Mili; Weening, Keeley; Tondella, M Lucia; Williams, Margaret M

    2016-01-01

    During 2010 and 2012, California and Vermont, respectively, experienced statewide epidemics of pertussis with differences seen in the demographic affected, case clinical presentation, and molecular epidemiology of the circulating strains. To overcome limitations of the current molecular typing methods for pertussis, we utilized whole-genome sequencing to gain a broader understanding of how current circulating strains are causing large epidemics. Through the use of combined next-generation sequencing technologies, this study compared de novo, single-contig genome assemblies from 31 out of 33 Bordetella pertussis isolates collected during two separate pertussis statewide epidemics and 2 resequenced vaccine strains. Final genome architecture assemblies were verified with whole-genome optical mapping. Sixteen distinct genome rearrangement profiles were observed in epidemic isolate genomes, all of which were distinct from the genome structures of the two resequenced vaccine strains. These rearrangements appear to be mediated by repetitive sequence elements, such as high-copy-number mobile genetic elements and rRNA operons. Additionally, novel and previously identified single nucleotide polymorphisms were detected in 10 virulence-related genes in the epidemic isolates. Whole-genome variation analysis identified state-specific variants, and coding regions bearing nonsynonymous mutations were classified into functional annotated orthologous groups. Comprehensive studies on whole genomes are needed to understand the resurgence of pertussis and develop novel tools to better characterize the molecular epidemiology of evolving B. pertussis populations. IMPORTANCE Pertussis, or whooping cough, is the most poorly controlled vaccine-preventable bacterial disease in the United States, which has experienced a resurgence for more than a decade. Once viewed as a monomorphic pathogen, B. pertussis strains circulating during epidemics exhibit diversity visible on a genome structural

  18. Genome Wide Linkage Analysis of 972 Bipolar Pedigrees Using Single Nucleotide Polymorphisms

    Science.gov (United States)

    Badner, Judith A; Koller, Daniel; Foroud, Tatiana; Edenberg, Howard; Nurnberger, John I; Zandi, Peter P; Willour, Virginia L.; McMahon, Francis J; Potash, James B; Hamshere, Marian; Grozeva, Detelina; Green, Elaine; Kirov, George; Jones, Ian; Jones, Lisa; Craddock, Nicholas; Morris, Derek; Segurado, Ricardo; Gill, Mike; Sadovnick, Dessa; Remick, Ronald; Keck, Paul; Kelsoe, John; Ayub, Muhammad; MacLean, Alan; Blackwood, Douglas; Liu, Chun-Yu; Gershon, Elliot S; McMahon, William; Lyon, Gholson; Robinson, Reid; Ross, Jessica; Byerley, William

    2011-01-01

    Because of the high costs associated with ascertainment of families most linkage studies of Bipolar I disorder (BPI) have used relatively small samples. Moreover, the genetic information content reported in most studies has been less than 0.6. While microsatellite markers spaced every 10 centimorgans typically extract most of the genetic information content for larger multiplex families, they can be less informative for smaller pedigrees especially for affected sib pair kindreds. For these reasons we collaborated to pool family resources and carry out higher density genotyping. Approximately 1100 pedigrees of European ancestry were initially selected for study and were genotyped by the Center for Inherited Disease Research using the Illumina Linkage Panel 12 set of 6090 SNPs. Of the ~1100 families, 972 were informative for further analyses and mean information content was 0.86 after pruning for LD. The 972 kindreds include 2284 cases of BPI disorder, 498 individuals with Bipolar II disorder (BPII) and 702 subjects with Recurrent Major Depression. Three affection status models were considered: ASM1 (BPI and schizoaffective disorder, BP cases (SABP) only), ASM2 (ASM1 cases plus BPII) and ASM3 (ASM2 cases plus Recurrent Major Depression). Both parametric and non-parametric linkage methods were carried out. The strongest findings occurred at 6q21 (Nonparametric Pairs Lod 3.4 for rs1046943 at 119 cM) and 9q21 (Nonparametric Pairs Lod 3.4 for rs722642 at 78 cM) using only BPI and SA, BP cases. Both results met genome-wide significant criteria, although neither was significant after correction for multiple analyses. We also inspected parametric scores for the larger multiplex families to identify possible rare susceptibility loci. In this analysis we observed 59 parametric lods of 2 or greater, many of which are likely to be close to maximum possible scores. While some linkage findings may be false positives the results could help prioritize the search for rare variants

  19. Genomic expression and single-nucleotide polymorphism profiling discriminates chromophobe renal cell carcinoma and oncocytoma

    International Nuclear Information System (INIS)

    Tan, Min-Han; Furge, Kyle A; Kort, Eric; Giraud, Sophie; Ferlicot, Sophie; Vielh, Philippe; Amsellem-Ouazana, Delphine; Debré, Bernard; Flam, Thierry; Thiounn, Nicolas; Zerbib, Marc; Wong, Chin Fong; Benoît, Gérard; Droupy, Stéphane; Molinié, Vincent; Vieillefond, Annick; Tan, Puay Hoon; Richard, Stéphane; Teh, Bin Tean; Tan, Hwei Ling; Yang, Ximing J; Ditlev, Jonathon; Matsuda, Daisuke; Khoo, Sok Kean; Sugimura, Jun; Fujioka, Tomoaki

    2010-01-01

    Chromophobe renal cell carcinoma (chRCC) and renal oncocytoma are two distinct but closely related entities with strong morphologic and genetic similarities. While chRCC is a malignant tumor, oncocytoma is usually regarded as a benign entity. The overlapping characteristics are best explained by a common cellular origin, and the biologic differences between chRCC and oncocytoma are therefore of considerable interest in terms of carcinogenesis, diagnosis and clinical management. Previous studies have been relatively limited in terms of examining the differences between oncocytoma and chromophobe RCC. Gene expression profiling using the Affymetrix HGU133Plus2 platform was applied on chRCC (n = 15) and oncocytoma specimens (n = 15). Supervised analysis was applied to identify a discriminatory gene signature, as well as differentially expressed genes. High throughput single-nucleotide polymorphism (SNP) genotyping was performed on independent samples (n = 14) using Affymetrix GeneChip Mapping 100 K arrays to assess correlation between expression and gene copy number. Immunohistochemical validation was performed in an independent set of tumors. A novel 14 probe-set signature was developed to classify the tumors internally with 93% accuracy, and this was successfully validated on an external data-set with 94% accuracy. Pathway analysis highlighted clinically relevant dysregulated pathways of c-erbB2 and mammalian target of rapamycin (mTOR) signaling in chRCC, but no significant differences in p-AKT or extracellular HER2 expression were identified on immunohistochemistry. Loss of chromosome 1p, reflected in both cytogenetic and expression analysis, is common to both entities, implying this may be an early event in histogenesis. Multiple regional areas of cytogenetic alterations and corresponding expression biases differentiating the two entities were identified. Parafibromin, aquaporin 6, and synaptogyrin 3 were novel immunohistochemical markers effectively discriminating

  20. Size dependent structural and polymorphic transitions in ZnO: from nanocluster to bulk.

    Science.gov (United States)

    Viñes, Francesc; Lamiel-Garcia, Oriol; Illas, Francesc; Bromley, Stefan T

    2017-07-20

    We report on an extensive survey of (ZnO)N nanostructures ranging from bottom-up generated nanoclusters to top-down nanoparticle cuts from bulk polymorphs. The obtained results enable us to follow the energetic preferences of structure and polymorphism in (ZnO)N systems with N varying between 10 and 1026. This size range encompasses small nanoclusters with 10s of atoms and nanoparticles with 100s of atoms, which we also compare with appropriate bulk limits. In all cases the nanostructures and bulk systems are optimized using accurate all-electron, relativistic density functional theory based calculations with numeric atom centered orbital basis sets. Specifically, sets of five families of (ZnO)N species are considered: single-layered and multi-layered nanocages, and bulk cut nanoparticles from the sodalite (SOD), body centered tetragonal (BCT), and wurtzite (WZ) ZnO polymorphs. Using suitable fits to interpolate and extrapolate these data allows us to assess the size-dependent energetic stabilities of each family. With increasing size our results indicate a progressive change in energetic stability from single-layered to multi-layered cage-like nanoclusters. For nanoparticles of around 2.6 nm diameter we identify a transitional region where multi-layered cages, SOD, and BCT nanostructures are very similar in energetic stability. This transition size also marks the size regime at which bottom-up nanoclusters give way to top-down bulk-cut nanoparticles. Eventually, a final crossover is found where the most stable WZ-ZnO polymorph begins to energetically dominate at N ∼ 2200. This size corresponds to an approximate nanoparticle diameter of 4.7 nm, in line with experiments reporting the observation of wurtzite crystallinity in isolated ligand-free ZnO nanoparticles of 4-5 nm size or larger.

  1. Variable Virulence and Efficacy of BCG Vaccine Strains in Mice and Correlation With Genome Polymorphisms

    Science.gov (United States)

    Zhang, Lu; Ru, Huan-wei; Chen, Fu-zeng; Jin, Chun-yan; Sun, Rui-feng; Fan, Xiao-yong; Guo, Ming; Mai, Jun-tao; Xu, Wen-xi; Lin, Qing-xia; Liu, Jun

    2016-01-01

    Bacille Calmette–Guérin (BCG), an attenuated strain of Mycobacterium bovis, is the only vaccine available for tuberculosis (TB) control. However, BCG is not an ideal vaccine and has two major limitations: BCG exhibits highly variable effectiveness against the development of TB both in pediatric and adult populations and can cause disseminated BCG disease in immunocompromised individuals. BCG comprises a number of substrains that are genetically distinct. Whether and how these genetic differences affect BCG efficacy remains largely unknown. In this study, we performed comparative analyses of the virulence and efficacy of 13 BCG strains, representing different genetic lineages, in SCID and BALB/c mice. Our results show that BCG strains of the DU2 group IV (BCG-Phipps, BCG-Frappier, BCG-Pasteur, and BCG-Tice) exhibit the highest levels of virulence, and BCG strains of the DU2 group II (BCG-Sweden, BCG-Birkhaug) are among the least virulent group. These distinct levels of virulence may be explained by strain-specific duplications and deletions of genomic DNA. There appears to be a general trend that more virulent BCG strains are also more effective in protection against Mycobacterium tuberculosis challenge. Our findings have important implications for current BCG vaccine programs and for future TB vaccine development. PMID:26643797

  2. Development and validation of cross-transferable and polymorphic DNA markers for detecting alien genome introgression in Oryza sativa from Oryza brachyantha.

    Science.gov (United States)

    Ray, Soham; Bose, Lotan K; Ray, Joshitha; Ngangkham, Umakanta; Katara, Jawahar L; Samantaray, Sanghamitra; Behera, Lambodar; Anumalla, Mahender; Singh, Onkar N; Chen, Meingsheng; Wing, Rod A; Mohapatra, Trilochan

    2016-08-01

    African wild rice Oryza brachyantha (FF), a distant relative of cultivated rice Oryza sativa (AA), carries genes for pest and disease resistance. Molecular marker assisted alien gene introgression from this wild species to its domesticated counterpart is largely impeded by the scarce availability of cross-transferable and polymorphic molecular markers that can clearly distinguish these two species. Availability of the whole genome sequence (WGS) of both species provides a unique opportunity to develop markers that are cross-transferable. We observed poor cross-transferability (~0.75 %) of O. sativa specific sequence tagged microsatellite (STMS) markers to O. brachyantha. By utilizing the genome sequence information, we developed a set of 45 low cost PCR based co-dominant polymorphic markers (STS and CAPS). These markers were found to be cross-transferable (84.78 %) between the two species and could distinguish them from each other, and thus allowed tracing of alien genome introgression. Finally, we validated a Monosomic Alien Addition Line (MAAL) carrying chromosome 1 of O. brachyantha in O. sativa background using these markers, as a proof of concept. Hence, in this study, we have identified a set of molecular markers (comprising STMS, STS and CAPS) that are capable of detecting alien genome introgression from O. brachyantha to O. sativa.

  3. Fine population structure analysis method for genomes of many.

    Science.gov (United States)

    Pan, Xuedong; Wang, Yi; Wong, Emily H M; Telenti, Amalio; Venter, J Craig; Jin, Li

    2017-10-03

    Fine population structure can be examined through the clustering of individuals into subpopulations. The clustering of individuals in large sequence datasets into subpopulations makes the calculation of subpopulation specific allele frequency possible, which may shed light on selection of candidate variants for rare diseases. However, as the magnitude of the data increases, computational burden becomes a challenge in fine population structure analysis. To address this issue, we propose fine population structure analysis (FIPSA), which is an individual-based non-parametric method for dissecting fine population structure. FIPSA maximizes the likelihood ratio of the contingency table of the allele counts multiplied by the group. We demonstrated that its speed and accuracy were superior to existing non-parametric methods when the simulated sample size was up to 5,000 individuals. When applied to real data, the method showed high resolution on the Human Genome Diversity Project (HGDP) East Asian dataset. FIPSA was independently validated on 11,257 human genomes. The group assignment given by FIPSA was 99.1% similar to those assigned based on supervised learning. Thus, FIPSA provides high resolution and is compatible with a real dataset of more than ten thousand individuals.
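
    The abstract does not spell out FIPSA's objective function, so the following is only a generic illustration of a likelihood-ratio statistic on a groups-by-alleles contingency table, namely the standard G statistic, G = 2 Σ O ln(O/E). The function name and the toy counts are hypothetical, and this sketch should not be read as the FIPSA algorithm itself.

```python
import math


def g_statistic(table):
    """Likelihood-ratio (G) statistic for a contingency table.

    table[i][j] is the count of allele j observed in group i.
    Expected counts assume that group and allele are independent.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed == 0:
                continue  # 0 * ln(0) is taken as 0
            expected = row_totals[i] * col_totals[j] / total
            g += observed * math.log(observed / expected)
    return 2.0 * g


# Hypothetical allele counts for two candidate groupings of the same 40 alleles.
no_structure = [[10, 10], [10, 10]]   # identical allele frequencies in both groups
full_structure = [[20, 0], [0, 20]]   # each group fixed for a different allele

print(g_statistic(no_structure))
print(g_statistic(full_structure))
```

    A grouping that separates individuals with different allele frequencies yields a larger statistic than one that mixes them, which is the intuition behind searching for the group assignment that maximizes a likelihood ratio.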

  4. Target selection and deselection at the Berkeley Structural Genomics Center.

    Science.gov (United States)

    Chandonia, John-Marc; Kim, Sung-Hou; Brenner, Steven E

    2006-02-01

    At the Berkeley Structural Genomics Center (BSGC), our goal is to obtain a near-complete structural complement of proteins in the minimal organisms Mycoplasma genitalium and M. pneumoniae, two closely related pathogens. Current targets for structure determination have been selected in six major stages, starting with those predicted to be most tractable to high throughput study and likely to yield new structural information. We report on the process used to select these proteins, as well as our target deselection procedure. Target deselection reduces experimental effort by eliminating targets similar to those recently solved by the structural biology community or other centers. We measure the impact of the 69 structures solved at the BSGC as of July 2004 on structure prediction coverage of the M. pneumoniae and M. genitalium proteomes. The number of Mycoplasma proteins for which the fold could first be reliably assigned based on structures solved at the BSGC (24 M. pneumoniae and 21 M. genitalium) is approximately 25% of the total resulting from work at all structural genomics centers and the worldwide structural biology community (94 M. pneumoniae and 86 M. genitalium) during the same period. As the number of structures contributed by the BSGC during that period is less than 1% of the total worldwide output, the benefits of a focused target selection strategy are apparent. If the structures of all current targets were solved, the percentage of M. pneumoniae proteins for which folds could be reliably assigned would increase from approximately 57% (391 of 687) at present to around 80% (550 of 687), and the percentage of the proteome that could be accurately modeled would increase from around 37% (254 of 687) to about 64% (438 of 687). In M. genitalium, the percentage of the proteome that could be structurally annotated based on structures of our remaining targets would rise from 72% (348 of 486) to around 76% (371 of 486), with the percentage of accurately modeled
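
    The coverage percentages quoted in this abstract follow directly from the stated counts. As a quick consistency check, the short sketch below recomputes them; the variable and function names are illustrative only, and the numbers are copied from the abstract.

```python
# (numerator, denominator, percentage quoted in the abstract)
quoted = [
    (391, 687, 57),  # M. pneumoniae: folds reliably assigned at present
    (550, 687, 80),  # M. pneumoniae: folds assigned if all targets were solved
    (254, 687, 37),  # M. pneumoniae: accurately modeled at present
    (438, 687, 64),  # M. pneumoniae: accurately modeled if all targets were solved
    (348, 486, 72),  # M. genitalium: structurally annotated at present
    (371, 486, 76),  # M. genitalium: annotated if remaining targets were solved
]


def percent(numerator, denominator):
    """Return the ratio as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


for numerator, denominator, stated in quoted:
    exact = 100 * numerator / denominator
    print(f"{numerator}/{denominator} = {exact:.1f}% (abstract: {stated}%)")

# Share of first-time fold assignments attributable to BSGC structures.
bsgc_share_pneumoniae = 100 * 24 / 94  # about 25.5
bsgc_share_genitalium = 100 * 21 / 86  # about 24.4
print(f"{bsgc_share_pneumoniae:.1f}% and {bsgc_share_genitalium:.1f}%")
```

    Both BSGC shares come out close to the "approximately 25%" stated in the abstract.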

  5. Estimating Additive and Non-Additive Genetic Variances and Predicting Genetic Merits Using Genome-Wide Dense Single Nucleotide Polymorphism Markers

    DEFF Research Database (Denmark)

    Su, Guosheng; Christensen, Ole Fredslund; Ostersen, Tage

    2012-01-01

    genetic variation of complex traits. This study presented a genomic BLUP model including additive and non-additive genetic effects, in which additive and non-additive genetic relationship matrices were constructed from information of genome-wide dense single nucleotide polymorphism (SNP) markers. In addition...... (MAD), and 4) a full model including all three genetic components (MAED). Estimates of narrow-sense heritability were 0.397, 0.373, 0.379 and 0.357 for models MA, MAE, MAD and MAED, respectively. Estimated dominance variance and additive by additive epistatic variance accounted for 5.6% and 9.......5% of the total phenotypic variance, respectively. Based on model MAED, the estimate of broad-sense heritability was 0.506. Reliabilities of genomic predicted breeding values for the animals without performance records were 28.5%, 28.8%, 29.2% and 29.5% for models MA, MAE, MAD and MAED, respectively. In addition...

  6. Genetic diversity and structure analysis based on hordein protein polymorphism in barley landrace populations from Jordan

    International Nuclear Information System (INIS)

    Baloch, A.W.; Ali, M.; Baloch, A.M.; Mangan, B.U.N.; Song, W

    2014-01-01

    Jordan is unanimously considered to be one of the centers of genetic diversity for barley, where wild barley and landraces have been grown under different climatic conditions. Genetic diversity and genetic structure based on hordein polymorphism were assessed in 90 accessions collected from four sites in Jordan. A-PAGE was used to reveal hordein polymorphism among the genotypes. A total of 29 distinct bands were identified, of which 9 were in the D, 11 in the C, and 9 in the B hordein regions. The observed genetic similarity between the populations was exceptionally high, higher than expected, which is probably due to the high gene flow estimated between them. The genetic diversity parameters did not differ greatly among the populations, indicating that local selection at a particular site did not play a key role in shaping genetic diversity. Analysis of molecular variance (AMOVA) revealed significant population structure when accessions were grouped according to population site. Of the hordein variation, 94% resided within the populations and only 8% among the populations. Both Bayesian analysis and Principal Coordinate Analysis (PCoA) concordantly demonstrated admixed genotypes in the barley landrace populations. Consequently, none of the populations was found to cluster separately according to its site. It is concluded that this approach can be useful for exploring the germplasm for genetic diversity but is perhaps not suitable for determining phylogenetic relationships in barley. (author)

  7. Structures of uncharacterised polymorphs of gallium oxide from total neutron diffraction.

    Science.gov (United States)

    Playford, Helen Y; Hannon, Alex C; Barney, Emma R; Walton, Richard I

    2013-02-18

    A structural investigation is reported of polymorphs of Ga(2)O(3) that, despite much interest in their properties, have hitherto remained uncharacterised due to structural disorder. The most crystalline sample yet reported of γ-Ga(2)O(3) was prepared by solvothermal oxidation of gallium metal in ethanolamine. Structure refinement using the Rietveld method reveals γ-Ga(2)O(3) has a defect Fd3m spinel structure, while pair distribution function analysis shows that the short-range structure is better modelled with local F43m symmetry. In further solvothermal oxidation reactions a novel gallium oxyhydroxide, Ga(5)O(7)(OH), is formed, the thermal decomposition of which reveals a new, transient gallium oxide polymorph, κ-Ga(2)O(3), before transformation into β-Ga(2)O(3). In contrast, the thermal decomposition of Ga(NO(3))(3)·9H(2)O first forms ε-Ga(2)O(3) and then β-Ga(2)O(3). Examination of in situ thermodiffraction data shows that ε-Ga(2)O(3) is always contaminated with β-Ga(2)O(3) and with this knowledge a model for its structure was deduced and refined--space group P6(3)mc with a ratio of tetrahedral/octahedral gallium of 2.2:1 in close-packed oxide layers. Importantly, thermodiffraction provides no evidence for the existence of the speculated bixbyite structured δ-Ga(2)O(3); at the early stages of thermal decomposition of Ga(NO(3))(3)·9H(2)O the first distinct phase formed is merely small particles of ε-Ga(2)O(3). Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Macromolecular structure determination in the post-genome era

    CERN Document Server

    Kuhn, P

    2001-01-01

    Recent advances in genetics, molecular biology and crystallographic instrumentation and methodology have led to a revolution in the field of Structural Molecular Biology (SMB). These combined advances have paved the way to a more complete and detailed understanding of the biological macromolecules that make up an organism, both in terms of their individual functions and also the interactions between them. In this paper we describe a large-scale, genomic approach to the three-dimensional structure determination of macromolecules and their complexes, using high-throughput methodology to streamline all aspects of the process. This task requires the development of automated high-intensity synchrotron beam lines for X-ray diffraction data collection from single crystal samples. Furthermore, these beam lines must be operated within a sophisticated software and hardware environment, which is capable of delivering a completely automated structure determination pipeline. The SMB resource at SSRL is developing a system...

  9. Refining the structure and content of clinical genomic reports.

    Science.gov (United States)

    Dorschner, Michael O; Amendola, Laura M; Shirts, Brian H; Kiedrowski, Lesli; Salama, Joseph; Gordon, Adam S; Fullerton, Stephanie M; Tarczy-Hornoch, Peter; Byers, Peter H; Jarvik, Gail P

    2014-03-01

    To effectively articulate the results of exome and genome sequencing we refined the structure and content of molecular test reports. To communicate results of a randomized controlled trial aimed at the evaluation of exome sequencing for clinical medicine, we developed a structured narrative report. With feedback from genetics and non-genetics professionals, we developed separate indication-specific and incidental findings reports. Standard test report elements were supplemented with research study-specific language, which highlighted the limitations of exome sequencing and provided detailed, structured results, and interpretations. The report format we developed to communicate research results can easily be transformed for clinical use by removal of research-specific statements and disclaimers. The development of clinical reports for exome sequencing has shown that accurate and open communication between the clinician and laboratory is ideally an ongoing process to address the increasing complexity of molecular genetic testing. © 2014 Wiley Periodicals, Inc.

  10. Evolutionary pathway of the Beijing lineage of Mycobacterium tuberculosis based on genomic deletions and mutT genes polymorphisms.

    Science.gov (United States)

    Rindi, Laura; Lari, Nicoletta; Cuccu, Barbara; Garzelli, Carlo

    2009-01-01

    Among the genotypes that prevail in the modern spectrum of Mycobacterium tuberculosis strains, the Beijing genotype is the one that causes major concern, as it is geographically widespread and it is considered hypervirulent. Comparative genomic studies have shown that Beijing strains have principally evolved through mechanisms of deletion of chromosomal regions, designated regions of difference (RD), and mutations. In this paper, we aimed to determine the evolutionary history of Beijing strains through the analysis of polymorphisms generated by deletions of large specific sequences, i.e., RD105, RD181, RD150, and RD142, and by single nucleotide substitutions in genes mutT4 and mutT2, coding for DNA repair enzymes. Based on the molecular characteristics of a collection of Beijing strains recently isolated in Tuscany, Italy, we propose a phylogenetic reconstruction of the Beijing family. According to our model, the Beijing family evolved from a M. tuberculosis progenitor following deletion of the RD207 region, an event responsible for the loss of spacers 1-34 in the direct repeat (DR) locus. The major lineages of the Beijing family then evolved via subsequent deletions of regions RD105, RD181 and RD150. In the most ancient evolutionary lineages genes mutT4 and mutT2 were in wild type configuration; the mutT4 mutation was acquired subsequent to the RD181 deletion in a progenitor strain that, in turn, gave rise to a sublineage bearing the mutT2 mutation. Within the major branches of the Beijing family, deletion of additional spacers in the DR locus led to evolution of sublineages characterized by different spoligotypes. Our evolutionary model of the Beijing family provides a deeper framework than previously proposed for epidemiologic and phylogenetic studies of circulating M. tuberculosis Beijing strains, thus allowing a more systematic and comprehensive evaluation of the relevance of Beijing strain variability.

  11. Genetic basis of olfactory cognition: extremely high level of DNA sequence polymorphism in promoter regions of the human olfactory receptor genes revealed using the 1000 Genomes Project dataset.

    Science.gov (United States)

    Ignatieva, Elena V; Levitsky, Victor G; Yudin, Nikolay S; Moshkin, Mikhail P; Kolchanov, Nikolay A

    2014-01-01

    The molecular mechanism of olfactory cognition is very complicated. Olfactory cognition is initiated by olfactory receptor proteins (odorant receptors), which are activated by olfactory stimuli (ligands). Olfactory receptors are the initial players in the signal transduction cascade producing a nerve impulse, which is transmitted to the brain. The sensitivity to a particular ligand depends on the expression level of multiple proteins involved in the process of olfactory cognition: olfactory receptor proteins, proteins that participate in the signal transduction cascade, etc. The expression level of each gene is controlled by its regulatory regions, and especially by the promoter [a region of DNA about 100-1000 base pairs long located upstream of the transcription start site (TSS)]. We analyzed single nucleotide polymorphisms using human whole-genome data from the 1000 Genomes Project and revealed an extremely high level of single nucleotide polymorphisms in promoter regions of olfactory receptor genes and HLA genes. We hypothesized that the high level of polymorphisms in olfactory receptor promoters was responsible for the diversity in regulatory mechanisms controlling the expression levels of olfactory receptor proteins. Such diversity of regulatory mechanisms may cause the great variability of olfactory cognition of numerous environmental olfactory stimuli perceived by human beings (air pollutants, human body odors, culinary odors, etc.). In turn, this variability may provide a wide range of emotional and behavioral reactions related to the vast variety of olfactory stimuli.

  12. On the allopolyploid origin and genome structure of the closely related species Hordeum secalinum and Hordeum capense inferred by molecular karyotyping.

    Science.gov (United States)

    Cuadrado, Ángeles; de Bustos, Alfredo; Jouve, Nicolás

    2017-08-01

    To provide additional information to the many phylogenetic analyses conducted within Hordeum, here the origin and interspecific affinities of the allotetraploids Hordeum secalinum and Hordeum capense were analysed by molecular karyotyping. Karyotypes were determined using genomic in situ hybridization (GISH) to distinguish the sub-genomes and , plus fluorescence in situ hybridization (FISH)/non-denaturing (ND)-FISH to determine the distribution of ten tandem repetitive DNA sequences and thus provide chromosome markers. Each chromosome pair in the six accessions analysed was identified, allowing the establishment of homologous and putative homeologous relationships. The low-level polymorphism observed among the H. secalinum accessions contrasted with the divergence recorded for the sub-genome of the H. capense accessions. Although accession H335 carries an intergenomic translocation, its chromosome structure was indistinguishable from that of H. secalinum. Hordeum secalinum and H. capense accession H335 share a hybrid origin involving Hordeum marinum subsp. gussoneanum as the genome donor and an unidentified genome progenitor. Hordeum capense accession BCC2062 either diverged, with remodelling of the sub-genome, or its genome was donated by a now extinct ancestor. A scheme of probable evolution shows the intricate pattern of relationships among the Hordeum species carrying the genome (including all H. marinum taxa and the hexaploid Hordeum brachyantherum). © The Author 2017. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved.

  13. Genetic diversity and structure of elite cotton germplasm (Gossypium hirsutum L.) using genome-wide SNP data.

    Science.gov (United States)

    Ai, XianTao; Liang, YaJun; Wang, JunDuo; Zheng, JuYun; Gong, ZhaoLong; Guo, JiangPing; Li, XueYuan; Qu, YanYing

    2017-10-01

    Cotton (Gossypium spp.) is the most important natural textile fiber crop, and Gossypium hirsutum L. is responsible for 90% of the annual cotton crop in the world. Information on cotton genetic diversity and population structure is essential for new breeding lines. In this study, we analyzed population structure and genetic diversity of 288 elite Gossypium hirsutum cultivar accessions collected from around the world, and especially from China, using genome-wide single nucleotide polymorphism (SNP) markers. The average polymorphism information content (PIC) was 0.25, indicating a relatively low degree of genetic diversity. Population structure analysis revealed extensive admixture and identified three subgroups. Phylogenetic analysis supported the subgroups identified by STRUCTURE. The results from both population structure and phylogenetic analysis were, for the most part, in agreement with pedigree information. Analysis of molecular variance revealed that a larger amount of variation was due to diversity within the groups. Establishment of genetic diversity and population structure from this study could be useful for genetic and genomic analysis and systematic utilization of the standing genetic variation in upland cotton.
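    To put the reported average PIC of 0.25 in context, the sketch below computes polymorphism information content for a single locus using the standard Botstein et al. (1980) definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. This is only an illustration, not the authors' pipeline: the function names `pic` and `mean_pic` and the example frequencies are assumptions made here. For a biallelic SNP the value cannot exceed 0.375, which is reached when both alleles have frequency 0.5.

```python
from typing import Sequence


def pic(freqs: Sequence[float]) -> float:
    """Polymorphism information content of one locus (Botstein et al. 1980).

    `freqs` holds the allele frequencies at the locus and must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity: sum of squared allele frequencies.
    homozygosity = sum(p * p for p in freqs)
    # Correction for uninformative matings: sum over allele pairs i < j.
    correction = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - correction


def mean_pic(minor_allele_freqs: Sequence[float]) -> float:
    """Average PIC over biallelic SNPs described by their minor allele frequency."""
    values = [pic([maf, 1.0 - maf]) for maf in minor_allele_freqs]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical minor allele frequencies, for illustration only.
    example_mafs = [0.5, 0.2, 0.1]
    for maf in example_mafs:
        print(f"MAF={maf:.2f}  PIC={pic([maf, 1.0 - maf]):.4f}")
    print(f"mean PIC = {mean_pic(example_mafs):.4f}")
```

    Because biallelic markers are capped at 0.375, an average of 0.25 on a SNP panel is not directly comparable with PIC values from multi-allelic markers such as microsatellites.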

  14. The pattern of polymorphism in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    2005-07-01

    Full Text Available We resequenced 876 short fragments in a sample of 96 individuals of Arabidopsis thaliana that included stock center accessions as well as a hierarchical sample from natural populations. Although A. thaliana is a selfing weed, the pattern of polymorphism in general agrees with what is expected for a widely distributed, sexually reproducing species. Linkage disequilibrium decays rapidly, within 50 kb. Variation is shared worldwide, although population structure and isolation by distance are evident. The data fail to fit standard neutral models in several ways. There is a genome-wide excess of rare alleles, at least partially due to selection. There is too much variation between genomic regions in the level of polymorphism. The local level of polymorphism is negatively correlated with gene density and positively correlated with segmental duplications. Because the data do not fit theoretical null distributions, attempts to infer natural selection from polymorphism data will require genome-wide surveys of polymorphism in order to identify anomalous regions. Despite this, our data support the utility of A. thaliana as a model for evolutionary functional genomics.

  15. The complete chloroplast genome of Primulina and two novel strategies for development of high polymorphic loci for population genetic and phylogenetic studies.

    Science.gov (United States)

    Feng, Chao; Xu, Meizhen; Feng, Chen; von Wettberg, Eric J B; Kang, Ming

    2017-11-07

    Primulina Hance is an emerging model for studying evolutionary divergence, adaptation and speciation of the karst flora. However, phylogenetic relationships within the genus have not been resolved due to low variation detected in the cpDNA regions. Chloroplast genomes can provide important information for phylogenetic and population genetic studies. Recent advances in next-generation sequencing (NGS) techniques greatly facilitate sequencing whole chloroplast genomes for multiple individuals. Consequently, novel strategies for development of highly polymorphic loci for population genetic and phylogenetic studies based on NGS data are needed. For development of highly polymorphic loci for population genetic and phylogenetic studies, two novel strategies are proposed here. The first protocol develops lineage-specific highly variable markers from the true high variation regions (Con_Seas) across whole cp genomes, instead of traditional noncoding regions. The pipeline has been integrated into a single perl script, and named "Con_Sea_Identification_and_PIC_Calculation". The second method assembles chloroplast fragments (poTs) and sub-super-marker (CpContigs) through our "SACRing" pipeline. This approach can fundamentally alter the strategies used in phylogenetic and population genetic studies based on cp markers, facilitating a transition from traditional Sanger sequencing to RAD-Seq. Both of these scripts are available at https://github.com/scbgfengchao/ . Three complete Primulina chloroplast genomes were assembled from genome survey data, and then two novel strategies were developed to yield highly polymorphic markers. For experimental evaluation of the first protocol, a set of Primulina species were used for PCR amplification. The results showed that these newly developed markers are more variable than traditional ones, and seem to be a better choice for phylogenetic and population studies in Primulina. The second method was also successfully applied in population

  16. Genomic structure, expression and association study of the porcine FSD2.

    Science.gov (United States)

    Lim, Kyu-Sang; Lee, Kyung-Tai; Lee, Si-Woo; Chai, Han-Ha; Jang, Gulwon; Hong, Ki-Chang; Kim, Tae-Hun

    2016-09-01

    The fibronectin type III and SPRY domain containing 2 (FSD2) gene on porcine chromosome 7 is considered a candidate gene for pork quality, since its two domains, the fibronectin type III and SPRY domains, were first identified in fibronectin and ryanodine receptor, respectively, which are candidate genes for meat quality. The aim of this study was to elucidate the genomic structure of FSD2 and functions of single nucleotide polymorphisms (SNPs) within FSD2 that are related to meat quality in pigs. Using a bacterial artificial chromosome clone sequence, we revealed that porcine FSD2 consisted of 13 exons encoding 750 amino acids. In addition, FSD2 was expressed in heart, longissimus dorsi muscle, psoas muscle, and tendon among 23 kinds of porcine tissues tested. A total of ten SNPs, including four missense mutations, were identified in the exonic region of FSD2, and two major haplotypes were obtained based on the SNP genotypes of 633 Berkshire pigs. Both haplotypes were associated significantly with intramuscular fat content (IMF, P meat color, affecting yellowness (P = 0.002). These haplotype effects were further supported by the alteration of putative protein structures with amino acid substitutions. Taken together, our results suggest that FSD2 haplotypes are involved in regulating meat quality including IMF, MP, and meat color in pigs, and may be used as meaningful molecular markers to identify pigs with preferable pork quality.

  17. Structures of mono-unsaturated triacylglycerols. I. The beta1 polymorph.

    Science.gov (United States)

    van Mechelen, Jan B; Peschar, Rene; Schenk, Henk

    2006-12-01

    The crystal structures of the beta1 polymorphs of mono-unsaturated triacylglycerols have been solved from high-resolution laboratory and synchrotron powder diffraction data for five pure compounds, the 1,3-dimyristoyl-2-oleoylglycerol (beta1-MOM), 1,3-dipalmitoyl-2-oleoylglycerol (beta1-POP), 1,3-distearoyl-2-oleoylglycerol (beta1-SOS), 1-palmitoyl-2-oleoyl-3-stearoylglycerol (beta1-POS), 1-stearoyl-2-oleoyl-3-arachidoylglycerol (beta1-SOA) and three mixtures: the co-crystallized 1:1 molar mixture of SOS and POP [beta1-SOS/POP (1:1)] and two cocoa butters from Bahia and Ivory Coast, both in their beta-VI (=beta1) polymorph. All eight beta1 structures crystallized in the space group (P2(1)/n) and have two short cell axes (5.44-5.46 and 8.18-8.22 Å), as well as a very long b axis (112-135 Å). The dominant-zone problem in the indexing of the powder patterns was solved with the special brute-force indexing routine LSQDETC from the POWSIM program. Structures were solved using the direct-space parallel-tempering method FOX and refined with GSAS. Along the b axis, alternations of inversion-centre-related 'three-packs' can be discerned. Each 'three-pack' has a central oleic zone, with oleic acyl chains of the molecules being packed together, that is sandwiched between two saturated-chain zones. The conformation of the triacylglycerol molecules is relatively 'flat' because the least-squares planes through the saturated chains and those through the saturated parts of the olein chain are parallel. The solution of the beta1 structures is a step forward towards understanding the mechanism of fat-bloom formation in dark chocolate and has led to a reexamination of the beta2 structural model [see van Mechelen et al. (2006). Acta Cryst. B62, 1131-1138].

  18. Training set optimization under population structure in genomic selection.

    Science.gov (United States)

    Isidro, Julio; Jannink, Jean-Luc; Akdemir, Deniz; Poland, Jesse; Heslot, Nicolas; Sorrells, Mark E

    2015-01-01

    Population structure must be evaluated before optimization of the training set population. Maximizing the phenotypic variance captured by the training set is important for optimal performance. The optimization of the training set (TRS) in genomic selection has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five different TRS sampling algorithms, stratified sampling, mean of the coefficient of determination (CDmean), mean of predictor error variance (PEVmean), stratified CDmean (StratCDmean) and random sampling, were evaluated for prediction accuracy in the presence of different levels of population structure. In the presence of population structure, the most phenotypic variation captured by a sampling method in the TRS is desirable. The wheat dataset showed mild population structure, and CDmean and stratified CDmean methods showed the highest accuracies for all the traits except for test weight and heading date. The rice dataset had strong population structure and the approach based on stratified sampling showed the highest accuracies for all traits. In general, CDmean minimized the relationship between genotypes in the TRS, maximizing the relationship between TRS and the test set. This makes it suitable as an optimization criterion for long-term selection. Our results indicated that the best selection criterion used to optimize the TRS seems to depend on the interaction of trait architecture and population structure.

  19. Recognizing genes and other components of genomic structure

    Energy Technology Data Exchange (ETDEWEB)

    Burks, C. (Los Alamos National Lab., NM (USA)); Myers, E. (Arizona Univ., Tucson, AZ (USA). Dept. of Computer Science); Stormo, G.D. (Colorado Univ., Boulder, CO (USA). Dept. of Molecular, Cellular and Developmental Biology)

    1991-01-01

    The Aspen Center for Physics (ACP) sponsored a three-week workshop, with 26 scientists participating, from 28 May to 15 June, 1990. The workshop, entitled Recognizing Genes and Other Components of Genomic Structure, focussed on discussion of current needs and future strategies for developing the ability to identify and predict the presence of complex functional units on sequenced, but otherwise uncharacterized, genomic DNA. We addressed the need for computationally-based, automatic tools for synthesizing available data about individual consensus sequences and local compositional patterns into the composite objects (e.g., genes) that are, as composite entities, the true object of interest when scanning DNA sequences. The workshop was structured to promote sustained informal contact and exchange of expertise between molecular biologists, computer scientists, and mathematicians. No participant stayed for less than one week, and most attended for two or three weeks. Computers, software, and databases were available for use as "electronic blackboards" and as the basis for collaborative exploration of ideas being discussed and developed at the workshop. 23 refs., 2 tabs.

  20. Structural constraints in the packaging of bluetongue virus genomic segments.

    Science.gov (United States)

    Burkhardt, Christiane; Sung, Po-Yu; Celma, Cristina C; Roy, Polly

    2014-10-01

    The mechanism used by bluetongue virus (BTV) to ensure the sorting and packaging of its 10 genomic segments is still poorly understood. In this study, we investigated the packaging constraints for two BTV genomic segments from two different serotypes. Segment 4 (S4) of BTV serotype 9 was mutated sequentially and packaging of mutant ssRNAs was investigated by two newly developed RNA packaging assay systems, one in vivo and the other in vitro. Modelling of the mutated ssRNA followed by biochemical data analysis suggested that a conformational motif formed by interaction of the 5' and 3' ends of the molecule was necessary and sufficient for packaging. A similar structural signal was also identified in S8 of BTV serotype 1. Furthermore, the same conformational analysis of secondary structures for positive-sense ssRNAs was used to generate a chimeric segment that maintained the putative packaging motif but contained unrelated internal sequences. This chimeric segment was packaged successfully, confirming that the motif identified directs the correct packaging of the segment. © 2014 The Authors.

  1. The effect of single nucleotide polymorphisms in G-rich regions of high-risk human papillomaviruses on structural diversity of DNA.

    Science.gov (United States)

    Marušič, Maja; Hošnjak, Lea; Krafčikova, Petra; Poljak, Mario; Viglasky, Viktor; Plavec, Janez

    2017-05-01

    Infection with high-risk human papillomaviruses (HPVs) can lead to development of cancer of the head and neck and anogenital regions. G-rich sequences found in genomes of high-risk HPVs can fold into non-canonical secondary structures that could serve as 3D motifs distinct from double-stranded DNA and present recognition sites for ligands and opportunity for gene expression modulation. A combination of UV, CD and NMR spectroscopy and PAGE was used, as these methods offer complementary insights into structural changes of G-rich oligonucleotides. The G-rich region of HPV16 is shown to preferentially form hairpin structures, while regions of HPV18, HPV52 and HPV58 fold into four-stranded DNA structures called G-quadruplexes. Single nucleotide polymorphisms found in G-rich sequences promoted formation of hairpin structures of HPV16 and affected the number of species formed in the G-rich region of HPV52, whereas they had minimal effect on the formation of HPV18 and HPV58 G-quadruplex structures. These structural changes were reflected in differences in apparent thermal stabilities. The potential of G-rich sequences as drug targets was evaluated based on the results of the current study. HPV16 and HPV18 are considered less appropriate targets due to several single nucleotide polymorphisms and low stability, respectively. On the other hand, HPV52 and HPV58 could be used for small-molecule mediated stabilization. G-rich sequences occurring in high-risk HPVs can fold into hairpin and G-quadruplex structures that could be potentially utilized as drug targets. This article is part of a Special Issue entitled "G-quadruplex" Guest Editor: Dr. Concetta Giancola and Dr. Daniela Montesarchio. Copyright © 2016 Elsevier B.V. All rights reserved.

  2. Identification of genomic indels and structural variations using split reads

    Directory of Open Access Journals (Sweden)

    Urban Alexander E

    2011-07-01

    Full Text Available Abstract Background: Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs) in the human population. With the development of the next-generation sequencing technologies, high-throughput surveys of SVs on the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC), a sequence-based method for SV detection. Results: We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g. scoring more highly events gapped in the center of a read). All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions). A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models). This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions). We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events) allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs. Conclusions: Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole

  3. Inference of the Genetic Polymorphisms of CYP2D6 in Six Subtribes of the Malaysian Orang Asli from Whole-Genome Sequencing Data.

    Science.gov (United States)

    Yu, Choo Yee; Ang, Geik Yong; Subramaniam, Vinothini; Johari James, Richard; Ahmad, Aminuddin; Abdul Rahman, Thuhairah; Mohd Nor, Fadzilah; Shaari, Syahrul Azlin; Teh, Lay Kek; Salleh, Mohd Zaki

    2017-07-01

    CYP2D6 is one of the major enzymes in the cytochrome P450 monooxygenase system. It metabolizes ∼25% of prescribed drugs and hence, the genetic diversity of the CYP2D6 gene has continued to be of great interest to the medical and pharmaceutical industries. This study was designed to perform a systematic analysis of the CYP2D6 gene in six subtribes of the Malaysian Orang Asli. Genomic DNA was extracted from blood samples, followed by whole-genome sequencing. The reads were aligned to the reference human genome hg19 and variants in the CYP2D6 gene were analyzed. CYP2D6*5 and duplication of CYP2D6 were analyzed using previously established methods. A total of 72 single nucleotide polymorphisms were identified. CYP2D6*1, *2, *4, *5, *10, *41, and duplication of the gene were found in the Orang Asli, whereby CYP2D6*2 and *41 alleles are reported for the first time in the Malaysian population. The findings in this study provide insights into the genetic polymorphisms of CYP2D6 in the Orang Asli of Peninsular Malaysia.

  4. Characterization of the complete mitochondrial genome and a set of polymorphic microsatellite markers through next-generation sequencing for the brown brocket deer Mazama gouazoubira.

    Science.gov (United States)

    Caparroz, Renato; Mantellatto, Aline M B; Bertioli, David J; Figueiredo, Marina G; Duarte, José Maurício B

    2015-01-01

    The complete mitochondrial genome of the brown brocket deer Mazama gouazoubira and a set of polymorphic microsatellite markers were identified by 454-pyrosequencing. De novo genome assembly recovered 98% of the mitochondrial genome with a mean coverage of 9-fold. The mitogenome consisted of 16,356 base pairs that included 13 protein-coding genes, two ribosomal subunit genes, 22 transfer RNAs and the control region, as found in other deer. The genetic divergence between the mitogenome described here and a previously published report was ∼0.5%, with the control region and ND5 gene showing the highest intraspecific variation. Seven polymorphic loci were characterized using 15 unrelated individuals; there was moderate genetic variation across most loci (mean of 5.6 alleles/locus, mean expected heterozygosity = 0.70), with only one locus deviating significantly from Hardy-Weinberg equilibrium, probably because of null alleles. Marker independence was confirmed with tests for linkage disequilibrium. The genetic variation of the mitogenome and characterization of microsatellite markers will provide useful tools for assessing the phylogeography and population genetic patterns in M. gouazoubira, particularly in the context of habitat fragmentation in South America.
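
    The expected heterozygosity reported for the microsatellite loci is conventionally computed from allele frequencies as one minus the sum of squared frequencies. The following is a minimal sketch of that calculation, with an optional small-sample correction (Nei's unbiased estimator) that is an assumption here, since the abstract does not state which estimator was used. The allele frequencies in the example are hypothetical.

```python
def expected_heterozygosity(allele_freqs, n_gene_copies=None):
    """Expected heterozygosity at one locus: He = 1 - sum(p_i ** 2).

    If n_gene_copies (2 x number of diploid individuals) is given, apply
    Nei's small-sample correction: He * n / (n - 1).
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    he = 1.0 - sum(p * p for p in allele_freqs)
    if n_gene_copies is not None:
        he *= n_gene_copies / (n_gene_copies - 1)
    return he


# Hypothetical locus with four alleles (not data from the study).
freqs = [0.4, 0.3, 0.2, 0.1]
print(round(expected_heterozygosity(freqs), 2))
# With 15 diploid individuals there are 30 gene copies.
print(round(expected_heterozygosity(freqs, n_gene_copies=30), 3))
```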

  5. Characterization of the complete mitochondrial genome and a set of polymorphic microsatellite markers through next-generation sequencing for the brown brocket deer Mazama gouazoubira

    Directory of Open Access Journals (Sweden)

    Renato Caparroz

    2015-09-01

    Full Text Available The complete mitochondrial genome of the brown brocket deer Mazama gouazoubira and a set of polymorphic microsatellite markers were identified by 454-pyrosequencing. De novo genome assembly recovered 98% of the mitochondrial genome with a mean coverage of 9-fold. The mitogenome consisted of 16,356 base pairs that included 13 protein-coding genes, two ribosomal subunit genes, 22 transfer RNAs and the control region, as found in other deer. The genetic divergence between the mitogenome described here and a previously published report was ∼0.5%, with the control region and ND5 gene showing the highest intraspecific variation. Seven polymorphic loci were characterized using 15 unrelated individuals; there was moderate genetic variation across most loci (mean of 5.6 alleles/locus, mean expected heterozygosity = 0.70), with only one locus deviating significantly from Hardy-Weinberg equilibrium, probably because of null alleles. Marker independence was confirmed with tests for linkage disequilibrium. The genetic variation of the mitogenome and characterization of microsatellite markers will provide useful tools for assessing the phylogeography and population genetic patterns in M. gouazoubira, particularly in the context of habitat fragmentation in South America.

  6. Influences of the G2350A polymorphism in the ACE Gene on cardiac structure and function of ball game players

    Directory of Open Access Journals (Sweden)

    Jang Yongwoo

    2012-01-01

    Full Text Available Abstract Background Except for the I/D polymorphism in the angiotensin I-converting enzyme (ACE) gene, there have been few reports on the relationship between other genetic polymorphisms in this gene and changes in the cardiac structure and function of athletes. Thus, we investigated whether the G2350A polymorphism in the ACE gene is associated with changes in the cardiac structure and function of ball game players. A total of 85 healthy subjects were recruited for this study, comprising 35 controls and 50 ball game players. Cardiac structure and function were measured by 2-D echocardiography, and the G2350A polymorphism in the ACE gene was analyzed by the SNaPshot method. Results There were significant differences in left ventricular mass index (LVmassI) value among the sporting disciplines studied. In particular, basketball athletes showed the highest LVmassI value of the sporting disciplines studied (p [...] ACE gene in both the controls and ball game players. Conclusions Our data suggest that the G2350A polymorphism in the ACE gene may not significantly contribute to changes in the cardiac structure and function of ball game players, although the sporting discipline may influence changes in the LVmassI value of these athletes. Further studies using a larger sample size and other genetic markers in the ACE gene will be needed.

  7. A catalog of neutral and deleterious polymorphism in yeast.

    Directory of Open Access Journals (Sweden)

    Scott W Doniger

    2008-08-01

    Full Text Available The abundance and identity of functional variation segregating in natural populations is paramount to dissecting the molecular basis of quantitative traits as well as human genetic diseases. Genome sequencing of multiple organisms of the same species provides an efficient means of cataloging rearrangements, insertion or deletion polymorphisms (InDels), and single-nucleotide polymorphisms (SNPs). While inbreeding depression and heterosis imply that a substantial amount of polymorphism is deleterious, distinguishing deleterious from neutral polymorphism remains a significant challenge. To identify deleterious and neutral DNA sequence variation within Saccharomyces cerevisiae, we sequenced the genomes of a vineyard strain and an oak tree strain and compared them to a reference genome. Among these three strains, 6% of the genome is variable, mostly attributable to variation in genome content that results from large InDels. Out of the 88,000 polymorphisms identified, 93% are SNPs and a small but significant fraction can be attributed to recent interspecific introgression and ectopic gene conversion. In comparison to the reference genome, there is substantial evidence for functional variation in gene content and structure that results from large InDels, frame-shifts, and polymorphic start and stop codons. Comparison of polymorphism to divergence reveals scant evidence for positive selection but an abundance of evidence for deleterious SNPs. We estimate that 12% of coding and 7% of noncoding SNPs are deleterious. Based on divergence among 11 yeast species, we identified 1,666 nonsynonymous SNPs that disrupt conserved amino acids and 1,863 noncoding SNPs that disrupt conserved noncoding motifs. The deleterious coding SNPs include those known to affect quantitative traits, and a subset of the deleterious noncoding SNPs occurs in the promoters of genes that show allele-specific expression, implying that some cis-regulatory SNPs are deleterious. Our results show that

  8. Fine definition of the pedigree haplotypes of closely related rice cultivars by means of genome-wide discovery of single-nucleotide polymorphisms

    Directory of Open Access Journals (Sweden)

    Shibaya Taeko

    2010-04-01

    Full Text Available Abstract Background To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information by comparison of the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition. Results The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7× the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67 051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity. Conclusions Detection of genome-wide SNPs by both high-throughput sequencer and typing array made it possible to evaluate genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process. We also found several
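
    The sequencing-depth figure quoted in this abstract follows from the usual relation fold coverage = total sequenced bases / genome size. The short sketch below works backwards from the two numbers given in the abstract (5.89 Gb and 15.7×) to the genome size they imply. The function names are illustrative, and the result is only the value implied by those two figures, not a number stated in the source.

```python
def fold_coverage(total_bases, genome_size):
    """Average depth of coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size


def implied_genome_size(total_bases, coverage):
    """Genome size implied by a sequencing yield and its stated coverage."""
    return total_bases / coverage


# Figures quoted in the abstract: 5.89 Gb of sequence, stated as 15.7x coverage.
size = implied_genome_size(total_bases=5.89e9, coverage=15.7)
print(f"implied genome size: {size / 1e6:.0f} Mb")
print(f"round trip coverage: {fold_coverage(5.89e9, size):.1f}x")
```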

  9. Whole-Genome Characteristics and Polymorphic Analysis of Vietnamese Rice Landraces as a Comprehensive Information Resource for Marker-Assisted Selection

    Directory of Open Access Journals (Sweden)

    Hien Trinh

    2017-01-01

    Full Text Available Next generation sequencing technologies have provided numerous opportunities for application in the study of whole plant genomes. In this study, we present the sequencing and bioinformatic analyses of five typical rice landraces including three indica and two japonica with potential blast resistance. A total of 688.4 million 100 bp paired-end reads have yielded approximately 30-fold coverage to compare with the Nipponbare reference genome. Among them, a small number of reads were mapped to both chromosomes and organellar genomes. Over two million and eight hundred thousand single nucleotide polymorphisms (SNPs) and insertions and deletions (InDels) in indica and japonica lines have been determined, which potentially have significant impacts on multiple transcripts of genes. SNP deserts, contiguous SNP-low regions, were found on chromosomes 1, 4, and 5 of all genomes of rice examined. Based on the distribution of SNPs per 100 kilobase pairs, the phylogenetic relationships among the landraces have been constructed. This is the first step towards revealing several salient features of rice genomes in Vietnam and providing significant information resources to further marker-assisted selection (MAS) in rice breeding programs.

  10. Combined crystal structure prediction and high-pressure crystallization in rational pharmaceutical polymorph screening

    DEFF Research Database (Denmark)

    Neumann, M A; van de Streek, J; Fabbiani, F P A

    2015-01-01

    Organic molecules, such as pharmaceuticals, agro-chemicals and pigments, frequently form several crystal polymorphs with different physicochemical properties. Finding polymorphs has long been a purely experimental game of trial-and-error. Here we utilize in silico polymorph screening in combination...

  11. Determination of the frequency of polymorphisms in genes related to the genome stability maintenance of the population residing at Monte Alegre, PA (Brazil) municipality

    International Nuclear Information System (INIS)

    Hozumi, Cristiny Gomes

    2010-01-01

    Human exposure to ionizing radiation from natural sources is an inherent feature of life on earth, since humans and all living things have always been exposed to these sources. Ionizing radiation is a known genotoxic agent that can affect genomic stability, and genes related to DNA repair may play a role when they carry certain polymorphisms. This study aimed to analyze the frequency of polymorphisms (SNPs) in genes of DNA repair and cell cycle control, hOGG1 (Ser326Cys), XRCC3 (Thr241Met) and p53 (Arg72Pro), in saliva samples from a population located in Monte Alegre, state of Para. The samples were collected in August 2008 from 40 men and 46 women, a total of 86 samples. The frequencies of homozygous and heterozygous genotypes for the polymorphic genes were determined by RFLP. The 326Cys allele of the hOGG1 gene had a frequency of 5%, the 241Met allele of the XRCC3 gene about 21%, and the 72Pro allele of the p53 gene 40.8%. The genotype frequencies for hOGG1, XRCC3 and p53, respectively, were 91.04%, 88.06% and 59.7% for the homozygous wild-type genotype, 5.97%, 11.94% and 22.39% for the heterozygous genotype, and 2.99%, zero and 17.91% for the homozygous polymorphic genotype. These values are similar to those found in previous studies. The influence of these polymorphisms, which are involved in DNA repair, and the consequent genotoxicity induced by radiation depend on dose and on exposure factors such as smoking, which is statistically a factor in public health surveillance in the region. This study gathered molecular epidemiology information in Monte Alegre that helps to characterize the local population. (author)

  12. A MITE-based genotyping method to reveal hundreds of DNA polymorphisms in an animal genome after a few generations of artificial selection

    Directory of Open Access Journals (Sweden)

    Tetreau Guillaume

    2008-10-01

    Full Text Available Abstract Background For most organisms, developing hundreds of genetic markers spanning the whole genome still requires excessive if not unrealistic efforts. In this context, there is an obvious need for methodologies allowing the low-cost, fast and high-throughput genotyping of virtually any species, such as the Diversity Arrays Technology (DArT). One of the crucial steps of the DArT technique is the genome complexity reduction, which allows obtaining a genomic representation characteristic of the studied DNA sample and necessary for subsequent genotyping. In this article, using the mosquito Aedes aegypti as a study model, we describe a new genome complexity reduction method taking advantage of the abundance of miniature inverted repeat transposable elements (MITEs) in the genome of this species. Results Ae. aegypti genomic representations were produced following a two-step procedure: (1) restriction digestion of the genomic DNA and simultaneous ligation of a specific adaptor to compatible ends, and (2) amplification of restriction fragments containing a particular MITE element called Pony using two primers, one annealing to the adaptor sequence and one annealing to a conserved sequence motif of the Pony element. Using this protocol, we constructed a library comprising more than 6,000 DArT clones, of which at least 5.70% were highly reliable polymorphic markers for two closely related mosquito strains separated by only a few generations of artificial selection. Within this dataset, linkage disequilibrium was low, and marker redundancy was evaluated at 2.86% only. Most of the detected genetic variability was observed between the two studied mosquito strains, but individuals of the same strain could still be clearly distinguished. Conclusion The new complexity reduction method was particularly efficient to reveal genetic polymorphisms in Ae. aegypti. Overall, our results testify to the flexibility of the DArT genotyping technique and open new

  13. DCDC2 polymorphism is associated with left temporoparietal gray and white matter structures during development.

    Science.gov (United States)

    Darki, Fahimeh; Peyrard-Janvid, Myriam; Matsson, Hans; Kere, Juha; Klingberg, Torkel

    2014-10-22

    Three genes, DYX1C1, DCDC2, and KIAA0319, have been previously associated with dyslexia, neuronal migration, and ciliary function. Three polymorphisms within these genes, rs3743204 (DYX1C1), rs793842 (DCDC2), and rs6935076 (KIAA0319) have also been linked to normal variability of left temporoparietal white matter volume connecting the middle temporal cortex to the angular and supramarginal gyri. Here, we assessed whether these polymorphisms are also related to the cortical thickness of the associated regions during childhood development using a longitudinal dataset of 76 randomly selected children and young adults who were scanned up to three times each, 2 years apart. rs793842 in DCDC2 was significantly associated with the thickness of left angular and supramarginal gyri as well as the left lateral occipital cortex. The cortex was significantly thicker for T-allele carriers, who also had lower white matter volume and lower reading comprehension scores. There was a negative correlation between white matter volume and cortical thickness, but only white matter volume predicted reading comprehension 2 years after scanning. These results show how normal variability in reading comprehension is related to gene, white matter volume, and cortical thickness in the inferior parietal lobe. Possibly, the variability of gray and white matter structures could both be related to the role of DCDC2 in ciliary function, which affects both neuronal migration and axonal outgrowth.

  14. The New Superconductor tP-SrPd2Bi2: Structural Polymorphism and Superconductivity in Intermetallics.

    Science.gov (United States)

    Xie, Weiwei; Seibel, Elizabeth M; Cava, Robert J

    2016-04-04

    We consider a system where structural polymorphism suggests the possible existence of superconductivity through the implied structural instability. SrPd2Bi2 has two polymorphs, which can be controlled by the synthesis temperature: a tetragonal form (CaBe2Ge2-type) and a monoclinic form (BaAu2Sb2-type). Although the crystallographic difference between the two forms may, at first, seem trivial, we show that tetragonal SrPd2Bi2 is superconducting at 2.0 K, whereas monoclinic SrPd2Bi2 is not. We rationalize this finding and place it in context with other 1-2-2 phases.

  15. RNA structural constraints in the evolution of the influenza A virus genome NP segment

    NARCIS (Netherlands)

    A.P. Gultyaev (Alexander); A. Tsyganov-Bodounov (Anton); M.I. Spronken (Monique); S. Van Der Kooij (Sander); R.A.M. Fouchier (Ron); R.C.L. Olsthoorn (René)

    2014-01-01

    Conserved RNA secondary structures were predicted in the nucleoprotein (NP) segment of the influenza A virus genome using comparative sequence and structure analysis. A number of structural elements exhibiting nucleotide covariations were identified over the whole segment length,

  16. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    Energy Technology Data Exchange (ETDEWEB)

    Arneodo, Alain, E-mail: alain.arneodo@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Vaillant, Cedric, E-mail: cedric.vaillant@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Audit, Benjamin, E-mail: benjamin.audit@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Argoul, Francoise, E-mail: francoise.argoul@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); D' Aubenton-Carafa, Yves, E-mail: daubenton@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France); Thermes, Claude, E-mail: claude.thermes@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France)

    2011-02-15

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  17. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain

    DEFF Research Database (Denmark)

    Sükösd, Zsuzsanna; Andersen, Ebbe Sloth; Seemann, Ernst Stefan

    2015-01-01

    of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interesting, a set of long distance interactions form a core organizing structure (COS) that organize the genome into three major structural domains. Despite overlapping...

  18. Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies

    Science.gov (United States)

    Medina, Ignacio; Montaner, David; Bonifaci, Nuria; Pujana, Miguel Angel; Carbonell, José; Tarraga, Joaquin; Al-Shahrour, Fatima; Dopazo, Joaquin

    2009-01-01

    Genome-wide association studies have become a popular strategy to find associations of genes to traits of interest. Despite the high resolution available today for genotyping studies, the success of its application in real studies has been limited by the testing strategy used. As an alternative to brute force solutions involving the use of very large cohorts, we propose the use of the Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), which is a simple implementation of the GSA strategy for the analysis of genome-wide association studies, provides a significant increase in testing power for this type of study. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap/. PMID:19502494

  19. No evidence of association between structural polymorphism at the dopamine D3 receptor locus and alcoholism in the Japanese

    Energy Technology Data Exchange (ETDEWEB)

    Higuchi, Susumu; Muramatsu, Taro; Matsushita, Sachio [National Institute on Alcoholism, Kanagawa (Japan); Murayama, Masanobu [Akagi Kougen Hospital, Gunma (Japan)

    1996-07-26

    Dopaminergic systems mediate reward mechanisms and are involved in reinforcing self-administration of dependence-forming substances, including alcohol. Studies have reported that polymorphisms of the dopamine D2 receptor, whose structure and function are similar to those of the dopamine D3 receptor, increase the susceptibility to alcoholism. The observations led to the examination of the possible association between a structural polymorphism of the D3 receptor gene and alcoholism. Genotyping results, employing a PCR-RFLP method, showed no difference in allele and genotype frequencies of the D3 BalI polymorphism (Ser⁹/Gly⁹) between Japanese alcoholics and controls. Moreover, these frequencies were not altered in alcoholics with inactive aldehyde dehydrogenase-2 (ALDH2), a well-defined negative risk factor for alcoholism. These results strongly suggest that the dopamine D3 receptor is not associated with alcoholism. 19 refs., 1 fig., 1 tab.

  20. Genomic structure and expression of immunoglobulins in Squamata.

    Science.gov (United States)

    Olivieri, David N; Garet, Elina; Estevez, Olivia; Sánchez-Espinel, Christian; Gambón-Deza, Francisco

    2016-04-01

    The Squamata order represents a major evolutionary reptile lineage, yet the structure and expression of immunoglobulins in this order has been scarcely studied in detail. From the genome sequences of four Squamata species (Gekko japonicus, Ophisaurus gracilis, Pogona vitticeps and Ophiophagus hannah) and RNA-seq datasets from 18 other Squamata species, we identified the immunoglobulins present in these animals as well as the tissues in which they are found. All Squamata have at least three immunoglobulin classes; namely, the immunoglobulins M, D, and Y. Unlike mammals, however, we provide evidence that some Squamata lineages possess more than one Cμ gene which is located downstream from the Cδ gene. The existence of two evolutionary lineages of immunoglobulin Y is shown. Additionally, it is demonstrated that while all Squamata species possess the λ light chain, only Iguanidae species possess the κ light chain.

  1. Structural characterization of genomes by large scale sequence-structure threading

    Directory of Open Access Journals (Sweden)

    Cherkasov Artem

    2004-04-01

    Full Text Available Abstract Background Using sequence-structure threading we have conducted structural characterization of complete proteomes of 37 archaeal, bacterial and eukaryotic organisms (including worm, fly, mouse and human) totaling 167,888 genes. Results The reported data represent the first rather general evaluation of the performance of full sequence-structure threading on multiple genomes, providing an opportunity to evaluate its general applicability for large scale studies. According to the estimated results, sequence-structure threading has assigned protein folds to more than 60% of eukaryotic, 68% of archaeal and 70% of bacterial proteomes. The repertoires of protein classes, architectures, topologies and homologous superfamilies (according to the CATH 2.4 classification) have been established for distant organisms and superkingdoms. It has been found that the average abundance of CATH classes decreases from "alpha and beta" to "mainly beta", followed by "mainly alpha" and "few secondary structures". 3-Layer (aba) Sandwich has been characterized as the most abundant protein architecture and Rossmann fold as the most common topology. Conclusion The analysis of genomic occurrences of CATH 2.4 protein homologous superfamilies and topologies has revealed the power-law character of their distributions. The corresponding double logarithmic "frequency – genomic occurrence" dependences characteristic of scale-free systems have been established for individual organisms and for three superkingdoms. Supplementary materials to this work are available at 1.
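
    A power-law distribution of the kind described in this abstract appears as a straight line on a double logarithmic plot, and its exponent is the slope of that line. The sketch below estimates the slope by ordinary least squares on log-transformed values. It is only an illustration of the general technique: the data are synthetic, the function name is hypothetical, and the abstract does not say how the original fits were performed.

```python
import math


def loglog_fit(xs, ys):
    """Least-squares line through (log10 x, log10 y); returns (slope, intercept)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data following frequency = 1000 * occurrence ** -2 exactly.
occurrences = [1, 2, 4, 8, 16]
frequencies = [1000 * k ** -2 for k in occurrences]
slope, intercept = loglog_fit(occurrences, frequencies)
print(f"exponent ~ {slope:.2f}, prefactor ~ {10 ** intercept:.0f}")
```

    A simple least-squares fit on logarithms is convenient for illustration; on real count data, maximum-likelihood estimators are generally considered more reliable for power-law exponents.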

  2. A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    Science.gov (United States)

    2011-01-01

    Background Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome. Results Analysis of Amborella BAC ends sequenced from each contig suggests that the density of long terminal repeat retrotransposons is negatively correlated with that of protein coding genes. Syntenic, presumably ancestral, gene blocks were identified in comparisons of the Amborella BAC contigs and the sequenced Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa genomes. Parsimony mapping of the loss of synteny corroborates previous analyses suggesting that the rate of structural change has been more rapid on lineages leading to Arabidopsis and Oryza compared with lineages leading to Populus and Vitis. The gamma paleohexaploidy event identified in the Arabidopsis, Populus and Vitis genomes is shown to have occurred after the divergence of all other known angiosperms from the lineage leading to Amborella. Conclusions When placed in the context of a physical map, BAC end sequences representing just 5.4% of the Amborella genome have facilitated reconstruction of gene blocks that existed in the last common ancestor of all flowering plants. The Amborella genome is an invaluable reference for inferences concerning the ancestral angiosperm and subsequent genome evolution. PMID:21619600

  3. Mosaic genome structure of the barley powdery mildew pathogen and conservation of transcriptional programs in divergent hosts

    Science.gov (United States)

    Hacquard, Stéphane; Kracher, Barbara; Maekawa, Takaki; Vernaldi, Saskia; Schulze-Lefert, Paul; Ver Loren van Themaat, Emiel

    2013-01-01

    Barley powdery mildew, Blumeria graminis f. sp. hordei (Bgh), is an obligate biotrophic ascomycete fungal pathogen that can grow and reproduce only on living cells of wild or domesticated barley (Hordeum sp.). Domestication and deployment of resistant barley cultivars by humans selected for amplification of Bgh isolates with different virulence combinations. We sequenced the genomes of two European Bgh isolates, A6 and K1, for comparative analysis with the reference genome of isolate DH14. This revealed a mosaic genome structure consisting of large isolate-specific DNA blocks with either high or low SNP densities. Some of the highly polymorphic blocks likely accumulated SNPs for over 10,000 years, well before the domestication of barley. These isolate-specific blocks of alternating monomorphic and polymorphic regions imply an exceptionally large standing genetic variation in the Bgh population and might be generated and maintained by rare outbreeding and frequent clonal reproduction. RNA-sequencing experiments with isolates A6 and K1 during four early stages of compatible and incompatible interactions on leaves of partially immunocompromised Arabidopsis mutants revealed a conserved Bgh transcriptional program during pathogenesis compared with the natural host barley despite ∼200 million years of reproductive isolation of these hosts. Transcripts encoding candidate-secreted effector proteins are massively induced in successive waves. A specific decrease in candidate-secreted effector protein transcript abundance in the incompatible interaction follows extensive transcriptional reprogramming of the host transcriptome and coincides with the onset of localized host cell death, suggesting a host-inducible defense mechanism that targets fungal effector secretion or production. PMID:23696672

  4. Insular Celtic population structure and genomic footprints of migration.

    Science.gov (United States)

    Byrne, Ross P; Martiniano, Rui; Cassidy, Lara M; Carrigan, Matthew; Hellenthal, Garrett; Hardiman, Orla; Bradley, Daniel G; McLaughlin, Russell L

    2018-01-01

    Previous studies of the genetic landscape of Ireland have suggested homogeneity, with population substructure undetectable using single-marker methods. Here we have harnessed the haplotype-based method fineSTRUCTURE in an Irish genome-wide SNP dataset, identifying 23 discrete genetic clusters which segregate with geographical provenance. Cluster diversity is pronounced in the west of Ireland but reduced in the east where older structure has been eroded by historical migrations. Accordingly, when populations from the neighbouring island of Britain are included, a west-east cline of Celtic-British ancestry is revealed along with a particularly striking correlation between haplotypes and geography across both islands. A strong relationship is revealed between subsets of Northern Irish and Scottish populations, where discordant genetic and geographic affinities reflect major migrations in recent centuries. Additionally, Irish genetic proximity of all Scottish samples likely reflects older strata of communication across the narrowest inter-island crossing. Using GLOBETROTTER we detected Irish admixture signals from Britain and Europe and estimated dates for events consistent with the historical migrations of the Norse-Vikings, the Anglo-Normans and the British Plantations. The influence of the former is greater than previously estimated from Y chromosome haplotypes. In all, we paint a new picture of the genetic landscape of Ireland, revealing structure which should be considered in the design of studies examining rare genetic variation and its association with traits.

  5. Insular Celtic population structure and genomic footprints of migration.

    Directory of Open Access Journals (Sweden)

    Ross P Byrne

    2018-01-01

    Full Text Available Previous studies of the genetic landscape of Ireland have suggested homogeneity, with population substructure undetectable using single-marker methods. Here we have harnessed the haplotype-based method fineSTRUCTURE in an Irish genome-wide SNP dataset, identifying 23 discrete genetic clusters which segregate with geographical provenance. Cluster diversity is pronounced in the west of Ireland but reduced in the east where older structure has been eroded by historical migrations. Accordingly, when populations from the neighbouring island of Britain are included, a west-east cline of Celtic-British ancestry is revealed along with a particularly striking correlation between haplotypes and geography across both islands. A strong relationship is revealed between subsets of Northern Irish and Scottish populations, where discordant genetic and geographic affinities reflect major migrations in recent centuries. Additionally, Irish genetic proximity of all Scottish samples likely reflects older strata of communication across the narrowest inter-island crossing. Using GLOBETROTTER we detected Irish admixture signals from Britain and Europe and estimated dates for events consistent with the historical migrations of the Norse-Vikings, the Anglo-Normans and the British Plantations. The influence of the former is greater than previously estimated from Y chromosome haplotypes. In all, we paint a new picture of the genetic landscape of Ireland, revealing structure which should be considered in the design of studies examining rare genetic variation and its association with traits.

  6. seq-seq-pan: building a computational pan-genome data structure on whole genome alignment.

    Science.gov (United States)

    Jandrasits, Christine; Dabrowski, Piotr W; Fuchs, Stephan; Renard, Bernhard Y

    2018-01-15

    The increasing application of next generation sequencing technologies has led to the availability of thousands of reference genomes, often providing multiple genomes for the same or closely related species. The current approach to represent a species or a population with a single reference sequence and a set of variations cannot represent their full diversity and introduces bias towards the chosen reference. There is a need for the representation of multiple sequences in a composite way that is compatible with existing data sources for annotation and suitable for established sequence analysis methods. At the same time, this representation needs to be easily accessible and extendable to account for the constant change of available genomes. We introduce seq-seq-pan, a framework that provides methods for adding or removing new genomes from a set of aligned genomes and uses these to construct a whole genome alignment. Throughout the sequential workflow the alignment is optimized for generating a representative linear presentation of the aligned set of genomes, that enables its usage for annotation and in downstream analyses. By providing dynamic updates and optimized processing, our approach enables the usage of whole genome alignment in the field of pan-genomics. In addition, the sequential workflow can be used as a fast alternative to existing whole genome aligners for aligning closely related genomes. seq-seq-pan is freely available at https://gitlab.com/rki_bioinformatics.

  7. Elucidating the influence of polymorph-dependent interfacial solvent structuring at chitin surfaces.

    Science.gov (United States)

    Brown, Aaron H; Walsh, Tiffany R

    2016-10-20

    Interfacial solvent structuring is thought to be influential in mediating the adsorption of biomolecules at aqueous materials interfaces. However, despite the enormous potential for exploitation of aqueous chitin interfaces in industrial, medical and drug-delivery applications, little is known at the molecular-level about such interfacial solvent structuring for chitin. Here we use molecular simulation to predict the structure of the [100] and [010] interfaces of α-chitin and β-chitin dihydrate in contact with liquid water and saline solution. We find the α-chitin [100] interface supports lateral high-density regions in the first water layer at the interface, which are also present, but not as pronounced, for β-chitin. The lateral structuring of interfacial ions at the saline/chitin interface is also more pronounced for α-chitin compared with β-chitin. Our findings provide a foundation for the systematic design of biomolecules with selective binding affinity for different chitin polymorphs. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Genome-wide association study identifies polymorphisms associated with the analgesic effect of fentanyl in the preoperative cold pressor-induced pain test

    Directory of Open Access Journals (Sweden)

    Kaori Takahashi

    2018-03-01

    Full Text Available Opioid analgesics are widely used for the treatment of moderate to severe pain. The analgesic effects of opioids are well known to vary among individuals. The present study focused on the genetic factors that are associated with interindividual differences in pain and opioid sensitivity. We conducted a multistage genome-wide association study in subjects who were scheduled to undergo mandibular sagittal split ramus osteotomy and were not medicated until they received fentanyl for the induction of anesthesia. We preoperatively conducted the cold pressor-induced pain test before and after fentanyl administration. The rs13093031 and rs12633508 single-nucleotide polymorphisms (SNPs) near the LOC728432 gene region and rs6961071 SNP in the tcag7.1213 gene region were significantly associated with the analgesic effect of fentanyl, based on differences in pain perception latency before and after fentanyl administration. The associations of these three SNPs that were identified in our exploratory study have not been previously reported. The two polymorphic loci (rs13093031 and rs12633508) were shown to be in strong linkage disequilibrium. Subjects with the G/G genotype of the rs13093031 and rs6961071 SNPs presented lower fentanyl-induced analgesia. Our findings provide a basis for investigating genetics-based analgesic sensitivity and personalized pain control. Keywords: Opioid sensitivity, Analgesia, Fentanyl, Polymorphism, GWAS

  9. Evaluation of a microarray-hybridization based method applicable for discovery of single nucleotide polymorphisms (SNPs) in the Pseudomonas aeruginosa genome

    Science.gov (United States)

    Dötsch, Andreas; Pommerenke, Claudia; Bredenbruch, Florian; Geffers, Robert; Häussler, Susanne

    2009-01-01

    Background: Whole genome sequencing techniques have added a new dimension to studies on bacterial adaptation, evolution and diversity in chronic infections. By using this powerful approach it was demonstrated that Pseudomonas aeruginosa undergoes intense genetic adaptation processes, crucial in the development of persistent disease. The challenge ahead is to identify universal infection relevant adaptive bacterial traits as potential targets for the development of alternative treatment strategies. Results: We developed a microarray-based method applicable for discovery of single nucleotide polymorphisms (SNPs) in P. aeruginosa as an easy and economical alternative to whole genome sequencing. About 50% of all SNPs theoretically covered by the array could be detected in a comparative hybridization of PAO1 and PA14 genomes at high specificity (> 0.996). Variations larger than SNPs were detected at much higher sensitivities, reaching nearly 100% for genetic differences affecting multiple consecutive probe oligonucleotides. The detailed comparison of the in silico alignment with experimental hybridization data led to the identification of various factors influencing sensitivity and specificity in SNP detection and to the identification of strain specific features such as a large deletion within the PA4684 and PA4685 genes in the Washington Genome Center PAO1. Conclusion: The application of the genome array as a tool to identify adaptive mutations, to depict genome organizations, and to identify global regulons by the "ChIP-on-chip" technique will expand our knowledge on P. aeruginosa adaptation, evolution and regulatory mechanisms of persistence on a global scale and thus advance the development of effective therapies to overcome persistent disease. PMID:19152677

  10. Learning directed acyclic graphical structures with genetical genomics data.

    Science.gov (United States)

    Gao, Bin; Cui, Yuehua

    2015-12-15

    A large amount of research effort has been focused on estimating gene networks based on gene expression data to understand the functional basis of a living organism. Such networks are often obtained by considering pairwise correlations between genes, and thus may not reflect the true connectivity between genes. By treating gene expressions as quantitative traits while considering genetic markers, genetical genomics analysis has shown its power in enhancing the understanding of gene regulation. Previous work has shown improved performance in estimating the undirected network graphical structure when genetic markers are incorporated as covariates. Because gene expressions are often due to directed regulations, it is more meaningful to estimate the directed graphical network. In this article, we introduce a covariate-adjusted Gaussian graphical model to estimate the Markov equivalence class of the directed acyclic graphs (DAGs) in a genetical genomics analysis framework. We develop a two-stage estimation procedure to first estimate the regression coefficient matrix by [Formula: see text] penalization. The estimated coefficient matrix is then used to estimate the mean values in our multi-response Gaussian model to estimate the regulatory networks of gene expressions using PC-algorithm. The estimation consistency for high dimensional sparse DAGs is established. Simulations are conducted to demonstrate our theoretical results. The method is applied to a human Alzheimer's disease dataset in which differential DAGs are identified between cases and controls. R code for implementing the method can be downloaded at http://www.stt.msu.edu/∼cui. R code for implementing the method is freely available at http://www.stt.msu.edu/∼cui/software.html. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  11. AluScan: a method for genome-wide scanning of sequence and structure variations in the human genome

    Directory of Open Access Journals (Sweden)

    Mei Lingling

    2011-11-01

    intergenic sequences with modest capture and sequencing costs, computation workload and DNA sample requirement is particularly well suited for accelerating the discovery of somatic mutations, as well as analysis of disease-predisposing germline polymorphisms, by making possible the comparative genome-wide scanning of DNA sequences from large human cohorts.

  12. Solid state characterization and crystal structure from X-ray powder diffraction of two polymorphic forms of ranitidine base.

    Science.gov (United States)

    de Armas, Héctor Novoa; Peeters, Oswald M; Blaton, Norbert; Van Gyseghem, Elke; Martens, Johan; Van Haele, Gerrit; Van Den Mooter, Guy

    2009-01-01

    Ranitidine hydrochloride (RAN-HCl), a known anti-ulcer drug, is the product of the reaction between HCl and ranitidine base (RAN-B). RAN-HCl has been extensively studied; however, this is not the case for RAN-B. The solid state characterization of RAN-B polymorphs has been carried out using different analytical techniques (microscopy, thermal analysis, Fourier transform infrared spectrometry in the attenuated total reflection mode, ¹³C-CPMAS-NMR spectroscopy and X-ray powder diffraction). The crystal structures of RAN-B form I and form II have been determined using conventional X-ray powder diffraction in combination with simulated annealing and whole profile pattern matching, and refined using rigid-body Rietveld refinement. RAN-B form I is a monoclinic polymorph with cell parameters a = 7.317(2), b = 9.021(2), c = 25.098(6) Å, β = 95.690(1)° and space group P2₁/c. Form II is orthorhombic, with a = 31.252(4), b = 13.052(2), c = 8.0892(11) Å and space group Pbca. In RAN-B polymorphs, the nitro group is involved in a strong intramolecular hydrogen bond responsible for the existence of a Z configuration in the enamine portion of the molecules. A tail to tail packing motif can be denoted via intermolecular hydrogen bonds. The crystal structures of RAN-B forms are compared to those of RAN-HCl polymorphs. RAN-B polymorphs are monotropic polymorphic pairs. © 2008 Wiley-Liss, Inc. and the American Pharmacists Association
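
    The reported cell parameters determine the unit-cell volume of each form. The following is a minimal sketch, assuming the standard monoclinic relation V = a·b·c·sin β (the orthorhombic case is β = 90°). The function name `cell_volume` is illustrative and not taken from the paper, and the standard uncertainties given in parentheses are ignored.

    ```python
    import math


    def cell_volume(a, b, c, beta_deg=90.0):
        """Unit-cell volume in cubic angstroms for a cell with alpha = gamma = 90 degrees.

        Covers the monoclinic case (beta != 90) and the orthorhombic case (beta = 90).
        """
        return a * b * c * math.sin(math.radians(beta_deg))


    # Cell parameters as reported in the abstract (lengths in angstroms, angle in degrees).
    form_i = cell_volume(7.317, 9.021, 25.098, beta_deg=95.690)  # monoclinic form I
    form_ii = cell_volume(31.252, 13.052, 8.0892)                # orthorhombic form II

    print(f"Form I volume:  {form_i:.1f} cubic angstroms")
    print(f"Form II volume: {form_ii:.1f} cubic angstroms")
    print(f"Volume ratio II/I: {form_ii / form_i:.3f}")
    ```

    The volume alone does not give a density or the number of molecules per cell; those would need information that the abstract does not report.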

  13. From structure prediction to genomic screens for novel non-coding RNAs

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.

    2011-01-01

    ... This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.

  14. Producing genome structure populations with the dynamic and automated PGS software.

    Science.gov (United States)

    Hua, Nan; Tjong, Harianto; Shin, Hanjun; Gong, Ke; Zhou, Xianghong Jasmine; Alber, Frank

    2018-05-01

    Chromosome conformation capture technologies such as Hi-C are widely used to investigate the spatial organization of genomes. Because genome structures can vary considerably between individual cells of a population, interpreting ensemble-averaged Hi-C data can be challenging, in particular for long-range and interchromosomal interactions. We pioneered a probabilistic approach for the generation of a population of distinct diploid 3D genome structures consistent with all the chromatin-chromatin interaction probabilities from Hi-C experiments. Each structure in the population is a physical model of the genome in 3D. Analysis of these models yields new insights into the causes and the functional properties of the genome's organization in space and time. We provide a user-friendly software package, called PGS, which runs on local machines (for practice runs) and high-performance computing platforms. PGS takes a genome-wide Hi-C contact frequency matrix, along with information about genome segmentation, and produces an ensemble of 3D genome structures entirely consistent with the input. The software automatically generates an analysis report, and provides tools to extract and analyze the 3D coordinates of specific domains. Basic Linux command-line knowledge is sufficient for using this software. A typical running time of the pipeline is ∼3 d with 300 cores on a computer cluster to generate a population of 1,000 diploid genome structures at topological-associated domain (TAD)-level resolution.

  15. Single nucleotide polymorphism discovery in albacore and Atlantic bluefin tuna provides insights into worldwide population structure.

    Science.gov (United States)

    Albaina, A; Iriondo, M; Velado, I; Laconcha, U; Zarraonaindia, I; Arrizabalaga, H; Pardo, M A; Lutcavage, M; Grant, W S; Estonba, A

    2013-12-01

    The optimal management of the commercially important, but mostly over-exploited, pelagic tunas, albacore (Thunnus alalunga Bonn., 1788) and Atlantic bluefin tuna (BFT; Thunnus thynnus L., 1758), requires a better understanding of population structure than has been provided by previous molecular methods. Despite numerous studies of both species, their population structures remain controversial. This study reports the development of single nucleotide polymorphisms (SNPs) in albacore and BFT and the application of these SNPs to survey genetic variability across the geographic ranges of these tunas. A total of 616 SNPs were discovered in 35 albacore tuna by comparing sequences of 54 nuclear DNA fragments. A panel of 53 SNPs yielded FST values ranging from 0.0 to 0.050 between samples after genotyping 460 albacore collected throughout the distribution of this species. No significant heterogeneity was detected within oceans, but between-ocean comparisons (Atlantic, Pacific and Indian oceans along with Mediterranean Sea) were significant. Additionally, a 17-SNP panel was developed in Atlantic BFT by cross-species amplification in 107 fish. This limited number of SNPs discriminated between samples from the two major spawning areas of Atlantic BFT (FST  = 0.116). The SNP markers developed in this study can be used to genotype large numbers of fish without the need for standardizing alleles among laboratories. © 2013 The Authors, Animal Genetics © 2013 Stichting International Foundation for Animal Genetics.
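
    The FST values quoted above measure how much of the total allelic diversity lies between samples rather than within them. The abstract does not say which estimator the authors used, so the following is only a minimal sketch of the textbook definition FST = (H_T - H_S) / H_T for a single biallelic SNP, with equally weighted samples. The function name and the allele frequencies are hypothetical.

    ```python
    def fst_biallelic(freqs):
        """Wright's F_ST for one biallelic SNP from per-sample allele frequencies.

        H_S is the mean expected heterozygosity within samples, and H_T is the
        expected heterozygosity at the mean allele frequency. Samples are
        weighted equally, which is an assumption of this sketch.
        """
        n = len(freqs)
        h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / n
        p_bar = sum(freqs) / n
        h_t = 2.0 * p_bar * (1.0 - p_bar)
        if h_t == 0.0:
            # Locus fixed for the same allele everywhere: no variation to partition.
            return 0.0
        return (h_t - h_s) / h_t


    # Hypothetical allele frequencies in two samples.
    print(fst_biallelic([0.4, 0.6]))  # modest differentiation
    print(fst_biallelic([0.5, 0.5]))  # identical frequencies
    print(fst_biallelic([0.0, 1.0]))  # alternative alleles fixed
    ```

    Published studies usually apply sample-size-corrected, multilocus estimators, so values from this sketch are not directly comparable with the figures reported in the abstract.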

  16. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  17. Structure of the genome of equine herpesvirus type 3.

    Science.gov (United States)

    Sullivan, D C; Atherton, S S; Staczek, J; O'Callaghan, D J

    1984-01-30

    Restriction endonuclease mapping studies were performed to determine the molecular structure of the genome of equine herpesvirus type 3 (EHV-3). Purified EHV-3 DNA, either unlabeled or 32P-labeled, was analyzed using the restriction enzymes BamHI, BclI, BglII, EcoRI, and HindIII. The findings that four 0.5 M (molar) fragments were present, that two of these were terminal fragments, and that all 0.5 M fragments contained homologous DNA sequences as judged by DNA hybridization analyses indicated that DNA sequences located at one terminus are repeated within the molecule and that two populations of molecules exist with regard to the arrangement of this pair of shared sequences. Mapping of BamHI, BclI, BglII, EcoRI, and HindIII fragments by double digestion of intact EHV-3 DNA, reciprocal digestion of isolated restriction enzyme fragments, and blot hybridization experiments revealed that the EHV-3 genome is a linear, double-stranded DNA molecule with a molecular size of 96.2 +/- 0.48 MDa and is comprised of two covalently linked segments, designated L (long) and S (short). The S region is approximately 22.9 MDa in size and consists of a unique segment (Us) of approximately 5.8 MDa bracketed by 8.5 MDa inverted repeat sequences that allow the S region to invert relative to the fixed L region which is approximately 73.3 MDa in size and consists only of unique sequences. Thus, these data confirm that EHV-3 DNA exists in two isomeric forms and has a molecular structure similar to that of the genomes of EHV-1 (B. E. Henry, S. A. Robinson, S. A. Dauenhauer, S. S. Atherton, G. S. Hayward, and D. J. O'Callaghan, Virology 115, 97-114, 1981; D. J. O'Callaghan, G. A. Gentry, and C. C. Randall, "The Herpesvirus," Vol. 2, pp. 215-318, Plenum, New York, 1983; D. J. O'Callaghan, B. E. Henry, J. H. Wharton, S. A. Dauenhauer, R. B. Vance, J. Staczek, and R. A. Robinson, "Developments in Molecular Virology," Vol. 1, pp. 387-418, Nijhoff, The Hague, 1981; W. T. Ruyechan, S. A. 
Dauenhauer
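
    The segment sizes reported for the EHV-3 genome can be checked against one another. The following is a minimal sketch of that arithmetic, using only the figures given in the abstract. The variable names are illustrative, and all sizes are the approximate values in MDa as reported.

    ```python
    # Sizes in MDa, as reported in the abstract (all stated as approximate).
    UNIQUE_SHORT = 5.8       # Us, the unique segment of the S region
    INVERTED_REPEAT = 8.5    # each of the two inverted repeats bracketing Us
    S_REPORTED = 22.9        # reported size of the S region
    L_REPORTED = 73.3        # reported size of the L region
    GENOME_REPORTED = 96.2   # reported size of the whole genome

    # S region = unique segment plus one inverted repeat on each side.
    s_from_parts = UNIQUE_SHORT + 2 * INVERTED_REPEAT

    # Whole genome = L region plus S region.
    genome_from_parts = L_REPORTED + S_REPORTED

    print(f"S from parts: {s_from_parts:.1f} MDa (reported {S_REPORTED})")
    print(f"Genome from parts: {genome_from_parts:.1f} MDa (reported {GENOME_REPORTED})")
    ```

    The 0.1 MDa gap between the summed S region and the reported 22.9 MDa is within the rounding implied by the word "approximately" in the abstract.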

  18. Genome-wide characterization of microsatellites in Cucumis hystrix and in silico identification of polymorphic SSR markers

    Science.gov (United States)

    Cucumis hystrix (2n = 2x = 24, genome HH) is a wild relative of cucumber (C. sativus L., 2n = 2x = 14) that possesses multiple disease resistances and has a great potential for cucumber improvement. Despite its importance, there is no genomic resource currently available for C. hystrix. To expedite ...

  19. Human glutamate pyruvate transaminase (GPT): Localization to 8q24.3, cDNA and genomic sequences, and polymorphic sites

    Energy Technology Data Exchange (ETDEWEB)

    Sohocki, M.M.; Sullivan, L.S.; Daiger, S.P. [Univ. of Texas Health Science Center, Houston, TX (United States)] [and others]

    1997-03-01

    Two frequent protein variants of glutamate pyruvate transaminase (GPT) (E.C.2.6.1.2) have been used as genetic markers in humans for more than two decades, although chromosomal mapping of the GPT locus in the 1980s produced conflicting results. To resolve this conflict and develop useful DNA markers for this gene, we isolated and characterized cDNA and genomic clones of GPT. We have definitively mapped human GPT to the terminus of 8q using several methods. First, two cosmids shown to contain the GPT sequence were derived from a chromosome 8-specific library. Second, by fluorescence in situ hybridization, we mapped the cosmid containing the human GPT gene to chromosome band 8q24.3. Third, we mapped the rat gpt cDNA to the syntenic region of rat chromosome 7. Finally, PCR primers specific to human GPT amplify sequences contained within a "half-YAC" from the long arm of chromosome 8, that is, a YAC containing the 8q telomere. The human GPT genomic sequence spans 2.7 kb and consists of 11 exons, ranging in size from 79 to 243 bp. The exonic sequence encodes a protein of 495 amino acids that is nearly identical to the previously reported protein sequence of human GPT-1. The two polymorphic GPT isozymes are the result of a nucleotide substitution in codon 14. In addition, a cosmid containing the GPT sequence also contains a previously unmapped, polymorphic microsatellite sequence, D8S421. The cloned GPT gene and associated polymorphisms will be useful for linkage and physical mapping of disease loci that map to the terminus of 8q, including atypical vitelliform macular dystrophy (VMD1) and epidermolysis bullosa simplex, type Ogna (EBS1). In addition, this will be a useful system for characterizing the telomeric region of 8q. Finally, determination of the molecular basis of the GPT isozyme variants will permit PCR-based detection of this world-wide polymorphism. 22 refs., 3 figs.

  20. Genetic Structure of the Polymorphic Metrosideros (Myrtaceae) Complex in the Hawaiian Islands Using Nuclear Microsatellite Data

    Science.gov (United States)

    Harbaugh, Danica T.; Wagner, Warren L.; Percy, Diana M.; James, Helen F.; Fleischer, Robert C.

    2009-01-01

    Background Five species of Metrosideros (Myrtaceae) are recognized in the Hawaiian Islands, including the widespread M. polymorpha, and are characterized by a multitude of distinctive, yet overlapping, habit, ecological, and morphological forms. It remains unclear, despite several previous studies, whether the morphological variation within Hawaiian Metrosideros is due to hybridization, genetic polymorphism, phenotypic plasticity, or some combination of these processes. The Hawaiian Metrosideros complex has become a model system to study ecology and evolution; however this is the first study to use microsatellite data for addressing inter-island patterns of variation from across the Hawaiian Islands. Methodology/Principal Findings Ten nuclear microsatellite loci were genotyped from 143 individuals of Metrosideros. We took advantage of the bi-parental inheritance and rapid mutation rate of these data to examine the validity of the current taxonomy and to investigate whether Metrosideros plants from the same island are more genetically similar than plants that are morphologically similar. The Bayesian algorithm of the program structure was used to define genetic groups within Hawaiian Metrosideros and the closely related taxon M. collina from the Marquesas and Austral Islands. Several standard and nested AMOVAs were conducted to test whether the genetic diversity is structured geographically or taxonomically. Conclusions/Significance The results suggest that Hawaiian Metrosideros have dynamic gene flow, with genetic and morphological diversity structured not simply by geography or taxonomy, but as a result of parallel evolution on islands following rampant island-island dispersal, in addition to ancient chloroplast capture. Results also suggest that the current taxonomy requires major revisions in order to reflect the genetic structure revealed in the microsatellite data. PMID:19259272

  1. Genetic structure of the polymorphic Metrosideros (Myrtaceae) complex in the Hawaiian Islands using nuclear microsatellite data.

    Science.gov (United States)

    Harbaugh, Danica T; Wagner, Warren L; Percy, Diana M; James, Helen F; Fleischer, Robert C

    2009-01-01

    Five species of Metrosideros (Myrtaceae) are recognized in the Hawaiian Islands, including the widespread M. polymorpha, and are characterized by a multitude of distinctive, yet overlapping, habit, ecological, and morphological forms. It remains unclear, despite several previous studies, whether the morphological variation within Hawaiian Metrosideros is due to hybridization, genetic polymorphism, phenotypic plasticity, or some combination of these processes. The Hawaiian Metrosideros complex has become a model system to study ecology and evolution; however this is the first study to use microsatellite data for addressing inter-island patterns of variation from across the Hawaiian Islands. Ten nuclear microsatellite loci were genotyped from 143 individuals of Metrosideros. We took advantage of the bi-parental inheritance and rapid mutation rate of these data to examine the validity of the current taxonomy and to investigate whether Metrosideros plants from the same island are more genetically similar than plants that are morphologically similar. The Bayesian algorithm of the program structure was used to define genetic groups within Hawaiian Metrosideros and the closely related taxon M. collina from the Marquesas and Austral Islands. Several standard and nested AMOVAs were conducted to test whether the genetic diversity is structured geographically or taxonomically. The results suggest that Hawaiian Metrosideros have dynamic gene flow, with genetic and morphological diversity structured not simply by geography or taxonomy, but as a result of parallel evolution on islands following rampant island-island dispersal, in addition to ancient chloroplast capture. Results also suggest that the current taxonomy requires major revisions in order to reflect the genetic structure revealed in the microsatellite data.

  2. Genetic structure of the polymorphic Metrosideros (Myrtaceae) complex in the Hawaiian Islands using nuclear microsatellite data.

    Directory of Open Access Journals (Sweden)

    Danica T Harbaugh

    Full Text Available Five species of Metrosideros (Myrtaceae) are recognized in the Hawaiian Islands, including the widespread M. polymorpha, and are characterized by a multitude of distinctive, yet overlapping, habit, ecological, and morphological forms. It remains unclear, despite several previous studies, whether the morphological variation within Hawaiian Metrosideros is due to hybridization, genetic polymorphism, phenotypic plasticity, or some combination of these processes. The Hawaiian Metrosideros complex has become a model system to study ecology and evolution; however this is the first study to use microsatellite data for addressing inter-island patterns of variation from across the Hawaiian Islands. Ten nuclear microsatellite loci were genotyped from 143 individuals of Metrosideros. We took advantage of the bi-parental inheritance and rapid mutation rate of these data to examine the validity of the current taxonomy and to investigate whether Metrosideros plants from the same island are more genetically similar than plants that are morphologically similar. The Bayesian algorithm of the program structure was used to define genetic groups within Hawaiian Metrosideros and the closely related taxon M. collina from the Marquesas and Austral Islands. Several standard and nested AMOVAs were conducted to test whether the genetic diversity is structured geographically or taxonomically. The results suggest that Hawaiian Metrosideros have dynamic gene flow, with genetic and morphological diversity structured not simply by geography or taxonomy, but as a result of parallel evolution on islands following rampant island-island dispersal, in addition to ancient chloroplast capture. Results also suggest that the current taxonomy requires major revisions in order to reflect the genetic structure revealed in the microsatellite data.

  3. Next-Generation Sequencing Approaches in Genome-Wide Discovery of Single Nucleotide Polymorphism Markers Associated with Pungency and Disease Resistance in Pepper

    Directory of Open Access Journals (Sweden)

    Abinaya Manivannan

    2018-01-01

    Full Text Available Pepper is an economically important horticultural plant that is widely used in cuisines worldwide for its pungency and spicy taste, and it has been domesticated since antiquity. To meet the growing demand for high-quality pepper with good organoleptic properties, nutraceutical content, and disease tolerance, genomics-assisted breeding techniques can be used to develop novel pepper varieties with desired traits. The application of next-generation sequencing (NGS) approaches has transformed plant breeding technology, especially in the area of molecular marker-assisted breeding. The availability of genomic information aids a deeper understanding of the molecular mechanisms behind vital physiological processes. In addition, NGS methods facilitate the genome-wide discovery of DNA-based markers linked to key genes involved in important biological phenomena. Among molecular markers, the single nucleotide polymorphism (SNP) offers various benefits in comparison with other existing DNA-based markers. The present review concentrates on the impact of NGS approaches on the discovery of useful SNP markers associated with pungency and disease resistance in pepper. The information provided here can be used to improve pepper breeding in the future.

  4. A genome-wide association study for milk production traits in Danish Jersey cattle using a 50K single nucleotide polymorphism chip

    DEFF Research Database (Denmark)

    Mai, Duy Minh; Sahana, Goutam; Christiansen, Freddy

    2010-01-01

    Quantitative trait loci for milk production traits in Danish Jersey cattle were mapped by a genome-wide association analysis using a mixed model. The analysis incorporated 1,039 bulls and 33,090 SNP and resulted in 98 detected combinations of QTL and traits on 27 BTA. These QTL comprised 30...... for milk index, 50 for fat index, and 18 for protein index. The evidence presents 33 genome-wide QTL on 14 BTA. Of these, 7 had effects on milk index, 21 on fat index, and 5 on protein index. Among the genome-wide QTL, 26 have been previously reported, 2 on BTA4 and BTA5 were new for milk index, and 5......-like kinase 4. By a chromosome-wide threshold, 65 additional QTL were detected. Many of them are likely to represent QTL. The results are interesting from a breeding perspective and contribute to the search for the genes causing the polymorphisms important for milk production traits....

  5. Alignment-free comparative genomic screen for structured RNAs using coarse-grained secondary structure dot plots

    DEFF Research Database (Denmark)

    Kato, Yuki; Gorodkin, Jan; Havgaard, Jakob Hull

    2017-01-01

    . Methods: Here we present a fast and efficient method, DotcodeR, for detecting structurally similar RNAs in genomic sequences by comparing their corresponding coarse-grained secondary structure dot plots at string level. This allows us to perform an all-against-all scan of all window pairs from two genomes...... without alignment. Results: Our computational experiments with simulated data and real chromosomes demonstrate that the presented method has good sensitivity. Conclusions: DotcodeR can be useful as a pre-filter in a genomic comparative scan for structured RNAs....

  6. Defining the genome structure of 'Tongil' rice, an important cultivar in the Korean "Green Revolution".

    Science.gov (United States)

    Kim, Backki; Kim, Dong-Gwan; Lee, Gileung; Seo, Jeonghwan; Choi, Ik-Young; Choi, Beom-Soon; Yang, Tae-Jin; Kim, Kwang Soo; Lee, Joohyun; Chin, Joong Hyoun; Koh, Hee-Jong

    2014-12-01

    Tongil (IR667-98-1-2) rice, developed in 1972, is a high-yield rice variety derived from a three-way cross between indica and japonica varieties. Tongil contributed to the self-sufficiency of staple food production in Korea during a period known as the 'Korean Green Revolution'. We analyzed the nucleotide-level genome structure of Tongil rice and compared it to those of the parental varieties. A total of 17.3 billion Illumina Hiseq reads, 47× genome coverage, were generated for Tongil rice. Three parental accessions of Tongil rice, two indica types and one japonica type, were also sequenced at approximately 30x genome coverage. A total of 2,149,991 SNPs were detected between Tongil and Nipponbare varieties. The average SNP frequency of Tongil was 5.77 per kb. Genome composition was determined based on SNP data by comparing Tongil with three parental genome sequences using the sliding window approach. Analyses revealed that 91.8% of the Tongil genome originated from the indica parents and 7.9% from the japonica parent. Copy numbers of SSR motifs, ORF gene distribution throughout the whole genome, gene ontology (GO) annotation, and some yield-related QTLs or gene locations were also comparatively analyzed between Tongil and parental varieties using sequence-based tools. Each genetic factor was transferred from the parents into Tongil rice in amounts that were in proportion to the whole genome composition. Tongil was derived from a three-way cross among two indica and one japonica varieties. Defining the genome structure of Tongil rice demonstrates that the Tongil genome is derived primarily from the indica genome with a small proportion of japonica genome introgression. Comparative gene distribution, SSR, GO, and yield-related gene analysis support the finding that the Tongil genome is primarily made up of the indica genome.
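
    The sliding window approach mentioned in this abstract can be pictured with a small sketch. The Python below is an illustration under stated assumptions, not the authors' pipeline: the function name `window_origin`, the window and step sizes, and the toy SNP calls are all hypothetical. Each SNP is labelled by the parental type whose allele the cultivar carries, and each window is assigned to the label seen most often in it.

```python
from collections import Counter

def window_origin(snps, genome_len, window=100_000, step=50_000):
    """Assign each sliding window to the parental type matched by most SNPs.

    snps: list of (position, label) tuples; label is "indica" or "japonica".
    Returns a list of (start, end, label) tuples; label is None when the
    window contains no SNPs. Ties go to the label encountered first.
    """
    results = []
    start = 0
    while start < genome_len:
        end = min(start + window, genome_len)
        counts = Counter(label for pos, label in snps if start <= pos < end)
        label = counts.most_common(1)[0][0] if counts else None
        results.append((start, end, label))
        start += step
    return results

# Hypothetical SNP calls on a 200 kb toy sequence (not data from the study).
toy_snps = [(10_000, "indica"), (20_000, "indica"), (60_000, "japonica"),
            (120_000, "japonica"), (130_000, "japonica"), (140_000, "indica")]
calls = window_origin(toy_snps, genome_len=200_000)
for start, end, label in calls:
    print(start, end, label)
```

    Summing the lengths of windows per label would give a rough genome-composition estimate of the kind the abstract reports, although a real analysis would have to handle overlapping windows, missing calls, and three parental genomes.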

  7. Decoding the fine-scale structure of a breast cancer genome and transcriptome.

    Science.gov (United States)

    Volik, Stanislav; Raphael, Benjamin J; Huang, Guiqing; Stratton, Michael R; Bignel, Graham; Murnane, John; Brebner, John H; Bajsarowicz, Krystyna; Paris, Pamela L; Tao, Quanzhou; Kowbel, David; Lapuk, Anna; Shagin, Dmitri A; Shagina, Irina A; Gray, Joe W; Cheng, Jan-Fang; de Jong, Pieter J; Pevzner, Pavel; Collins, Colin

    2006-03-01

    A comprehensive understanding of cancer is predicated upon knowledge of the structure of malignant genomes underlying its many variant forms and the molecular mechanisms giving rise to them. It is well established that solid tumor genomes accumulate a large number of genome rearrangements during tumorigenesis. End Sequence Profiling (ESP) maps and clones genome breakpoints associated with all types of genome rearrangements elucidating the structural organization of tumor genomes. Here we extend the ESP methodology in several directions using the breast cancer cell line MCF-7. First, targeted ESP is applied to multiple amplified loci, revealing a complex process of rearrangement and co-amplification in these regions reminiscent of breakage/fusion/bridge cycles. Second, genome breakpoints identified by ESP are confirmed using a combination of DNA sequencing and PCR. Third, in vitro functional studies assign biological function to a rearranged tumor BAC clone, demonstrating that it encodes anti-apoptotic activity. Finally, ESP is extended to the transcriptome identifying four novel fusion transcripts and providing evidence that expression of fusion genes may be common in tumors. These results demonstrate the distinct advantages of ESP including: (1) the ability to detect all types of rearrangements and copy number changes; (2) straightforward integration of ESP data with the annotated genome sequence; (3) immortalization of the genome; (4) ability to generate tumor-specific reagents for in vitro and in vivo functional studies. Given these properties, ESP could play an important role in a tumor genome project.

  8. Crystal structure of a new monoclinic polymorph of N-(4-methylphenyl)-3-nitropyridin-2-amine

    Directory of Open Access Journals (Sweden)

    Aina Mardia Akhmad Aznan

    2014-08-01

    Full Text Available The title compound, C12H11N3O2, is a second monoclinic polymorph (P21, with Z′ = 4) of the previously reported monoclinic (P21/c, with Z′ = 2) form [Akhmad Aznan et al. (2010). Acta Cryst. E66, o2400]. Four independent molecules comprise the asymmetric unit, which have the common features of a syn disposition of the pyridine N atom and the toluene ring, and an intramolecular amine–nitro N—H...O hydrogen bond. The differences between molecules relate to the dihedral angles between the rings, which range from 2.92 (19) to 26.24 (19)°. The geometry-optimized structure [B3LYP level of theory and 6–311 g+(d,p) basis set] has the same features except that the entire molecule is planar. In the crystal, the three-dimensional architecture is consolidated by a combination of C—H...O, C—H...π, nitro-N—O...π and π–π interactions [inter-centroid distances = 3.649 (2)–3.916 (2) Å].

  9. Moving away from the reference genome: evaluating a peptide sequencing tagging approach for single amino acid polymorphism identifications in the genus Populus.

    Science.gov (United States)

    Abraham, Paul; Adams, Rachel M; Tuskan, Gerald A; Hettich, Robert L

    2013-08-02

    The genetic diversity across natural populations of the model organism, Populus, is extensive, containing a single nucleotide polymorphism roughly every 200 base pairs. When deviations from the reference genome occur in coding regions, they can impact protein sequences. Rather than relying on a static reference database to profile protein expression, we employed a peptide sequence tagging (PST) approach capable of decoding the plasticity of the Populus proteome. Using shotgun proteomics data from two genotypes of P. trichocarpa, a tag-based approach enabled the detection of 6653 unexpected sequence variants. Through manual validation, our study investigated how the most abundant chemical modification (methionine oxidation) could masquerade as a sequence variant (Ala→Ser) when few site-determining ions existed. In fact, precise localization of an oxidation site for peptides with more than one potential placement was indeterminate for 70% of the MS/MS spectra. We demonstrate that additional fragment ions made available by high energy collisional dissociation enhance the robustness of the peptide sequence tagging approach (81% of oxidation events could be exclusively localized to a methionine). We are confident that augmenting fragmentation processes for a PST approach will further improve the identification of single amino acid polymorphisms in Populus and potentially other species as well.
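
    Why methionine oxidation can masquerade as an Ala→Ser substitution becomes clear from the mass arithmetic: both changes add exactly one oxygen atom to the peptide. The short Python check below uses standard monoisotopic masses; the helper name `same_within_ppm` and the 1500 Da example peptide mass are illustrative assumptions, not values from the study.

```python
# Standard monoisotopic masses in daltons.
OXYGEN = 15.994915
ALA_RESIDUE = 71.037114   # C3H5NO
SER_RESIDUE = 87.032028   # C3H5NO2

# Methionine -> methionine sulfoxide adds one oxygen atom.
oxidation_shift = OXYGEN
# Replacing Ala with Ser also adds one oxygen atom to the residue.
ala_to_ser_shift = SER_RESIDUE - ALA_RESIDUE

def same_within_ppm(shift_a, shift_b, peptide_mass, ppm=10.0):
    """True if two mass shifts cannot be told apart at the given tolerance."""
    return abs(shift_a - shift_b) / peptide_mass * 1e6 <= ppm

# Hypothetical 1500 Da peptide: the two events are indistinguishable by mass.
indistinguishable = same_within_ppm(oxidation_shift, ala_to_ser_shift, 1500.0)
print(round(oxidation_shift, 4), round(ala_to_ser_shift, 4), indistinguishable)
```

    Because the precursor masses coincide, only fragment ions that bracket the modified residue can separate the two explanations, which is consistent with the abstract's emphasis on site-determining ions.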

  10. Genetic Diversity and Population Structure of Toona Ciliata Roem. Based on Sequence-Related Amplified Polymorphism (SRAP) Markers

    Directory of Open Access Journals (Sweden)

    Pei Li

    2015-04-01

    Full Text Available Sequence-related amplified polymorphism (SRAP) markers were used to investigate the genetic diversity among 30 populations of Toona ciliata Roem. sampled from the species’ distribution area in China. To analyze the polymorphism in the SRAP profiles, 1505 primer pairs were screened and 24 selected. A total of 656 SRAP bands ranging from 100 to 1500 bp were acquired; of these, 505 bands (77%) were polymorphic. The polymorphism information content (PIC) values ranged from 0.32 to 0.45, with an average of 0.41. An analysis of molecular variance (AMOVA) indicated that the most significant variation was attributable to differences among the populations and that variation within the populations was small. STRUCTURE analysis divided the 30 populations into two parts. The unweighted pair group method of arithmetic averages (UPGMA) clustering and principal coordinates analysis (PCoA) showed that the 30 populations could be classified into four types. The results demonstrate a clear geographical trend for T. ciliata in China and provide a theoretical basis for future breeding and conservation strategy of T. ciliata.
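
    The band statistics quoted in this abstract can be reproduced with a few lines of Python. The sketch is hedged: the abstract does not state which PIC formula was used, so the dominant-marker form PIC = 2f(1 − f), where f is the band frequency, is an assumption, and the function names are hypothetical.

```python
def percent_polymorphic(polymorphic_bands, total_bands):
    """Share of scored bands that are polymorphic, in percent."""
    return 100.0 * polymorphic_bands / total_bands

def pic_dominant(band_freq):
    """PIC for a dominant (presence/absence) marker, assuming PIC = 2f(1 - f)."""
    return 2.0 * band_freq * (1.0 - band_freq)

# Band counts taken from the abstract: 505 polymorphic out of 656 bands.
share = percent_polymorphic(505, 656)
print(round(share))

# Under the assumed formula PIC cannot exceed 0.5, reached at f = 0.5.
print(pic_dominant(0.5))
```

    Under this assumed formula the reported PIC range of 0.32 to 0.45 sits close to the theoretical maximum of 0.5 for a dominant marker.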

  11. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain.

    Science.gov (United States)

    Sükösd, Zsuzsanna; Andersen, Ebbe S; Seemann, Stefan E; Jensen, Mads Krogh; Hansen, Mathias; Gorodkin, Jan; Kjems, Jørgen

    2015-12-02

    A distance constrained secondary structural model of the ≈10 kb RNA genome of HIV-1 has been predicted but higher-order structures, involving long distance interactions, are currently unknown. We present the first global RNA secondary structure model for the HIV-1 genome, which integrates both comparative structure analysis and information from experimental data in a full-length prediction without distance constraints. Besides recovering known structural elements, we predict several novel structural elements that are conserved in HIV-1 evolution. Our results also indicate that the structure of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interestingly, a set of long distance interactions forms a core organizing structure (COS) that organizes the genome into three major structural domains. Despite overlapping protein-coding regions, the COS is supported by a particularly high frequency of compensatory base changes, suggesting functional importance for this element. This new structural element potentially organizes the whole genome into three major domains protruding from a conserved core structure with potential roles in replication and evolution for the virus. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Failure to lyse venous thrombi because of elevated plasminogen activator Inhibitor 1 (PAI-1) and 4G polymorphism of its promotor genome (The PAI-1/4G Syndrome).

    Science.gov (United States)

    Bern, Murray M; McCarthy, Nancy

    2010-10-01

    Plasminogen activator inhibitor 1 (PAI-1) inhibits plasminogen activators, leading to decreased fibrinolysis and increased risk of thromboembolic disease (TED). Shifts in the PAI-1 promoter genome from normal 5G>5G to 4G>5G or 4G>4G alleles are associated with overexpression of PAI-1. In this study, patients with residual venous thrombi were observed to have increased PAI-1 levels and more frequent shifts to 4G alleles. Of the 26 patients with unresolved thrombus, 20 (76.9%) had elevated PAI-1 values. 4G genomic shifts were found in 92.9% of patients studied. Normal PAI-1 levels were found in 5 patients with 4G polymorphisms. Thus, PAI-1 is often elevated among patients with residual thrombus, with an unexpectedly high prevalence of the 4G polymorphism of the promoter genome. Patients with persistent thrombus should be considered at risk of having constitutively increased PAI-1 due to genomic changes in the PAI-1 promoter genome. Hypotheses are proposed to explain those with normal PAI-1, despite having 4G polymorphisms.

  13. The Chloroplast Genome of Symplocarpus renifolius: A Comparison of Chloroplast Genome Structure in Araceae

    Science.gov (United States)

    Park, Kyu Tae

    2017-01-01

    Symplocarpus renifolius is a member of Araceae family that is extraordinarily diverse in appearance. Previous studies on chloroplast genomes in Araceae were focused on duckweeds (Lemnoideae) and root crops (Colocasia, commonly known as taro). Here, we determined the chloroplast genome of Symplocarpus renifolius and compared the factors, such as genes and inverted repeat (IR) junctions and performed phylogenetic analysis using other Araceae species. The chloroplast genome of S. renifolius is 158,521 bp and includes 113 genes. A comparison among the Araceae chloroplast genomes showed that infA in Lemna, Spirodela, Wolffiella, Wolffia, Dieffenbachia and Colocasia has been lost or has become a pseudogene and has only been retained in Symplocarpus. In the Araceae chloroplast DNA (cpDNA), psbZ is retained. However, psbZ duplication occurred in Wolffia species and tandem repeats were noted around the duplication regions. A comparison of the IR junction in Araceae species revealed the presence of ycf1 and rps15 in the small single copy region, whereas duckweed species contained ycf1 and rps15 in the IR region. The phylogenetic analyses of the chloroplast genomes revealed that Symplocarpus are a basal group and are sister to the other Araceae species. Consequently, infA deletion or pseudogene events in Araceae occurred after the divergence of Symplocarpus and aquatic plants (duckweeds) in Araceae and duplication events of rps15 and ycf1 occurred in the IR region. PMID:29144427

  14. The Chloroplast Genome of Symplocarpus renifolius: A Comparison of Chloroplast Genome Structure in Araceae.

    Science.gov (United States)

    Choi, Kyoung Su; Park, Kyu Tae; Park, SeonJoo

    2017-11-16

    Symplocarpus renifolius is a member of Araceae family that is extraordinarily diverse in appearance. Previous studies on chloroplast genomes in Araceae were focused on duckweeds (Lemnoideae) and root crops (Colocasia, commonly known as taro). Here, we determined the chloroplast genome of Symplocarpus renifolius and compared the factors, such as genes and inverted repeat (IR) junctions and performed phylogenetic analysis using other Araceae species. The chloroplast genome of S. renifolius is 158,521 bp and includes 113 genes. A comparison among the Araceae chloroplast genomes showed that infA in Lemna, Spirodela, Wolffiella, Wolffia, Dieffenbachia and Colocasia has been lost or has become a pseudogene and has only been retained in Symplocarpus. In the Araceae chloroplast DNA (cpDNA), psbZ is retained. However, psbZ duplication occurred in Wolffia species and tandem repeats were noted around the duplication regions. A comparison of the IR junction in Araceae species revealed the presence of ycf1 and rps15 in the small single copy region, whereas duckweed species contained ycf1 and rps15 in the IR region. The phylogenetic analyses of the chloroplast genomes revealed that Symplocarpus are a basal group and are sister to the other Araceae species. Consequently, infA deletion or pseudogene events in Araceae occurred after the divergence of Symplocarpus and aquatic plants (duckweeds) in Araceae and duplication events of rps15 and ycf1 occurred in the IR region.

  15. Exploration of NVE classical trajectories as a tool for molecular crystal structure prediction, with tests on ice polymorphs

    Science.gov (United States)

    Buch, V.; Martoňák, R.; Parrinello, M.

    2006-05-01

    Following an initial Communication [Buch et al., J. Chem. Phys. 123, 051108 (2005)], a new molecular-dynamics-based approach is explored to search for candidate crystal structures of molecular solids corresponding to minima of the enthalpy. The approach is based on the observation of phase transitions in an artificial periodic system with a small unit cell and relies on the existence of an optimal energy range for observing freezing to low-lying minima in the course of classical trajectories. Tests are carried out for O structures of nine H2O-ice polymorphs. NVE trajectories for a range of preimposed box shapes display freezing to the different crystal polymorphs whenever the box dimensions approximate roughly the appropriate unit cell; the exception is ice II for which freezing requires unit cell dimensions close to the correct ones. In an alternate version of the algorithm, an initial box shape is picked at random and subsequently readjusted at short trajectory intervals by enthalpy minimization. Tests reveal the existence of ice forms which are "difficult" and "easy" to locate in this way. The former include ice IV, which is also difficult to crystallize experimentally from the liquid, and ice II, which does not interface with the liquid in the phase diagram. On the other hand, the latter crystal search procedure located successfully the remaining seven ice polymorphs, including ice V, which corresponds to the most complicated structure of all ice phases, with a monoclinic cell of 28 molecules.

  16. Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh).

    Science.gov (United States)

    Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela

    2014-01-01

    High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.

  17. Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh.)

    Directory of Open Access Journals (Sweden)

    Luca Bianco

    Full Text Available High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.

  18. Local chromatin structure of heterochromatin regulates repeated DNA stability, nucleolus structure, and genome integrity

    Energy Technology Data Exchange (ETDEWEB)

    Peng, Jamy C. [Univ. of California, Berkeley, CA (United States)

    2007-01-01

    Heterochromatin constitutes a significant portion of the genome in higher eukaryotes; approximately 30% in Drosophila and human. Heterochromatin contains a high repeat DNA content and a low density of protein-encoding genes. In contrast, euchromatin is composed mostly of unique sequences and contains the majority of single-copy genes. Genetic and cytological studies demonstrated that heterochromatin exhibits regulatory roles in chromosome organization, centromere function and telomere protection. As an epigenetically regulated structure, heterochromatin formation is not defined by any DNA sequence consensus. Heterochromatin is characterized by its association with nucleosomes containing methylated-lysine 9 of histone H3 (H3K9me), heterochromatin protein 1 (HP1) that binds H3K9me, and Su(var)3-9, which methylates H3K9 and binds HP1. Heterochromatin formation and functions are influenced by HP1, Su(var)3-9, and the RNA interference (RNAi) pathway. My thesis project investigates how heterochromatin formation and function impact nuclear architecture, repeated DNA organization, and genome stability in Drosophila melanogaster. H3K9me-based chromatin reduces extrachromosomal DNA formation; most likely by restricting the access of repair machineries to repeated DNAs. Reducing extrachromosomal ribosomal DNA stabilizes rDNA repeats and the nucleolus structure. H3K9me-based chromatin also inhibits DNA damage in heterochromatin. Cells with compromised heterochromatin structure, due to Su(var)3-9 or dcr-2 (a component of the RNAi pathway) mutations, display severe DNA damage in heterochromatin compared to wild type. In these mutant cells, accumulated DNA damage leads to chromosomal defects such as translocations, defective DNA repair response, and activation of the G2-M DNA repair and mitotic checkpoints that ensure cellular and animal viability. My thesis research suggests that DNA replication, repair, and recombination mechanisms in heterochromatin differ from those in

  19. Dopamine beta-hydroxylase: two polymorphisms in linkage disequilibrium at the structural gene DBH associate with biochemical phenotypic variation.

    Science.gov (United States)

    Cubells, J F; van Kammen, D P; Kelley, M E; Anderson, G M; O'Connor, D T; Price, L H; Malison, R; Rao, P A; Kobayashi, K; Nagatsu, T; Gelernter, J

    1998-05-01

    Levels of the enzyme dopamine beta-hydroxylase (DbetaH) in the plasma and cerebrospinal fluid (CSF) are closely related biochemical phenotypes. Both are under strong genetic control. Linkage and association studies suggest the structural gene encoding DbetaH (locus name, DBH) is a major locus influencing plasma activity of DbetaH. This study examined relationships of DBH genotype determined at two polymorphic sites (a previously described GT repeat, referred to as the DBH STR and a single-base substitution at the 3' end of DBH exon 2, named DBH*444 g/a), to CSF levels of DbetaH protein in European-American schizophrenic patients, and to plasma DbetaH activity in European-American patients with mood or anxiety disorders. We also investigated linkage disequilibrium (LD) between the polymorphisms in the pooled samples from those European-American subjects (n=104). Alleles of DBH*444 g/a were associated with differences in mean values of CSF DbetaH levels. Alleles at both polymorphisms were associated with plasma DbetaH activity. Significant LD was observed between respective alleles with similar apparent influence on biochemical phenotype. Thus, allele A3 of the DBH STR was in positive LD with DBH*444a, and both alleles were associated with lower plasma DbetaH activity. DBH STR allele A4 was in positive LD with DBH*444 g, and both alleles were associated with higher plasma DbetaH activity. The results confirm that DBH is a major quantitative trait locus for plasma DbetaH activity, and provide the first direct evidence that DBH also influences CSF DbetaH levels. Both polymorphisms examined in this study appear to be in LD with one or more functional polymorphisms that mediate the influence of allelic variation at DBH on DbetaH biochemical phenotypic variation
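
    Linkage disequilibrium between two biallelic sites, as discussed in this abstract, is usually summarised by D, D′ and r². The Python sketch below shows the standard textbook definitions; the function name `ld_stats` and the example frequencies are hypothetical and are not taken from the study, and the abstract does not say which LD measure the authors reported.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2.
    p_a, p_b: frequencies of allele A and allele B (each strictly
              between 0 and 1).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# Hypothetical frequencies: the A-B haplotype is more common than expected
# under independence (0.5 * 0.5 = 0.25), so D is positive.
d, d_prime, r2 = ld_stats(p_ab=0.40, p_a=0.50, p_b=0.50)
print(round(d, 3), round(d_prime, 3), round(r2, 3))
```

    A positive D for a pair of alleles corresponds to what the abstract calls "positive LD", for example between allele A3 of the DBH STR and DBH*444a.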

  20. From structure prediction to genomic screens for novel non-coding RNAs.

    Science.gov (United States)

    Gorodkin, Jan; Hofacker, Ivo L

    2011-08-01

    Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.
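
    The idea of folding a single sequence using Watson-Crick and GU wobble pairs can be illustrated with the classic Nussinov base-pair maximisation recursion. This is a deliberately simplified sketch: it counts pairs instead of minimising free energy, so it is not the energy-directed folding the review refers to, and the function name and the `min_loop` parameter are illustrative assumptions.

```python
# Allowed pairs: Watson-Crick (A-U, G-C) plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (Nussinov-style dynamic programme).

    min_loop is the minimum number of unpaired bases enclosed by any pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                  # case: base j stays unpaired
            for k in range(i, j - min_loop):     # case: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

# Toy hairpin: G-C, G-C and a G-U wobble pair close a three-base loop.
print(max_base_pairs("GGGAAAUCC"))
```

    Comparative methods of the kind the abstract describes go further by rewarding structure-preserving (compensatory) base changes across aligned sequences, which this single-sequence sketch does not attempt.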

  1. Genomes

    National Research Council Canada - National Science Library

    Brown, T. A. (Terence A.)

    2002-01-01

    ... of genome expression and replication processes, and transcriptomics and proteomics. This text is richly illustrated with clear, easy-to-follow, full color diagrams, which are downloadable from the book's website...

  2. Genome-wide association study identifies single-nucleotide polymorphism in KCNB1 associated with left ventricular mass in humans: The HyperGEN Study

    Directory of Open Access Journals (Sweden)

    Kraemer Rachel

    2009-05-01

    Full Text Available Abstract Background: We conducted a genome-wide association study (GWAS) and validation study for left ventricular (LV) mass in the Family Blood Pressure Program – HyperGEN population. LV mass is a sensitive predictor of cardiovascular mortality and morbidity in all genders, races, and ages. Polymorphisms of candidate genes in diverse pathways have been associated with LV mass. However, subsequent studies have often failed to replicate these associations. Genome-wide association studies have unprecedented power to identify potential genes with modest effects on LV mass. We describe here a GWAS for LV mass in Caucasians using the Affymetrix GeneChip Human Mapping 100 k Set. Cases (N = 101) and controls (N = 101) were selected from extreme tails of the LV mass index distribution from 906 individuals in the HyperGEN study. Eleven of 12 promising (Q... Results: Despite the relatively small sample, we identified 12 promising SNPs in the GWAS. Eleven SNPs were successfully genotyped in the validation study of 704 Caucasians and 1467 African Americans; 5 SNPs on chromosomes 5, 12, and 20 were significantly (P ≤ 0.05) associated with LV mass after correction for multiple testing. One SNP (rs756529) is intragenic within KCNB1, which is dephosphorylated by calcineurin, a previously reported candidate gene for LV hypertrophy within this population. Conclusion: These findings suggest KCNB1 may be involved in the development of LV hypertrophy in humans.

  3. Large-scale trends in the evolution of gene structures within 11 animal genomes.

    Directory of Open Access Journals (Sweden)

    Mark Yandell

    2006-03-01

    Full Text Available We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans) together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales, from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for "Comparative Genomics Library"). Our results demonstrate that change in intron-exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.

  4. Structural constraints in the packaging of bluetongue virus genomic segments

    OpenAIRE

    Burkhardt, Christiane; Sung, Po-Yu; Celma, Cristina C.; Roy, Polly

    2014-01-01

    The mechanism used by bluetongue virus (BTV) to ensure the sorting and packaging of its 10 genomic segments is still poorly understood. In this study, we investigated the packaging constraints for two BTV genomic segments from two different serotypes. Segment 4 (S4) of BTV serotype 9 was mutated sequentially and packaging of mutant ssRNAs was investigated by two newly developed RNA packaging assay systems, one in vivo and the other in vitro. Modelling of the mutated ssRNA followed by bioche...

  5. Comparative genomics of 274 Vibrio cholerae genomes reveals mobile functions structuring three niche dimensions

    NARCIS (Netherlands)

    Dutilh, Bas E; Thompson, Cristiane C; Vicente, Ana C P; Marin, Michel A; Lee, Clarence; Silva, Genivaldo G Z; Schmieder, Robert; Andrade, Bruno G N; Chimetto, Luciane; Cuevas, Daniel; Garza, Daniel R; Okeke, Iruka N; Aboderin, Aaron Oladipo; Spangler, Jessica; Ross, Tristen; Dinsdale, Elizabeth A; Thompson, Fabiano L; Harkins, Timothy T; Edwards, Robert A

    2014-01-01

    BACKGROUND: Vibrio cholerae is a globally dispersed pathogen that has evolved with humans for centuries, but also includes non-pathogenic environmental strains. Here, we identify the genomic variability underlying this remarkable persistence across the three major niche dimensions space, time, and

  6. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    Science.gov (United States)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation to the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We used Fgenesh, GenScan and GeneMark for automated structural annotation, and the results were compared to a reference annotation.
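    As a rough illustration of how gene-finder output can be compared against a reference annotation, the sketch below scores exon-level sensitivity and precision using exact coordinate matches. The abstract does not say which metrics the authors used, so the metric choice, the function name `exon_level_accuracy`, and all coordinates are assumptions made for illustration only.

```python
def exon_level_accuracy(predicted, reference):
    """Return (sensitivity, precision) counting only exact (start, end) matches."""
    pred, ref = set(predicted), set(reference)
    true_pos = len(pred & ref)  # exons predicted with both boundaries correct
    sensitivity = true_pos / len(ref) if ref else 0.0
    precision = true_pos / len(pred) if pred else 0.0
    return sensitivity, precision


# Hypothetical exon coordinates on a single contig (not taken from the cucumber data set).
reference = [(100, 250), (400, 520), (700, 910)]
predictions = {
    "tool_A": [(100, 250), (400, 520), (950, 1000)],
    "tool_B": [(100, 250), (405, 520)],
}

for tool, exons in predictions.items():
    sn, pr = exon_level_accuracy(exons, reference)
    print(f"{tool}: sensitivity={sn:.2f} precision={pr:.2f}")
```

    Exact matching is the strictest choice; a comparison could instead score overlapping nucleotides, which would give partial credit to predictions such as the shifted exon reported by the hypothetical tool_B.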

  7. A map of single nucleotide polymorphisms of the date palm (Phoenix dactylifera) based on whole genome sequencing of 62 varieties

    Science.gov (United States)

    Date palm is one of the few crop species that thrive in arid environments and is the most significant fruit crop in the Middle East and North Africa, but lacks genomic resources that can accelerate breeding efforts. Here, we present the first comprehensive catalogue of ~12 million common single nuc...

  8. Prevalence of IFNL3 gene polymorphism among blood donors and its relation to genomic profile of ancestry in Brazil.

    Science.gov (United States)

    Rizzo, Silvia Renata Cornelio Parolin; Gazito, Diana; Pott-Junior, Henrique; Latini, Flavia Roche Moreira; Castelo, Adauto

    The recent development of interferon-free regimens based on direct-acting antivirals for the treatment of chronic hepatitis C virus infection has benefited many but not all patients. Some patients still experience treatment failure, possibly attributed to unknown host and viral factors, such as IFNL3 gene polymorphism. The present study assessed the prevalence of rs12979860-CC, rs12979860-CT, and rs12979860-TT genotypes of the IFNL3 gene, and its relationship with ancestry informative markers in 949 adult Brazilian healthy blood donors. Race was analyzed using ancestry informative markers as a surrogate for ancestry. IFNL3 gene was genotyped using the ABI TaqMan single nucleotide polymorphisms genotyping assays. The overall frequency of rs12979860-CC genotype was 36.9%. The contribution of African ancestry was significantly higher among donors from the northeast region in relation to southeast donors, whereas the influence of European ancestry was significantly higher in southeast donors. Donors with rs12979860-CC and rs12979860-CT genotypes had similar ancestry background. The contribution of African ancestry was higher among rs12979860-TT genotype donors in comparison to both rs12979860-CC and rs12979860-CT genotypes. The prevalence of rs12979860-CC genotype is similar to that found in the US, despite the Brazilian ancestry informative markers admixture. However, in terms of ancestry, rs12979860-CT genotype was much closer to rs12979860-CC individuals than to rs12979860-TT. Copyright © 2016 Sociedade Brasileira de Infectologia. Published by Elsevier Editora Ltda. All rights reserved.

  9. Prevalence of IFNL3 gene polymorphism among blood donors and its relation to genomic profile of ancestry in Brazil

    Directory of Open Access Journals (Sweden)

    Silvia Renata Cornelio Parolin Rizzo

    2016-11-01

    Full Text Available The recent development of interferon-free regimens based on direct-acting antivirals for the treatment of chronic hepatitis C virus infection has benefited many but not all patients. Some patients still experience treatment failure, possibly attributed to unknown host and viral factors, such as IFNL3 gene polymorphism. The present study assessed the prevalence of rs12979860-CC, rs12979860-CT, and rs12979860-TT genotypes of the IFNL3 gene, and its relationship with ancestry informative markers in 949 adult Brazilian healthy blood donors. Race was analyzed using ancestry informative markers as a surrogate for ancestry. IFNL3 gene was genotyped using the ABI TaqMan single nucleotide polymorphisms genotyping assays. The overall frequency of rs12979860-CC genotype was 36.9%. The contribution of African ancestry was significantly higher among donors from the northeast region in relation to southeast donors, whereas the influence of European ancestry was significantly higher in southeast donors. Donors with rs12979860-CC and rs12979860-CT genotypes had similar ancestry background. The contribution of African ancestry was higher among rs12979860-TT genotype donors in comparison to both rs12979860-CC and rs12979860-CT genotypes. The prevalence of rs12979860-CC genotype is similar to that found in the US, despite the Brazilian ancestry informative markers admixture. However, in terms of ancestry, rs12979860-CT genotype was much closer to rs12979860-CC individuals than to rs12979860-TT.

  10. Isolation and Characterization of 13 New Polymorphic Microsatellite Markers in the Phaseolus vulgaris L. (Common Bean) Genome

    Directory of Open Access Journals (Sweden)

    Aihua Wang

    2012-09-01

    Full Text Available In this study, 13 polymorphic microsatellite markers were isolated from Phaseolus vulgaris L. (common bean) by using the Fast Isolation by AFLP of Sequence COntaining Repeats (FIASCO) protocol. These markers revealed two to seven alleles, with an average of 3.64 alleles per locus. The polymorphic information content (PIC) values ranged from 0.055 to 0.721 over 13 loci, with a mean value of 0.492, and 7 loci had PIC greater than 0.5. The expected heterozygosity (HE) and observed heterozygosity (HO) levels ranged from 0.057 to 0.814 and from 0.026 to 0.531, respectively. Cross-species amplification of the 13 primer pairs was performed in the related species Vigna unguiculata L. Seven of these markers showed cross-species transferability. These markers will be useful for future genetic diversity and population genetics studies of this agricultural species and its related species.
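    The statistics quoted above can be computed from allele frequencies at a locus. The sketch below assumes the commonly used definitions, HE = 1 - sum(p_i^2) and the PIC formula of Botstein et al. (1980); the abstract does not state which formulas or software the authors used, and the allele frequencies are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """HE = 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphic_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical allele frequencies for one microsatellite locus (must sum to 1).
locus = [0.5, 0.3, 0.2]
print(f"HE  = {expected_heterozygosity(locus):.3f}")
print(f"PIC = {polymorphic_information_content(locus):.3f}")
```

    PIC is never larger than HE for the same locus, which is consistent with the ranges reported in the abstract.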

  11. Studying Cattle Genomic Structural Variations in the Green Economy Era

    Science.gov (United States)

    Transgenic cattle carrying multiple genomic modifications have been produced by serial rounds of somatic cell chromatin transfer (cloning) of sequentially genetically targeted somatic cells. However, cloning efficiency tends to decline with the increase of rounds of cloning. It is possible that mult...

  12. Determination of the crystal and magnetic structure of the DyCrO4-scheelite polymorph by neutron diffraction

    OpenAIRE

    Santos Garcia, Antonio Juan dos; Climent Pascual, Esteban; Rabie, Mahmoud Gamal; Romero de Paz, Julio; Gallardo Amores, Jose Manuel; Khalyavin, Dmitry; Saez Puche, Regino

    2014-01-01

    Neutron diffraction data of DyCrO4 oxide, prepared at 4 GPa and 833 K from the ambient-pressure zircon-type phase, reveal that it crystallizes with the scheelite-type structure, space group I41/a. Accompanying this pressure-induced structural phase transition, the magnetic properties change dramatically from ferromagnetism in the case of zircon to antiferromagnetism for the scheelite polymorph, with T_N = 19 K. The analysis of the neutron diffraction data obtained at 1.2 K has been used to d...

  13. Population Structure Analysis of Bull Genomes of European and Western Ancestry

    DEFF Research Database (Denmark)

    Chung, Neo Christopher; Szyda, Joanna; Frąszczak, Magdalena

    2017-01-01

    for individual-specific allele frequencies that directly capture a wide range of complex structure from genome-wide genotypes. As measured by magnitude of differentiation, selection pressure on SNPs within genes is substantially greater than that on intergenic regions. Additionally, broad regions of chromosome 6 harboring largest genetic differentiation suggest positive selection underlying population structure. We carried out gene set analysis using SNP annotations to identify enriched functional categories such as energy-related processes and multiple development stages. Our population structure analysis of bull genomes can support genetic management strategies that capture structural complexity and promote sustainable genetic breadth.

  14. Integrated genome-wide association, coexpression network, and expression single nucleotide polymorphism analysis identifies novel pathway in allergic rhinitis

    Science.gov (United States)

    2014-01-01

    Background Allergic rhinitis is a common disease whose genetic basis is incompletely explained. We report an integrated genomic analysis of allergic rhinitis. Methods We performed genome-wide association studies (GWAS) of allergic rhinitis in 5633 ethnically diverse North American subjects. Next, we profiled gene expression in disease-relevant tissue (peripheral blood CD4+ lymphocytes) collected from subjects who had been genotyped. We then integrated the GWAS and gene expression data using expression single nucleotide polymorphism (eSNP), coexpression network, and pathway approaches to identify the biologic relevance of our GWAS. Results GWAS revealed ethnicity-specific findings, with 4 genome-wide significant loci among Latinos and 1 genome-wide significant locus in the GWAS meta-analysis across ethnic groups. To identify biologic context for these results, we constructed a coexpression network to define modules of genes with similar patterns of CD4+ gene expression (coexpression modules) that could serve as constructs of broader gene expression. Six of the 22 GWAS loci with P-value ≤ 1 × 10^−6 tagged one particular coexpression module (4.0-fold enrichment, P-value 0.0029), and this module also had the greatest enrichment (3.4-fold enrichment, P-value 2.6 × 10^−24) for allergic rhinitis-associated eSNPs (genetic variants associated with both gene expression and allergic rhinitis). The integrated GWAS, coexpression network, and eSNP results therefore supported this coexpression module as an allergic rhinitis module. Pathway analysis revealed that the module was enriched for mitochondrial pathways (8.6-fold enrichment, P-value 4.5 × 10^−72). Conclusions Our results highlight mitochondrial pathways as a target for further investigation of allergic rhinitis mechanism and treatment. Our integrated approach can be applied to provide biologic context for GWAS of other diseases. PMID:25085501

  15. Porcine perilipin (PLIN) gene: Structure, polymorphism and association study in Large White Pigs

    Czech Academy of Sciences Publication Activity Database

    Vykoukalová, Z.; Knoll, Aleš; Čepica, Stanislav

    2009-01-01

    Vol. 54, No. 8 (2009), pp. 359-364 ISSN 1212-1819 Institutional research plan: CEZ:AV0Z50450515 Keywords: pigs * perilipin * polymorphism Subject RIV: GI - Animal Husbandry; Breeding Impact factor: 1.008, year: 2009

  16. Associations of POU1F1 gene polymorphisms and protein structure ...

    Indian Academy of Sciences (India)

    milk yield, litter size and body weight (Lan et al. 2007b, c). Further, in ovine species several polymorphisms have been identified in the recent years which showed no relationship with milk traits (Mura et al. 2012). There are few studies on POU1F1 in sheep breeds (Bastos et al. 2006; Mura et al. 2012) but no publications are ...

  17. Associations of POU1F1 gene polymorphisms and protein structure ...

    Indian Academy of Sciences (India)

    the DNA. Polymerase chain reaction (PCR), single-strand conformation polymorphism (SSCP) and sequence analyses were carried out to examine the exon 3 of POU1F1 to highlight possible SNPs. ... POU1F1 gene mutations by PCR-SSCP and DNA sequenc- ... of 4 μL PCR products were mixed with 12 μL denaturing.

  18. RNA 3D modules in genome-wide predictions of RNA 2D structure

    DEFF Research Database (Denmark)

    Theis, Corinna; Zirbel, Craig L; Zu Siederdissen, Christian Höner

    2015-01-01

    Recent experimental and computational progress has revealed a large potential for RNA structure in the genome. This has been driven by computational strategies that exploit multiple genomes of related organisms to identify common sequences and secondary structures. However, these computational approaches have two main challenges: they are computationally expensive and they have a relatively high false discovery rate (FDR). Simultaneously, RNA 3D structure analysis has revealed modules composed of non-canonical base pairs which occur in non-homologous positions, apparently by independent evolution. These modules can, for example, occur inside structural elements which in RNA 2D predictions appear as internal loops. Hence one question is if the use of such RNA 3D information can improve the prediction accuracy of RNA secondary structure at a genome-wide level. Here, we use RNAz in combination with 3D...

  19. Genomic diversity of Mycobacterium tuberculosis Beijing strains isolated in Tuscany, Italy, based on large sequence deletions, SNPs in putative DNA repair genes and MIRU-VNTR polymorphisms.

    Science.gov (United States)

    Garzelli, Carlo; Lari, Nicoletta; Rindi, Laura

    2016-03-01

    The Beijing genotype of Mycobacterium tuberculosis is a cause of global concern as it is rapidly spreading worldwide, is considered hypervirulent, and is most often associated with massive spread of MDR/XDR TB, although these epidemiological or pathological properties have not been confirmed for all strains and in all geographic settings. In this paper, to gain new insights into the biogeographical heterogeneity of the Beijing family, we investigated a global sample of Beijing strains (22% from Italian-born, 78% from foreign-born patients) by determining large sequence polymorphism of regions RD105, RD181, RD150 and RD142, single nucleotide polymorphism of putative DNA repair genes mutT4 and mutT2 and MIRU-VNTR profiles based on 11 discriminative loci. We found that, although our sample of Beijing strains showed a considerable genomic heterogeneity, yielding both ancient and recent phylogenetic strains, the prevalent successful Beijing subsets were characterized by deletions of RD105 and RD181 and by one nucleotide substitution in one or both mutT genes. MIRU-VNTR analysis revealed 47 unique patterns and 9 clusters including a total of 33 isolates (41% of total isolates); the relatively high proportion of Italian-born Beijing TB patients, often occurring in mixed clusters, supports the possibility of an ongoing cross-transmission of the Beijing genotype to the autochthonous population. High rates of extra-pulmonary localization and drug resistance, particularly MDR, frequently reported for Beijing strains in other settings, were not observed in our survey.

  20. Structural genomic variation as risk factor for idiopathic recurrent miscarriage

    DEFF Research Database (Denmark)

    Nagirnaja, Liina; Palta, Priit; Kasak, Laura

    2014-01-01

    Recurrent miscarriage (RM) is a multifactorial disorder with acknowledged genetic heritability that affects ∼3% of couples aiming at childbirth. As copy number variants (CNVs) have been shown to contribute to reproductive disease susceptibility, we aimed to describe genome-wide profile of CNVs and identify common rearrangements modulating risk to RM. Genome-wide screening of Estonian RM patients and fertile controls identified excessive cumulative burden of CNVs (5.4 and 6.1 Mb per genome) in two RM cases possibly increasing their individual disease risk. Functional profiling of all rearranged genes within RM study group revealed significant enrichment of loci related to innate immunity and immunoregulatory pathways essential for immune tolerance at fetomaternal interface. As a major finding, we report a multicopy duplication (61.6 kb) at 5p13.3 conferring increased maternal risk to RM in Estonia... similar low duplication prevalence worldwide (0.7%-1.2%) compared to RM cases of this study (6.6%-7.5%). The CNV disrupts PDZD2 and GOLPH3 genes predominantly expressed in placenta and it may represent a novel risk factor for pregnancy complications.

  1. Structure and genome organization of AFV2, a novel archaeal lipothrixvirus with unusual terminal and core structures

    DEFF Research Database (Denmark)

    Häring, Monika; Vestergaard, Gisle Alberg; Brügger, Kim

    2005-01-01

    A novel filamentous virus, AFV2, from the hyperthermophilic archaeal genus Acidianus shows structural similarity to lipothrixviruses but differs from them in its unusual terminal and core structures. The double-stranded DNA genome contains 31,787 bp and carries eight open reading frames homologous...

  2. A genome-wide association study identifies novel single nucleotide polymorphisms associated with dermal shank pigmentation in chickens.

    Science.gov (United States)

    Li, Guangqi; Li, Dongfeng; Yang, Ning; Qu, Lujiang; Hou, Zhuocheng; Zheng, Jiangxia; Xu, Guiyun; Chen, Sirui

    2014-12-01

    Shank color of domestic chickens varies from black to blue, green, yellow, or white, which is controlled by the combination of melanin and xanthophylls in dermis and epidermis. Dermal shank pigmentation of chickens is determined by sex-linked inhibitor of dermal melanin (Id), which is located on the distal end of the long arm of Z chromosome, through controlling dermal melanin pigmentation. Although previous studies have focused on the identification of Id and the linear relationship with barring and recessive white skin, no causal mutations have yet been identified in relation to the mutant dermal pigment inhibiting allele at the Id locus. In this study, we first used the 600K Affymetrix Axiom HD genotyping array, which includes ~580,961 SNP, of which 26,642 SNP are on the Z chromosome, to perform a genome-wide association study on pure lines of 19 Tibetan hens with dermal pigmentation shank and 21 Tibetan hens with yellow shank to refine the Id location. Association analysis was conducted by the PLINK software using the standard chi-squared test, and then Bonferroni correction was used to adjust for multiple testing. The genome-wide study revealed that 3 SNP located at 78.5 to 79.2 Mb on the Z chromosome in the current assembly of chicken genome (galGal4) were significantly associated with dermal shank pigmentation of chickens, but none of them were located in known genes. The interval we refined partly overlapped with previous results, suggesting that the Id gene is in or near our refined genome region. However, the genomic context of this region was complex. There were only 15 SNP markers developed by the genotyping array within the interval region, of which only 1 SNP marker passed quality control. Additionally, there were about 5.8-Mb gaps on both sides of the refined interval. Follow-up replication studies may be needed to further confirm the functional significance of these newly identified SNP.
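    The association test named in the abstract (a chi-squared test followed by Bonferroni correction) can be sketched for a single SNP as below. This is a minimal standard-library illustration, not PLINK's implementation; the allele counts and the number of tests are hypothetical, since the abstract reports neither per-SNP counts nor the number of SNP retained after quality control.

```python
import math


def chi2_allelic_test(a, b, c, d):
    """Pearson chi-squared test (1 df, no continuity correction) on a 2x2 table.

    a, b: counts of allele 1 and allele 2 in the first group
    c, d: counts of allele 1 and allele 2 in the second group
    Returns (chi2, p_value).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # a row or column is empty: no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts for one SNP in two phenotype groups.
chi2, p = chi2_allelic_test(30, 8, 10, 32)

# Bonferroni: divide the family-wise alpha by the number of tests performed.
n_tests = 100_000  # assumed value for illustration only
threshold = 0.05 / n_tests

print(f"chi2={chi2:.2f} p={p:.2e} threshold={threshold:.1e} significant={p < threshold}")
```

    With few markers passing quality control in a region, as the abstract notes for the refined interval, a strict genome-wide threshold of this kind leaves little room to localize the signal further.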

  3. Nuclear species-diagnostic SNP markers mined from 454 amplicon sequencing reveal admixture genomic structure of modern citrus varieties.

    Directory of Open Access Journals (Sweden)

    Franck Curk

    Full Text Available Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species

  4. G2S: A web-service for annotating genomic variants on 3D protein structures.

    Science.gov (United States)

    Wang, Juexin; Sheridan, Robert; Sumer, S Onur; Schultz, Nikolaus; Xu, Dong; Gao, Jianjiong

    2018-01-27

    Accurately mapping and annotating genomic locations on 3D protein structures is a key step in structure-based analysis of genomic variants detected by recent large-scale sequencing efforts. There are several mapping resources currently available, but none of them provides a web API (Application Programming Interface) that supports programmatic access. We present G2S, a real-time web API that provides automated mapping of genomic variants on 3D protein structures. G2S can align genomic locations of variants, protein locations, or protein sequences to protein structures and retrieve the mapped residues from structures. The G2S API uses a REST-inspired design and can be used by various clients such as web browsers, command terminals, programming languages and other bioinformatics tools for bringing 3D structures into genomic variant analysis. The webserver and source code are freely available at https://g2s.genomenexus.org.

  5. Three-dimensional Structure of a Viral Genome-delivery Portal Vertex

    Energy Technology Data Exchange (ETDEWEB)

    A Olia; P Prevelige Jr.; J Johnson; G Cingolani

    2011-12-31

    DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Here, we report two snapshots of the dodecameric portal protein of bacteriophage P22. The 3.25-Å-resolution structure of the portal-protein core bound to 12 copies of gene product 4 (gp4) reveals a ~1.1-MDa assembly formed by 24 proteins. Unexpectedly, a lower-resolution structure of the full-length portal protein unveils the unique topology of the C-terminal domain, which forms a ~200-Å-long α-helical barrel. This domain inserts deeply into the virion and is highly conserved in the Podoviridae family. We propose that the barrel domain facilitates genome spooling onto the interior surface of the capsid during genome packaging and, in analogy to a rifle barrel, increases the accuracy of genome ejection into the host cell.

  6. Accuracy of genomic prediction using an evenly spaced, low-density single nucleotide polymorphism panel in broiler chickens.

    Science.gov (United States)

    Wang, C; Habier, D; Peiris, B L; Wolc, A; Kranis, A; Watson, K A; Avendano, S; Garrick, D J; Fernando, R L; Lamont, S J; Dekkers, J C M

    2013-07-01

    One approach for cost-effective implementation of genomic selection is to genotype training individuals with a high-density (HD) panel and selection candidates with an evenly spaced, low-density (ELD) panel. The purpose of this study was to evaluate the extent to which the ELD approach reduces the accuracy of genomic estimated breeding values (GEBV) in a broiler line, in which 1,091 breeders from 3 generations were used for training and 160 progeny of the third generation for validation. All birds were genotyped with an Illumina Infinium platform HD panel that included 20,541 segregating markers. Two subsets of HD markers, with 377 (ELD-1) or 766 (ELD-2) markers, were selected as ELD panels. The ELD-1 panel was genotyped using KBiosciences KASPar SNP genotyping chemistry, whereas the ELD-2 panel was simulated by adding markers from the HD panel to the ELD-1 panel. The training data set was used for 2 traits: BW at 35 d on both sexes and hen house production (HHP) between wk 28 and 54. Methods Bayes-A, -B, -C and genomic best linear unbiased prediction were used to estimate HD-marker effects. Two scenarios were used: (1) the 160 progeny were ELD-genotyped, and (2) the 160 progeny and their dams (117 birds) were ELD-genotyped. The missing HD genotypes in ELD-genotyped birds were imputed by a Gibbs sampler, capitalizing on linkage within families. In scenario (1), the correlation of GEBV for BW (HHP) of the 160 progeny based on observed HD versus imputed genotypes was greater than 0.94 (0.98) with the ELD-1 panel and greater than 0.97 (0.99) with the ELD-2 panel. In scenario (2), the correlation of GEBV for BW (HHP) was greater than 0.92 (0.96) with the ELD-1 panel and greater than 0.95 (0.98) with the ELD-2 panel. Hence, in a pedigreed population, genomic selection can be implemented by genotyping selection candidates with about 400 ELD markers with less than 6% loss in accuracy. This leads to substantial savings in genotyping costs, with little sacrifice in accuracy.
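    The comparison reported above rests on the correlation between GEBV computed from observed HD genotypes and GEBV computed from imputed genotypes for the same birds. The sketch below shows that calculation as a plain Pearson correlation; the function name and the GEBV values are hypothetical and are not taken from the study.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical GEBV for five validation birds (arbitrary units).
gebv_observed_hd = [12.1, -3.4, 5.0, 0.7, -8.2]
gebv_imputed = [11.5, -2.9, 4.1, 1.6, -7.8]

r = pearson_correlation(gebv_observed_hd, gebv_imputed)
print(f"correlation between observed-HD and imputed GEBV: {r:.3f}")
```

    A correlation close to 1 means that ranking selection candidates on imputed genotypes would select nearly the same birds as ranking them on full HD genotypes.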

  7. Structures of mono-unsaturated triacylglycerols. V. The β'1-2, β'-3 and β2-3 polymorphs of 1,3-dilauroyl-2-oleoylglycerol (LaOLa) from synchrotron and laboratory powder diffraction data

    NARCIS (Netherlands)

    van Mechelen, J.B.; Goubitz, K.; Pop, M.; Peschar, R.; Schenk, H.

    2008-01-01

    The crystal structures of the β'1-2, the β'-3 and the β2-3 polymorphs of 1,3-dilauroyl-2-oleoylglycerol have been solved from powder diffraction data. The packing of the β2-3 polymorph is similar to that of other cis mono-unsaturated triacylglycerols. Both the β' polymorphs are crystallized in a

  8. The Role of Genetic Polymorphisms as Related to One-Carbon Metabolism, Vitamin B6, and Gene–Nutrient Interactions in Maintaining Genomic Stability and Cell Viability in Chinese Breast Cancer Patients

    Directory of Open Access Journals (Sweden)

    Xiayu Wu

    2016-06-01

    Full Text Available Folate-mediated one-carbon metabolism (FMOCM) is linked to DNA synthesis, methylation, and cell proliferation. Vitamin B6 (B6) is a cofactor, and genetic polymorphisms of related key enzymes, such as serine hydroxymethyltransferase (SHMT), methionine synthase reductase (MTRR), and methionine synthase (MS), in FMOCM may govern the bioavailability of metabolites and play important roles in the maintenance of genomic stability and cell viability (GSACV). To evaluate the influences of B6, genetic polymorphisms of these enzymes, and gene–nutrient interactions on GSACV, we utilized the cytokinesis-block micronucleus assay (CBMN) and PCR-restriction fragment length polymorphism (PCR-RFLP) techniques in the lymphocytes from female breast cancer cases and controls. GSACV showed a significantly positive correlation with B6 concentration, and 48 nmol/L of B6 was the most suitable concentration for maintaining GSACV in vitro. The GSACV indexes showed significantly different sensitivity to B6 deficiency between cases and controls; the B6 effect on the GSACV variance contribution of each index was significantly higher than that of genetic polymorphisms and the sample state (tumor state). SHMT C1420T mutations may reduce breast cancer susceptibility, whereas MTRR A66G and MS A2756G mutations may increase breast cancer susceptibility. The role of SHMT, MS, and MTRR genotype polymorphisms in GSACV is reduced compared with that of B6. The results appear to suggest that the long-term lack of B6 under these conditions may increase genetic damage and cell injury and that individuals with various genotypes have different sensitivities to B6 deficiency. FMOCM metabolic enzyme gene polymorphism may be related to breast cancer susceptibility to a certain extent due to the effect of other factors such as stress, hormones, cancer therapies, psychological conditions, and diet. Adequate B6 intake may be good for maintaining genome health and preventing breast cancer.

  9. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes

    DEFF Research Database (Denmark)

    Parker, Brian John; Moltke, Ida; Roth, Adam

    2011-01-01

    a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein... identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we...

  10. Matrix attachment regions and structural colinearity in the genomes of two grass species.

    OpenAIRE

    Avramova, Z; Tikhonov, A; Chen, M; Bennetzen, J L

    1998-01-01

    In order to gain insights into the relationship between spatial organization of the genome and genome function, we have initiated studies of the co-linear Sh2/A1-homologous regions of rice (30 kb) and sorghum (50 kb). We have identified the locations of matrix attachment regions (MARs) in these homologous chromosome segments, which could serve as anchors for individual structural units or loops. Despite the fact that the nucleotide sequences serving as MARs were not detectably conserved, the ...

  11. Structured RNAs in the ENCODE selected regions of the human genome

    DEFF Research Database (Denmark)

    Washietl, Stefan; Pedersen, Jakob Skou; Korbel, Jan O

    2007-01-01

    Functional RNA structures play an important role both in the context of noncoding RNA transcripts as well as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack...... with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70% many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz...

  12. Development of cleaved amplified polymorphic sequence markers and a CAPS-based genetic linkage map in watermelon (Citrullus lanatus [Thunb.] Matsum. and Nakai) constructed using whole-genome re-sequencing data.

    Science.gov (United States)

    Liu, Shi; Gao, Peng; Zhu, Qianglong; Luan, Feishi; Davis, Angela R; Wang, Xiaolu

    2016-03-01

    Cleaved amplified polymorphic sequence (CAPS) markers are useful tools for detecting single nucleotide polymorphisms (SNPs). This study detected and converted SNP sites into CAPS markers based on high-throughput re-sequencing data in watermelon, for linkage map construction and quantitative trait locus (QTL) analysis. Two inbred lines, Cream of Saskatchewan (COS) and LSW-177, were re-sequenced and analyzed with a custom Perl script for CAPS marker development. 88.7% and 78.5% of the assembled sequences of the two parental materials could map to the reference watermelon genome, respectively. Comparative assembled genome data analysis provided 225,693 and 19,268 SNPs and indels between the two materials. 532 pairs of CAPS markers were designed with 16 restriction enzymes, among which 271 pairs of primers gave distinct bands of the expected length and polymorphic bands, via PCR and enzyme digestion, with a polymorphic rate of 50.94%. Using the new CAPS markers, an initial CAPS-based genetic linkage map was constructed with the F2 population, spanning 1836.51 cM with 11 linkage groups and 301 markers. 12 QTLs were detected related to fruit flesh color, length, width, shape index, and brix content. These newly developed CAPS markers will be a valuable resource for breeding programs and genetic studies of watermelon.
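
    The idea behind converting a SNP into a CAPS marker can be illustrated with a minimal Python sketch: a SNP is a candidate when one allele contains a restriction enzyme recognition site and the other does not. This is not the authors' Perl script. The function names, the four-enzyme table, and the example flanking sequences are assumptions chosen for illustration (the study used 16 enzymes that the abstract does not list). The last line only re-derives the reported polymorphic rate from the two counts given in the abstract.

```python
# Illustrative sketch only: names and the enzyme table are assumptions,
# not taken from the study. All four recognition sites are palindromic,
# so scanning one strand is sufficient.
ENZYMES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "AluI": "AGCT",
    "MboI": "GATC",
}


def count_sites(seq, site):
    """Count (possibly overlapping) occurrences of a recognition site."""
    width = len(site)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == site)


def caps_candidates(flank5, ref, alt, flank3, enzymes=ENZYMES):
    """Return (enzyme, sites_in_ref, sites_in_alt) for enzymes whose site
    count differs between the two alleles, i.e. CAPS marker candidates."""
    ref_seq = (flank5 + ref + flank3).upper()
    alt_seq = (flank5 + alt + flank3).upper()
    hits = []
    for name, site in enzymes.items():
        n_ref = count_sites(ref_seq, site)
        n_alt = count_sites(alt_seq, site)
        if n_ref != n_alt:
            hits.append((name, n_ref, n_alt))
    return hits


# Hypothetical SNP (A/G) that destroys an EcoRI site in the alternate allele.
print(caps_candidates("ACGTGA", "A", "G", "TTCGGA"))

# Polymorphic rate from the counts reported in the abstract: 271 of 532 pairs.
print(round(100 * 271 / 532, 2))
```

    In practice a candidate would still need primer design and a digestion check on a gel, as the abstract describes; the sketch covers only the sequence-level screening step.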

  13. Genome-wide identification of structural variants in genes encoding drug targets

    DEFF Research Database (Denmark)

    Rasmussen, Henrik Berg; Dahmcke, Christina Mackeprang

    2012-01-01

    The objective of the present study was to identify structural variants of drug target-encoding genes on a genome-wide scale. We also aimed at identifying drugs that are potentially amenable for individualization of treatments based on knowledge about structural variation in the genes encoding...

  14. Effect of splice-site polymorphisms of the TMPRSS4, NPHP4 and ...

    Indian Academy of Sciences (India)

    Unknown

    structural changes in mRNA transcripts as a result of splice-site polymorphisms implies that they may be of biological significance in certain pathological conditions. ..... show the genomic structures of the normal (diagram “a”) and abnormal (diagram “b” and “c”) splicing forms. Inserted and deleted sequences are indicated ...

  15. Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

    Directory of Open Access Journals (Sweden)

    Jiang Du

    2009-07-01

    Full Text Available The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mappability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of

  16. A forward-backward fragment assembling algorithm for the identification of genomic amplification and deletion breakpoints using high-density single nucleotide polymorphism (SNP) array

    Directory of Open Access Journals (Sweden)

    Bailey Dione K

    2007-05-01

    Full Text Available Abstract Background DNA copy number aberration (CNA) is one of the key characteristics of cancer cells. Recent studies demonstrated the feasibility of utilizing high density single nucleotide polymorphism (SNP) genotyping arrays to detect CNA. Compared with the two-color array-based comparative genomic hybridization (array-CGH), the SNP arrays offer much higher probe density and lower signal-to-noise ratio at the single SNP level. To accurately identify small segments of CNA from SNP array data, segmentation methods that are sensitive to CNA while resistant to noise are required. Results We have developed a highly sensitive algorithm for the edge detection of copy number data which is especially suitable for the SNP array-based copy number data. The method consists of an over-sensitive edge-detection step and a test-based forward-backward edge selection step. Conclusion Using simulations constructed from real experimental data, the method shows high sensitivity and specificity in detecting small copy number changes in focused regions. The method is implemented in an R package FASeg, which includes data processing and visualization utilities, as well as libraries for processing Affymetrix SNP array data.

  17. Association Study of Three Gene Polymorphisms Recently Identified by a Genome-Wide Association Study with Obesity-Related Phenotypes in Chinese Children.

    Science.gov (United States)

    Song, Qi-Ying; Song, Jie-Yun; Wang, Yang; Wang, Shuo; Yang, Yi-De; Meng, Xiang-Rui; Ma, Jun; Wang, Hai-Jun; Wang, Yan

    2017-01-01

    This study aimed to examine associations of three single-nucleotide polymorphisms (SNPs) with obesity-related phenotypes in Chinese children. These SNPs were identified by a recent genome-wide association (GWA) study among European children. Given that genetic backgrounds vary across ethnicities and may result in different associations, it is necessary to study these associations in a different ethnic population. A total of 3,922 children, including 2,191 normal-weight, 873 overweight and 858 obese children, from three independent studies were included in the study. Logistic and linear regressions were performed, and meta-analyses were conducted to assess the associations between the SNPs and obesity-related phenotypes. The pooled odds ratios of the A-allele of rs564343 in PACS1 for obesity and severe obesity were 1.180 (p = 0.03) and 1.312 (p = 0.004), respectively. We also found that rs564343 was nominally associated with BMI, BMI standard deviation score (BMI-SDS), waist circumference, and waist-to-height ratio (p obesity in a non-European population. This SNP was also found to be associated with common obesity and various obesity-related phenotypes in Chinese children, which had not been reported in the original study. The results demonstrated the value of conducting genetic research in populations of different ethnicities. © 2017 The Author(s) Published by S. Karger GmbH, Freiburg.

  18. Expanding the structural landscape of niclosamide: a high Z ' polymorph, two new solvates and monohydrate HA

    DEFF Research Database (Denmark)

    Sovago, Ioana; Bond, Andrew D.

    2015-01-01

    to be twinned by twofold rotation around that axis. The acetonitrile molecules occupy channels in the structure. A complete structure is provided for niclosamide monohydrate, C13H8Cl2N2O4·H2O, polymorph HA, obtained by Rietveld refinement against laboratory powder X-ray diffraction data. It has been suggested...... that this compound is related to the methanol solvate of niclosamide [Harriss, Wilson & Radosevljevic Evans (2014). Acta Cryst. C70, 758-763], but it is found that the two are not fully isostructural: they contain isostructural two-dimensional layers, but the layers are arranged differently in the two structures....... This suggests that HA may have the potential for polytypism, and features in the Rietveld difference curve indicate that a polytype fully isostructural with the methanol solvate might be present....

  19. From structure prediction to genomic screens for novel non-coding RNAs.

    Directory of Open Access Journals (Sweden)

    Jan Gorodkin

    2011-08-01

    Full Text Available Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.
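
    As a rough illustration of single-sequence folding with Watson-Crick and GU wobble pairs, the following Python sketch uses Nussinov-style base-pair maximization. This is a deliberate simplification and an assumption on our part: it counts pairs rather than minimizing free energy, so it is not the energy-directed folding the abstract refers to, and it is not taken from the article. The function name and the minimum hairpin loop length of 3 are illustrative choices.

```python
# Nussinov-style base-pair maximization (illustrative sketch, not an
# energy model). Allowed pairs: Watson-Crick plus the GU wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in seq, requiring
    at least min_loop unpaired bases inside every hairpin loop."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving a loop of >= min_loop
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


# A small hairpin: two G-C pairs and one G-U wobble pair around an AAA loop.
print(max_base_pairs("GGGAAAUCC"))
```

    The cubic running time of this recursion is one reason genome-wide screens typically restrict folding to windows or to candidate regions found by comparative analysis.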

  20. Terminal restriction fragment length polymorphism analysis of ribosomal RNA genes to assess changes in fungal community structure in soils.

    Science.gov (United States)

    Edel-Hermann, Véronique; Dreumont, Christiane; Pérez-Piqueres, Ana; Steinberg, Christian

    2004-03-01

    Monitoring the structure and dynamics of fungal communities in soils under agricultural and environmental disturbances is currently a challenge. In this study, a terminal restriction fragment length polymorphism (T-RFLP) fingerprinting method was developed for the rapid comparison of fungal community structures. The terminal restriction fragment polymorphism of different regions of the small-subunit (SSU) ribosomal RNA (rRNA) gene was simulated by sequence comparison using 10 restriction enzymes, and analyzed among three different soils using fungal-specific primers. Polymerase chain reaction amplification of the 3' end of the SSU rRNA gene with the primer nu-SSU-0817-5' and with the fluorescently labelled primer nu-SSU-1536-3', and digestion of the amplicons with AluI and MboI were found to be optimal and were used in a standardized T-RFLP procedure. Both the number and the intensity of terminal restriction fragments detected by capillary gel electrophoresis were integrated in correspondence analyses. Three soils with contrasting physicochemical properties were differentiated according to the structure of their fungal communities. Assessment of the impact on the fungal community structure of the amendment of two soils with compost or manure confirmed the reproducibility and the sensitivity of the method. Shifts in the community structure were detected between non-amended and amended soil samples. In both soils, the shift differed with the organic amendment applied. In addition, the fungal community structures of the two soils were affected in a different way by the same organic amendment. The fingerprinting method provides a rapid tool to investigate the effect of various perturbations on the fungal communities in soils.
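
    The in silico step of such a procedure, predicting the terminal restriction fragment length for a given sequence and enzyme, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' simulation: the function name is hypothetical, the input is assumed to be the labelled strand written 5' to 3' starting at the fluorescent primer (for a labelled reverse primer one would pass the reverse-complement strand), and only the two enzymes named in the abstract are included, with their standard cut positions (AluI cuts AG^CT, MboI cuts ^GATC).

```python
# Enzyme -> (recognition site, cut offset from the start of the site).
# AluI cuts AG^CT (offset 2); MboI cuts ^GATC (offset 0).
CUT_SITES = {"AluI": ("AGCT", 2), "MboI": ("GATC", 0)}


def terminal_fragment_length(labelled_strand, enzyme):
    """Length of the fragment that keeps the 5' label after digestion.

    Assumes labelled_strand is written 5'->3' beginning at the labelled
    primer. If the enzyme does not cut, the full amplicon length is returned.
    """
    site, offset = CUT_SITES[enzyme]
    seq = labelled_strand.upper()
    pos = seq.find(site)  # first site from the labelled end
    if pos == -1:
        return len(seq)
    return pos + offset


# Hypothetical 17-nt sequence used only to show the arithmetic.
toy = "TTTTAGCTGGGGATCAA"
print(terminal_fragment_length(toy, "AluI"))  # first AGCT starts at index 4
print(terminal_fragment_length(toy, "MboI"))  # first GATC starts at index 11
```

    Applying such a function to each sequence in a reference set gives the predicted fragment lengths that can then be compared with peaks observed by capillary electrophoresis.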

  1. Protein Production for Structural Genomics Using E. coli Expression

    OpenAIRE

    Makowska-Grzyska, Magdalena; Kim, Youngchang; Maltseva, Natalia; Li, Hui; Zhou, Min; Joachimiak, Grazyna; Babnigg, Gyorgy; Joachimiak, Andrzej

    2014-01-01

    The goal of structural biology is to reveal details of the molecular structure of proteins in order to understand their function and mechanism. X-ray crystallography and NMR are the two best methods for atomic level structure determination. However, these methods require milligram quantities of proteins. In this chapter a reproducible methodology for large-scale protein production applicable to a diverse set of proteins is described. The approach is based on protein expression in E. coli as a...

  2. Exploring the role of genome and structural ions in preventing viral capsid collapse during dehydration

    Science.gov (United States)

    Martín-González, Natalia; Guérin Darvas, Sofía M.; Durana, Aritz; Marti, Gerardo A.; Guérin, Diego M. A.; de Pablo, Pedro J.

    2018-03-01

    Even though viruses evolve mainly in liquid milieu, their horizontal transmission routes often include episodes of dry environment. Along their life cycle, some insect viruses, such as viruses from the Dicistroviridae family, withstand dehydrated conditions with presently unknown consequences to their structural stability. Here, we use atomic force microscopy to monitor the structural changes of viral particles of Triatoma virus (TrV) after desiccation. Our results demonstrate that TrV capsids preserve their genome inside, conserving their height after exposure to dehydrating conditions, which is in stark contrast with other viruses that expel their genome when desiccated. Moreover, empty capsids (without genome) resulted in collapsed particles after desiccation. We also explored the role of structural ions in the dehydration process of the virions (capsid containing genome) by chelating the accessible cations from the external solvent milieu. We observed that ion suppression helps to keep the virus height upon desiccation. Our results show that under drying conditions, the genome of TrV prevents the capsid from collapsing during dehydration, while the structural ions are responsible for promoting solvent exchange through the virion wall.

  3. Genome-Wide Association Study to Identify Single Nucleotide Polymorphisms (SNPs) Associated With the Development of Erectile Dysfunction in African-American Men After Radiotherapy for Prostate Cancer

    International Nuclear Information System (INIS)

    Kerns, Sarah L.; Ostrer, Harry; Stock, Richard; Li, William; Moore, Julian; Pearlman, Alexander; Campbell, Christopher; Shao Yongzhao; Stone, Nelson; Kusnetz, Lynda; Rosenstein, Barry S.

    2010-01-01

    Purpose: To identify single nucleotide polymorphisms (SNPs) associated with erectile dysfunction (ED) among African-American prostate cancer patients treated with external beam radiation therapy. Methods and Materials: A cohort of African-American prostate cancer patients treated with external beam radiation therapy was observed for the development of ED by use of the five-item Sexual Health Inventory for Men (SHIM) questionnaire. Final analysis included 27 cases (post-treatment SHIM score ≤7) and 52 control subjects (post-treatment SHIM score ≥16). A genome-wide association study was performed using approximately 909,000 SNPs genotyped on Affymetrix 6.0 arrays (Affymetrix, Santa Clara, CA). Results: We identified SNP rs2268363, located in the follicle-stimulating hormone receptor (FSHR) gene, as significantly associated with ED after correcting for multiple comparisons (unadjusted p = 5.46 × 10^-8, Bonferroni p = 0.028). We identified four additional SNPs that tended toward a significant association with an unadjusted p value -6 . Inference of population substructure showed that cases had a higher proportion of African ancestry than control subjects (77% vs. 60%, p = 0.005). A multivariate logistic regression model that incorporated estimated ancestry and four of the top-ranked SNPs was a more accurate classifier of ED than a model that included only clinical variables. Conclusions: To our knowledge, this is the first genome-wide association study to identify SNPs associated with adverse effects resulting from radiotherapy. It is important to note that the SNP that proved to be significantly associated with ED is located within a gene whose encoded product plays a role in male gonad development and function. Another key finding of this project is that the four SNPs most strongly associated with ED were specific to persons of African ancestry and would therefore not have been identified had a cohort of European ancestry been screened. This study demonstrates

  4. The admixed population structure in Danish Jersey dairy cattle challenges accurate genomic predictions

    DEFF Research Database (Denmark)

    Thomasen, Jørn Rind; Sørensen, Anders Christian; Su, Guosheng

    2013-01-01

    The main purpose of this study is to evaluate whether the population structure in Danish Jersey known from the history of the breed also is reflected in the markers. This is done by comparing the linkage disequilibrium and persistence of phase for subgroups of Jersey animals with high proportions...... of Danish or US origin. Furthermore, it is investigated whether a model explicitly incorporating breed origin of animals, inferred either through the known pedigree or from SNP marker data, leads to improved genomic predictions compared to a model ignoring breed origin. The study of the population structure...... origin were analyzed and compared to a basic genomic model that assumes a homogeneous breed structure. The main finding in this study is that the importation of germ plasma from the US Jersey population is readily reflected in the genomes of modern Danish Jersey animals. Firstly, linkage disequilibrium...

  5. Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

    Energy Technology Data Exchange (ETDEWEB)

    Condon, Bradford J.; Leng, Yueqiang; Wu, Dongliang; Bushley, Kathryn E.; Ohm, Robin A.; Otillar, Robert; Martin, Joel; Schackwitz, Wendy; Grimwood, Jane; MohdZainudin, NurAinlzzati; Xue, Chunsheng; Wang, Rui; Manning, Viola A.; Dhillon, Braham; Tu, Zheng Jin; Steffenson, Brian J.; Salamov, Asaf; Sun, Hui; Lowry, Steve; LaButti, Kurt; Han, James; Copeland, Alex; Lindquist, Erika; Barry, Kerrie; Schmutz, Jeremy; Baker, Scott E.; Ciuffetti, Lynda M.; Grigoriev, Igor V.; Zhong, Shaobin; Turgeon, B. Gillian

    2013-01-24

    The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveals a strong correlation with a role in virulence.

  6. Comparative genome structure, secondary metabolite, and effector coding capacity across Cochliobolus pathogens.

    Directory of Open Access Journals (Sweden)

    Bradford J Condon

    Full Text Available The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP-encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveals a strong correlation with a role in virulence.

  7. Effects of single nucleotide polymorphisms on human N-acetyltransferase 2 structure and dynamics by molecular dynamics simulation.

    Directory of Open Access Journals (Sweden)

    M Rajasekaran

    Full Text Available BACKGROUND: Arylamine N-acetyltransferase 2 (NAT2) is an important catalytic enzyme that metabolizes the carcinogenic arylamines, hydrazine drugs and chemicals. This enzyme is highly polymorphic in different human populations. Several polymorphisms of NAT2, including the single amino acid substitutions R64Q, I114T, D122N, L137F, Q145P, R197Q, and G286E, are classified as slow acetylators, whereas the wild-type NAT2 is classified as a fast acetylator. The slow acetylators are often associated with drug toxicity and efficacy as well as cancer susceptibility. The biological functions of these 7 mutations have previously been characterized, but the structural basis behind the reduced catalytic activity and reduced protein level is not clear. METHODOLOGY/PRINCIPAL FINDINGS: We performed multiple molecular dynamics simulations of these mutants as well as NAT2 to investigate the structural and dynamical effects throughout the protein structure, specifically the catalytic triad, cofactor binding site, and the substrate binding pocket. None of these mutations induced unfolding; instead, their effects were confined to the inter-domain, domain 3 and 17-residue insert region, where the flexibility was significantly reduced relative to the wild-type. Structural effects of these mutations propagate through space and cause a change in catalytic triad conformation, cofactor binding site, substrate binding pocket size/shape and electrostatic potential. CONCLUSIONS/SIGNIFICANCE: Our results showed that the dynamical properties of all the mutant structures, especially in inter-domain, domain 3 and 17-residue insert region were affected in the same manner. Similarly, the electrostatic potential of all the mutants were altered and also the functionally important regions such as catalytic triad, cofactor binding site, and substrate binding pocket adopted different orientation and/or conformation relative to the wild-type that may affect the functions of the mutants

  8. GeneViTo: Visualizing gene-product functional and structural features in genomic datasets

    Directory of Open Access Journals (Sweden)

    Promponas Vasilis J

    2003-10-01

    Full Text Available Abstract Background The availability of increasing amounts of sequence data from completely sequenced genomes boosts the development of new computational methods for automated genome annotation and comparative genomics. Therefore, there is a need for tools that facilitate the visualization of raw data and results produced by bioinformatics analysis, providing new means for interactive genome exploration. Visual inspection can be used as a basis to assess the quality of various analysis algorithms and to aid in-depth genomic studies. Results GeneViTo is a JAVA-based computer application that serves as a workbench for genome-wide analysis through visual interaction. The application deals with various experimental information concerning both DNA and protein sequences (derived from public sequence databases or proprietary data sources) and meta-data obtained by various prediction algorithms, classification schemes or user-defined features. Interaction with a Graphical User Interface (GUI) allows easy extraction of genomic and proteomic data referring to the sequence itself, sequence features, or general structural and functional features. Emphasis is laid on the potential comparison between annotation and prediction data in order to offer a supplement to the provided information, especially in cases of "poor" annotation, or an evaluation of available predictions. Moreover, desired information can be output in high quality JPEG image files for further elaboration and scientific use. A compilation of properly formatted GeneViTo input data for demonstration is available to interested readers for two completely sequenced prokaryotes, Chlamydia trachomatis and Methanococcus jannaschii. Conclusions GeneViTo offers an inspectional view of genomic functional elements, concerning data stemming both from database annotation and analysis tools for an overall analysis of existing genomes. The application is compatible with Linux or Windows ME-2000-XP operating

  9. Interactions of early adversity with stress-related gene polymorphisms impact regional brain structure in females.

    Science.gov (United States)

    Gupta, Arpana; Labus, Jennifer; Kilpatrick, Lisa A; Bonyadi, Mariam; Ashe-McNalley, Cody; Heendeniya, Nuwanthi; Bradesi, Sylvie; Chang, Lin; Mayer, Emeran A

    2016-04-01

    Early adverse life events (EALs) have been associated with regional thinning of the subgenual cingulate cortex (sgACC), a brain region implicated in the development of disorders of mood and affect, and often comorbid functional pain disorders, such as irritable bowel syndrome (IBS). Regional neuroinflammation related to chronic stress system activation has been suggested as a possible mechanism underlying these neuroplastic changes. However, the interaction of genetic and environmental factors in these changes is poorly understood. The current study aimed to evaluate the interactions of EALs and candidate gene polymorphisms in influencing thickness of the sgACC. 210 female subjects (137 healthy controls; 73 IBS) were genotyped for stress and inflammation-related gene polymorphisms. Genetic variation with EALs, and diagnosis on sgACC thickness was examined, while controlling for race, age, and total brain volume. Compared to HCs, IBS had significantly reduced sgACC thickness (p = 0.03). Regardless of disease group (IBS vs. HC), thinning of the left sgACC was associated with a significant gene-gene environment interaction between the IL-1β genotype, the NR3C1 haplotype, and a history of EALs (p = 0.05). Reduced sgACC thickness in women with the minor IL-1β allele, was associated with EAL total scores regardless of NR3C1 haplotype status (p = 0.02). In subjects homozygous for the major IL-1β allele, reduced sgACC with increasing levels of EALs was seen only with the less common NR3C1 haplotype (p = 0.02). These findings support an interaction between polymorphisms related to stress and inflammation and early adverse life events in modulating a key region of the emotion arousal circuit.

  10. Analysis of the population structure of Uruguayan Creole cattle as inferred from milk major gene polymorphisms

    Directory of Open Access Journals (Sweden)

    Gonzalo Rincón

    2006-01-01

    Full Text Available The ancestors of Uruguayan Creole cattle were introduced by the Spanish conquerors in the XVII century, following which the population grew extensively and became semi-feral before the introduction of selected breeds. Today the Uruguayan Creole cattle genetic reserve consists of 575 animals. We used the tetra primer amplification refractory mutation system polymerase chain reaction (ARMS-PCR) to analyze the kappa-casein, beta-casein, alphaS1-casein and alpha-lactoalbumin gene polymorphisms and restriction fragment length polymorphism PCR (RFLP-PCR) for the beta-lactoglobulin and the acylCoA:diacyl glycerol acyltransferase 1 (DGAT1) genes. The kappa-casein and beta-lactoglobulin genes presented very similar A and B allele frequencies, while the alphaS1-casein and alpha-lactoalbumin gene B alleles showed much higher frequencies than the corresponding A alleles. The beta-casein B allele was not found in the population sampled. There was a very high frequency of the DGAT1 gene A allele which is associated with low milk fat content and high milk yield. All loci were in Hardy-Weinberg equilibrium and the level of heterozygosity agreed with the high genetic diversity observed in a previous analysis of this population. Preservation of the allelic richness observed in the Uruguayan Creole cattle should be considered for future dairy management and livestock genetic improvement. The results also emphasize the value of the tetra primers ARMS-PCR technique as a rapid, easy and economical way of genotyping cattle breeds for milk gene single nucleotide polymorphisms.
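
    A test for Hardy-Weinberg equilibrium at a biallelic locus such as those described above can be sketched in Python as a chi-square goodness-of-fit comparison of observed and expected genotype counts. The abstract does not say which test the authors used, so the chi-square approach is an assumption here, and the function name and genotype counts below are hypothetical and do not come from the study.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (frequency of allele A, chi-square statistic) for a biallelic
    locus, comparing observed genotype counts with Hardy-Weinberg
    expectations. With two alleles the test has 1 degree of freedom."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2


CRITICAL_1DF_005 = 3.841  # chi-square critical value, 1 df, alpha = 0.05

# Hypothetical counts (AA, AB, BB); not data from the study.
for counts in [(25, 50, 25), (40, 20, 40)]:
    p, chi2 = hwe_chi_square(*counts)
    verdict = "consistent with HWE" if chi2 < CRITICAL_1DF_005 else "deviates from HWE"
    print(counts, round(p, 3), round(chi2, 3), verdict)
```

    For small samples or rare alleles an exact test is usually preferred over the chi-square approximation.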

  11. SL1 revisited: functional analysis of the structure and conformation of HIV-1 genome RNA.

    Science.gov (United States)

    Sakuragi, Sayuri; Yokoyama, Masaru; Shioda, Tatsuo; Sato, Hironori; Sakuragi, Jun-Ichi

    2016-11-11

    The dimer initiation site/dimer linkage sequence (DIS/DLS) region of HIV is located on the 5' end of the viral genome and suggested to form complex secondary/tertiary structures. Within this structure, stem-loop 1 (SL1) is believed to be most important and an essential key to dimerization, since the sequence and predicted secondary structure of SL1 are highly stable and conserved among various virus subtypes. In particular, a six-base palindromic sequence is always present at the hairpin loop of SL1 and the formation of kissing-loop structure at this position between the two strands of genomic RNA is suggested to trigger dimerization. Although the higher-order structure model of SL1 is well accepted and perhaps even undoubted lately, there could still be room for consideration in depicting the functional SL1 structure in vivo (in virion or cell). In this study, we performed several analyses to identify the nucleotides and/or basepairing within SL1 which are necessary for HIV-1 genome dimerization, encapsidation, recombination and infectivity. We unexpectedly found that some nucleotides that are believed to contribute to the formation of the stem do not impact dimerization or infectivity. On the other hand, we found that one G-C basepair involved in stem formation may serve as an alternative dimer interactive site. We also report on our further investigation of the roles of the palindromic sequences on viral replication. Collectively, we aim to assemble a more-comprehensive functional map of SL1 on the HIV-1 viral life cycle. We discovered several possibilities for a novel structure of SL1 in HIV-1 DLS. The newly proposed structure model suggested that the hairpin loop of SL1 appeared larger, and that the genome dimerization process might involve a more complicated mechanism than previously understood. Further investigations would be still required to fully understand the genome packaging and dimerization of HIV.

  12. Single Nucleotide Polymorphism

    DEFF Research Database (Denmark)

    Børsting, Claus; Pereira, Vania; Andersen, Jeppe Dyrberg

    2014-01-01

    Single nucleotide polymorphisms (SNPs) are the most frequent DNA sequence variations in the genome. They have been studied extensively in the last decade with various purposes in mind. In this chapter, we will discuss the advantages and disadvantages of using SNPs for human identification...

  13. Structural and computational investigations of the conformation of antigenic peptide fragments of human polymorphic epithelial mucin.

    OpenAIRE

    Scanlon, M J; Morley, S D; Jackson, D E; Price, M R; Tendler, S J

    1992-01-01

    Human polymorphic epithelial mucins (PEM) are complex glycoproteins that are associated with breast and ovarian carcinomas. The PEM core protein consists of variable numbers of a tandem repeat sequence which contains a short antigenic hydrophilic region (Pro1-Asp-Thr-Arg-Pro-Ala-Pro7). High-field n.m.r. studies undertaken on antigenic 20- and 11-amino acid fragments of the PEM core protein in dimethyl sulphoxide have identified a type-I beta-turn to be present in the region Pro1-Asp-Thr-Arg4....

  14. Prioritisation of structural variant calls in cancer genomes

    Directory of Open Access Journals (Sweden)

    Miika J. Ahdesmäki

    2017-04-01

    Full Text Available Sensitivity of short read DNA-sequencing for gene fusion detection is improving, but is hampered by the significant amount of noise composed of uninteresting or false positive hits in the data. In this paper we describe a tiered prioritisation approach to extract high impact gene fusion events from existing structural variant calls. Using cell line and patient DNA sequence data we improve the annotation and interpretation of structural variant calls to best highlight likely cancer driving fusions. We also considerably improve on the automated visualisation of the high impact structural variants to highlight the effects of the variants on the resulting transcripts. The resulting framework greatly improves on readily detecting clinically actionable structural variants.

  15. Protein structure similarity clustering (PSSC) and natural product structure as inspiration sources for drug development and chemical genomics

    NARCIS (Netherlands)

    Dekker, Frank J; Koch, Marcus A; Waldmann, Herbert; Dekker, Frans

    Finding small molecules that modulate protein function is of primary importance in drug development and in the emerging field of chemical genomics. To facilitate the identification of such molecules, we developed a novel strategy making use of structural conservatism found in protein domain

  16. Analysis of genetic diversity in Brown Swiss, Jersey and Holstein populations using genome-wide single nucleotide polymorphism markers

    Directory of Open Access Journals (Sweden)

    Melka Melkaye G

    2012-03-01

    Full Text Available Abstract Background Studies of genetic diversity are essential in understanding the extent of differentiation between breeds, and in designing successful diversity conservation strategies. The objective of this study was to evaluate the level of genetic diversity within and between North American Brown Swiss (BS, n = 900), Jersey (JE, n = 2,922) and Holstein (HO, n = 3,535) cattle, using genotyped bulls. GENEPOP and FSTAT software were used to evaluate the level of genetic diversity within each breed and between each pair of the three breeds based on genome-wide SNP markers (n = 50,972). Results Hardy-Weinberg equilibrium (HWE) exact test within breeds showed a significant deviation from equilibrium within each population (P st indicated that the combination of BS and HO in an ideally amalgamated population had higher genetic diversity than the other pairs of breeds. Conclusion Results suggest that the three bull populations have substantially different gene pools. BS and HO show the largest gene differentiation and jointly the highest total expected gene diversity compared to when JE is considered. If the loss of genetic diversity within breeds worsens in the future, the use of crossbreeding might be an option to recover genetic diversity, especially for the breeds with small population size.

  17. Unique opportunities for NMR methods in structural genomics.

    Science.gov (United States)

    Montelione, Gaetano T; Arrowsmith, Cheryl; Girvin, Mark E; Kennedy, Michael A; Markley, John L; Powers, Robert; Prestegard, James H; Szyperski, Thomas

    2009-04-01

    This Perspective, arising from a workshop held in July 2008 in Buffalo NY, provides an overview of the role NMR has played in the United States Protein Structure Initiative (PSI), and a vision of how NMR will contribute to the forthcoming PSI-Biology program. NMR has contributed in key ways to structure production by the PSI, and new methods have been developed which are impacting the broader protein NMR community.

  18. Gene order data from a model amphibian (Ambystoma): new perspectives on vertebrate genome structure and evolution

    Directory of Open Access Journals (Sweden)

    Voss S Randal

    2006-08-01

    Full Text Available Abstract Background Because amphibians arise from a branch of the vertebrate evolutionary tree that is juxtaposed between fishes and amniotes, they provide important comparative perspective for reconstructing character changes that have occurred during vertebrate evolution. Here, we report the first comparative study of vertebrate genome structure that includes a representative amphibian. We used 491 transcribed sequences from a salamander (Ambystoma) genetic map and whole genome assemblies for human, mouse, rat, dog, chicken, zebrafish, and the freshwater pufferfish Tetraodon nigroviridis to compare gene orders and rearrangement rates. Results Ambystoma has experienced a rate of genome rearrangement that is substantially lower than mammalian species but similar to that of chicken and fish. Overall, we found greater conservation of genome structure between Ambystoma and tetrapod vertebrates; nevertheless, 57% of Ambystoma-fish orthologs are found in conserved syntenies of four or more genes. Comparisons between Ambystoma and amniotes reveal extensive conservation of segmental homology for 57% of the presumptive Ambystoma-amniote orthologs. Conclusion Our analyses suggest relatively constant interchromosomal rearrangement rates from the euteleost ancestor to the origin of mammals and illustrate the utility of amphibian mapping data in establishing ancestral amniote and tetrapod gene orders. Comparisons between Ambystoma and amniotes reveal some of the key events that have structured the human genome since diversification of the ancestral amniote lineage.

  19. The genomic structure of the human UFO receptor.

    Science.gov (United States)

    Schulz, A S; Schleithoff, L; Faust, M; Bartram, C R; Janssen, J W

    1993-02-01

    Using a DNA transfection-tumorigenicity assay we have recently identified the UFO oncogene. It encodes a tyrosine kinase receptor characterized by the juxtaposition of two immunoglobulin-like and two fibronectin type III repeats in its extracellular domain. Here we describe the genomic organization of the human UFO locus. The UFO receptor is encoded by 20 exons that are distributed over a region of 44 kb. Different isoforms of UFO mRNA are generated by alternative splicing of exon 10 and differential usage of two imperfect polyadenylation sites, resulting in the presence or absence of 1.5-kb 3' untranslated sequences. Primer extension and S1 nuclease analyses revealed multiple transcriptional initiation sites including a major site 169 bp upstream of the translation start site. The promoter region is GC rich, lacks TATA and CAAT boxes, but contains potential recognition sites for a variety of trans-acting factors, including Sp1, AP-2 and the cyclic AMP response element-binding protein. Proto-UFO and its oncogenic counterpart exhibit identical cDNA and promoter region sequences. Possible modes of UFO activation are discussed.

  20. Genome structure and primitive sex chromosome revealed in Populus

    Energy Technology Data Exchange (ETDEWEB)

    Tuskan, Gerald A [ORNL; Yin, Tongming [ORNL; Gunter, Lee E [ORNL; Blaudez, D [UMR, France

    2008-01-01

    We constructed a comprehensive genetic map for Populus and ordered 332 Mb of sequence scaffolds along the 19 haploid chromosomes in order to compare chromosomal regions among diverse members of the genus. These efforts led us to conclude that chromosome XIX in Populus is evolving into a sex chromosome. Consistent segregation distortion in favor of the sub-genera Tacamahaca alleles provided evidence of divergent selection among species, particularly at the proximal end of chromosome XIX. A large microsatellite marker (SSR) cluster was detected in the distorted region even though the genome-wide distribution of SSR sites was uniform across the physical map. The differences between the genetic map and physical sequence data suggested recombination suppression was occurring in the distorted region. A gender-determination locus and an overabundance of NBS-LRR genes were also co-located to the distorted region and were put forth as the cause for divergent selection and recombination suppression. This hypothesis was verified by using fine-scale mapping of an integrated scaffold in the vicinity of the gender-determination locus. As such it appears that chromosome XIX in Populus is in the process of evolving from an autosome into a sex chromosome and that NBS-LRR genes may play an important role in the chromosomal diversification process in Populus.

  1. Association of kinase insert domain-containing receptor (KDR) gene polymorphism/haplotypes with recurrent spontaneous abortion and genetic structure

    Directory of Open Access Journals (Sweden)

    Shiva Shahsavari

    2015-12-01

    Full Text Available Background: Recurrent spontaneous abortion is one of the diseases that can lead to physical, psychological, and economic problems for both individuals and society. Recently, a small number of genetic polymorphisms in the kinase insert domain-containing receptor (KDR) gene that can endanger the life of the fetus in pregnant women have been examined. Objective: The risk of KDR gene polymorphisms was investigated in Iranian women with idiopathic recurrent spontaneous abortion (RSA). Materials and Methods: A case-control study was performed. One hundred idiopathic recurrent spontaneous abortion patients with at least two consecutive pregnancy losses before 20 weeks of gestational age with normal karyotypes were included in the study. Also, 100 healthy women with at least one natural pregnancy were studied as the control group. Two functional SNPs located in the KDR gene, rs1870377 (Q472H) and rs2305948 (V297I), as well as one tag SNP in the intron region (rs6838752), were genotyped using the PCR-based restriction fragment length polymorphism (PCR-RFLP) technique. Haplotype frequency was determined for these three SNPs' genotypes. Analysis of genetic STRUCTURE and K means clustering were performed to study genetic variation. Results: Functional SNP (rs1870377) was highly linked to tag SNP (rs6838752) (D´ value = 0.214; χ2 = 16.44, p < 0.001). K means clustering showed k = 8 as the best fit for the optimal number of genetic subgroups in our studied materials. This result was in agreement with Neighbor Joining cluster analysis. Conclusion: In our study, the allele and genotype frequencies were not associated with RSA between patient and control individuals. Inconsistent results in different populations with different allele frequencies among RSA patients and controls may be due to ethnic variation and the sample size used.

  2. Evaluation of the frequency of polymorphisms in XRCC1 (Arg399Gln) and XPD (Lys751Gln) genes related to the genome stability maintenance in individuals of the resident population from Monte Alegre, PA/Brazil municipality

    International Nuclear Information System (INIS)

    Duarte, Isabelle Magliano

    2010-01-01

    Human exposure to ionizing radiation coming from natural sources is an inherent feature of human life on Earth. Ionizing radiation is a known genotoxic agent, which can affect biological molecules, causing DNA damage and genomic instability. The cellular system of DNA repair plays an important role in maintaining genomic stability by repairing DNA damage caused by genotoxic agents. However, genes related to DNA repair may have their role compromised when presenting a certain polymorphism. This study intended to analyze the frequency of single nucleotide polymorphisms (SNPs) in the DNA repair genes XRCC1 (Arg399Gln) and XPD (Lys751Gln) in a population of the city of Monte Alegre that resides in an area of high exposure to natural radioactivity. Samples of saliva were collected from individuals of the population of Monte Alegre, of which 40 samples were from males and 46 from females. Through the use of RFLP (restriction fragment length polymorphism) the frequency of homozygous and/or heterozygous genotypes was determined for the polymorphic genes. The XRCC1 gene had 65.4% of the presence of the allele 399Gln and the XPD gene had 32.9% of the 751Gln allele. These values are similar to those found in previous studies for the XPD gene, whereas XRCC1 showed a frequency much higher than described in the literature. The influence of these polymorphisms, which are involved in DNA repair and consequent genotoxicity induced by radiation, depends on dose and exposure factors such as smoking, statistically a factor in public health surveillance in the region. This study gathered information and molecular epidemiology for risk assessment of cancer in the population of Monte Alegre. (author)

  3. 3D-GNOME: an integrated web service for structural modeling of the 3D genome.

    Science.gov (United States)

    Szalaj, Przemyslaw; Michalski, Paul J; Wróblewski, Przemysław; Tang, Zhonghui; Kadlof, Michal; Mazzocco, Giovanni; Ruan, Yijun; Plewczynski, Dariusz

    2016-07-08

    Recent advances in high-throughput chromosome conformation capture (3C) technology, such as Hi-C and ChIA-PET, have demonstrated the importance of 3D genome organization in development, cell differentiation and transcriptional regulation. There is now a widespread need for computational tools to generate and analyze 3D structural models from 3C data. Here we introduce our 3D GeNOme Modeling Engine (3D-GNOME), a web service which generates 3D structures from 3C data and provides tools to visually inspect and annotate the resulting structures, in addition to a variety of statistical plots and heatmaps which characterize the selected genomic region. Users submit a bedpe (paired-end BED format) file containing the locations and strengths of long range contact points, and 3D-GNOME simulates the structure and provides a convenient user interface for further analysis. Alternatively, a user may generate structures using published ChIA-PET data for the GM12878 cell line by simply specifying a genomic region of interest. 3D-GNOME is freely available at http://3dgnome.cent.uw.edu.pl/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
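
    The abstract above says users submit a bedpe file with the locations and strengths of long-range contacts. A minimal Python sketch of reading such input follows. It assumes whitespace-separated lines with six coordinate columns followed by a numeric strength in the seventh column; the exact column layout 3D-GNOME expects is not stated in the abstract, and the `Contact` type, `parse_bedpe` function and sample values are hypothetical.

```python
from typing import List, NamedTuple


class Contact(NamedTuple):
    """One long-range contact between two genomic anchors (hypothetical type)."""
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int
    strength: float


def parse_bedpe(text: str) -> List[Contact]:
    """Parse BEDPE-like text; assumes the 7th column holds the contact strength."""
    contacts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        f = line.split()
        contacts.append(
            Contact(f[0], int(f[1]), int(f[2]), f[3], int(f[4]), int(f[5]), float(f[6]))
        )
    return contacts


# Made-up example records, not real ChIA-PET data.
example = (
    "chr1\t1000\t2000\tchr1\t50000\t51000\t12\n"
    "chr1\t3000\t4000\tchr1\t90000\t91000\t5\n"
)
contacts = parse_bedpe(example)
strongest = max(contacts, key=lambda c: c.strength)
```

    Keeping each contact as a typed record makes it straightforward to filter by strength or by region before any modeling step.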

  4. Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure

    DEFF Research Database (Denmark)

    Torarinsson, Elfar; Sawera, Milena; Havgaard, Jakob Hull

    2006-01-01

    Human and mouse genome sequences contain roughly 100,000 regions that are unalignable in primary sequence and neighbor corresponding alignable regions between both organisms. These pairs are generally assumed to be nonconserved, although the level of structural conservation between these has never...... been investigated. Owing to the limitations in computational methods, comparative genomics has been lacking the ability to compare such nonconserved sequence regions for conserved structural RNA elements. We have investigated the presence of structural RNA elements by conducting a local structural...... alignment, using FOLDALIGN, on a subset of these 100,000 corresponding regions and estimate that 1800 contain common RNA structures. Comparing our results with the recent mapping of transcribed fragments (transfrags) in human, we find that high-scoring candidates are twice as likely to be found in regions...

  5. Synthesis and structural, spectroscopic and magnetic studies of two new polymorphs of Mn(SeO3).H2O

    International Nuclear Information System (INIS)

    Larranaga, Aitor; Mesa, Jose L.; Pizarro, Jose L.; Pena, A.; Olazcuaga, Roger; Arriortua, Maria I.; Rojo, Teofilo

    2005-01-01

    Two new manganese(II) selenite polymorphs with formula Mn(SeO3).H2O have been synthesized by slow evaporation from an aqueous solution. The crystal structures of both compounds (1) and (2) have been solved from X-ray diffraction data. The structure of (1) was determined from single-crystal X-ray diffraction techniques. The compound crystallizes in the Ama2 space group, with a=5.817(1), b=13.449(3), c=4.8765(9) Å and Z=4. The structure of (2) has been solved from X-ray powder diffraction data. This phase crystallizes in the P21/n space group with unit-cell parameters of a=4.921(3), b=13.121(7), c=5.816(1) Å, β=90.03(2)° and Z=4. Both polymorphs exhibit a layered structure formed by isolated sheets of MnO6 octahedra and (SeO3)2- trigonal pyramids in the (010) plane. These layers, which contain one crystallographically independent manganese atom and one selenium atom, are formed by octahedra linked to one another through the selenite oxoanions. The difference between the two compounds lies in the stacking of the layers along the b-axis. The IR spectra show the characteristic bands of the selenite anion. Studies of luminescence performed at 6 K and diffuse reflectance spectroscopy have been carried out for both phases. The Dq and Racah (B and C) parameters, from luminescence and diffuse reflectance spectroscopy, are Dq=705, B=750, C=3325 cm-1 for (1) and Dq=720, B=745, C=3350 cm-1 for (2). The ESR spectra of both compounds are isotropic with g-values of 1.99(1). Magnetic measurements indicate the presence of antiferromagnetic couplings in both phases. The J-exchange parameters have been estimated by fitting the experimental magnetic data to a model for a square-planar lattice. The values obtained are J/k=-0.83, -0.91 K and J'/k=-0.97, -1.20 K, for polymorphs (1) and (2), respectively
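
    As a worked illustration of the cell parameters quoted above, the following Python sketch computes the unit-cell volume of each polymorph with the general triclinic volume formula, which reduces to a·b·c for the orthorhombic cell and to a·b·c·sin β for the monoclinic one. The function name `cell_volume` is hypothetical, and the reported standard uncertainties are ignored; the abstract itself does not report cell volumes.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit-cell volume in cubic angstroms from lengths (angstroms) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


# Polymorph (1): orthorhombic, Ama2, all angles 90 degrees.
v1 = cell_volume(5.817, 13.449, 4.8765)
# Polymorph (2): monoclinic, P21/n, beta = 90.03 degrees.
v2 = cell_volume(4.921, 13.121, 5.816, beta=90.03)

# Both cells have Z = 4, so the volume per formula unit is V / 4.
v1_per_formula_unit = v1 / 4
v2_per_formula_unit = v2 / 4
```

    Comparing the two volumes per formula unit gives a quick sense of how closely the packing densities of the two polymorphs match.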

  6. Bioinformatical approaches to RNA structure prediction & Sequencing of an ancient human genome

    DEFF Research Database (Denmark)

    Lindgreen, Stinus

    tools that exist. The second part has been focused on the mapping and genotyping of ancient genomic DNA. The development of next generation sequencing technologies combined with the use of ancient DNA material present the researchers with some special challenges in the analyses. This work resulted...... in the publication of the first genome of an ancient human individual, where close to the theoretical maximum of the genome sequence was recovered with high confidence. Part of the project was the development of the program SNPest for genotyping and SNP calling that models various sources of error and predicts...... in families of related RNA sequences. Also, the program MASTR was developed to perform simultaneous alignment of multiple RNA sequences and prediction of a common secondary structure. The webserver WAR was developed to make it easy for non-computer savvy researchers to use the many RNA structure prediction...

  7. Detection of Genomic Structural Variants from Next-Generation Sequencing Data

    Directory of Open Access Journals (Sweden)

    Lorenzo Tattini

    2015-06-01

    Full Text Available Structural variants are genomic rearrangements larger than 50 bp accounting for around 1% of the variation among human genomes. They impact on phenotypic diversity and play a role in various diseases including neurological/neurocognitive disorders and cancer development and progression. Dissecting structural variants from next-generation sequencing data presents several challenges and a number of approaches have been proposed in the literature. In this mini review we describe and summarise the latest tools, and their underlying algorithms, designed for the analysis of whole-genome sequencing, whole-exome sequencing, custom captures and amplicon sequencing data, pointing out the major advantages/drawbacks. We also report a summary of the most recent applications of third-generation sequencing platforms. This assessment provides a guided indication, with particular emphasis on human genetics and copy number variants, for researchers involved in the investigation of these genomic events.

  8. Evolution of the Exon-Intron Structure in Ciliate Genomes.

    Directory of Open Access Journals (Sweden)

    Vladyslav S Bondarenko

    Full Text Available A typical eukaryotic gene is comprised of alternating stretches of regions, exons and introns, retained in and spliced out of a mature mRNA, respectively. Although the length of introns may vary substantially among organisms, a large fraction of genes contains short introns in many species. Notably, some Ciliates (Paramecium and Nyctotherus) possess only ultra-short introns, around 25 bp long. In Paramecium, ultra-short introns with length divisible by three (3n) are under strong evolutionary pressure and have a high frequency of in-frame stop codons, which, in the case of intron retention, cause premature termination of mRNA translation and consequent degradation of the mis-spliced mRNA by the nonsense-mediated decay mechanism. Here, we analyzed introns in five genera of Ciliates: Paramecium, Tetrahymena, Ichthyophthirius, Oxytricha, and Stylonychia. Introns can be classified into two length classes in Tetrahymena and Ichthyophthirius (with means 48 bp, 69 bp, and 55 bp, 64 bp, respectively), but, surprisingly, comprise three distinct length classes in Oxytricha and Stylonychia (with means 33-35 bp, 47-51 bp, and 78-80 bp). In most ranges of the intron lengths, 3n introns are underrepresented and have a high frequency of in-frame stop codons in all studied species. Introns of Paramecium, Tetrahymena, and Ichthyophthirius are preferentially located at the 5' and 3' ends of genes, whereas introns of Oxytricha and Stylonychia are strongly skewed towards the 5' end. Analysis of evolutionary conservation shows that, in each studied genome, a significant fraction of intron positions is conserved between the orthologs, but intron lengths are not correlated between the species. In summary, our study provides a detailed characterization of introns in several genera of Ciliates and highlights some of their distinctive properties, which, together, indicate that splicing spellchecking is a universal and evolutionarily conserved process in the biogenesis of short
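
    The two intron properties discussed in this abstract, a length divisible by three (3n) and the presence of an in-frame stop codon, can be sketched in Python as below. This is an illustration under stated assumptions, not the authors' pipeline: the reading frame is taken to start at the first intron base (in a real gene it depends on the phase of the upstream exon), the default stop-codon set {TGA} is an assumption chosen for a ciliate-style nuclear code and should be replaced for other genetic codes, and both example sequences are invented.

```python
def is_3n(intron: str) -> bool:
    """True if the intron length is divisible by three."""
    return len(intron) % 3 == 0


def has_inframe_stop(intron: str, stops=frozenset({"TGA"}), frame: int = 0) -> bool:
    """True if any codon read in the given frame is a stop codon.

    `stops` defaults to TGA only (an assumption); pass another set for
    organisms with a different genetic code.
    """
    seq = intron.upper()
    return any(seq[i:i + 3] in stops for i in range(frame, len(seq) - 2, 3))


# Invented example introns, not taken from any genome.
intron_a = "GTATGATTTAATTTTATTTTATAG"   # 24 nt, second codon is TGA
intron_b = "GTAAGTATTTTATTTTTAATTTCAG"  # 25 nt, no TGA in frame 0

a_flags = (is_3n(intron_a), has_inframe_stop(intron_a))
b_flags = (is_3n(intron_b), has_inframe_stop(intron_b))
```

    A retained 3n intron without an in-frame stop would leave the downstream reading frame intact, which is why the combination of the two checks is the informative one.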

  9. Development and validation of polymorphic microsatellite loci for the NA2 lineage of Phytophthora ramorum from whole genome sequence data

    Science.gov (United States)

    Phytophthora ramorum is the causal agent of sudden oak death and sudden larch death, and is also responsible for causing ramorum blight on woody ornamental plants. Many microsatellite markers are available to characterize the genetic diversity and population structure of P. ramorum. However, only tw...

  10. Genomic data illuminates demography, genetic structure and selection of a popular dog breed.

    Science.gov (United States)

    Wiener, Pamela; Sánchez-Molano, Enrique; Clements, Dylan N; Woolliams, John A; Haskell, Marie J; Blott, Sarah C

    2017-08-14

    Genomic methods have proved to be important tools in the analysis of genetic diversity across the range of species and can be used to reveal processes underlying both short- and long-term evolutionary change. This study applied genomic methods to investigate population structure and inbreeding in a common UK dog breed, the Labrador Retriever. We found substantial within-breed genetic differentiation, which was associated with the role of the dog (i.e. working, pet, show) and also with coat colour (i.e. black, yellow, brown). There was little evidence of geographical differentiation. Highly differentiated genomic regions contained genes and markers associated with skull shape, suggesting that at least some of the differentiation is related to human-imposed selection on this trait. We also found that the total length of homozygous segments (runs of homozygosity, ROHs) was highly correlated with inbreeding coefficient. This study demonstrates that high-density genomic data can be used to quantify genetic diversity and to decipher demographic and selection processes. Analysis of genetically differentiated regions in the UK Labrador Retriever population suggests the possibility of human-imposed selection on craniofacial characteristics. The high correlation between estimates of inbreeding from genomic and pedigree data for this breed demonstrates that genomic approaches can be used to quantify inbreeding levels in dogs, which will be particularly useful where pedigree information is missing.
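
    The abstract relates the total length of runs of homozygosity to the inbreeding coefficient. One common way to turn ROH lengths into a genomic inbreeding estimate is the fraction of the genome covered by ROHs, often written F_ROH. The Python sketch below shows that calculation under the assumption of non-overlapping segments; whether the study used exactly this estimator is not stated in the abstract, and the function name, segment coordinates and genome length are hypothetical.

```python
from typing import Iterable, Tuple


def f_roh(roh_segments: Iterable[Tuple[int, int]], genome_length_bp: int) -> float:
    """Fraction of the genome covered by ROHs.

    Segments are (start, end) positions in base pairs, assumed non-overlapping.
    """
    total_roh = sum(end - start for start, end in roh_segments)
    return total_roh / genome_length_bp


# Made-up example: two ROHs totalling 3 Mb on a 30 Mb stretch of genome.
segments = [(0, 1_000_000), (5_000_000, 7_000_000)]
estimate = f_roh(segments, 30_000_000)
```

    Estimates computed this way per animal can then be compared against pedigree-based inbreeding coefficients where pedigrees exist.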

  11. Characteristics of de novo structural changes in the human genome

    NARCIS (Netherlands)

    Kloosterman, Wigard P.; Francioli, Laurent C.; Hormozdiari, Fereydoun; Marschall, Tobias; Hehir-Kwa, Jayne Y.; Abdellaoui, Abdel; Lameijer, Eric-Wubbo; Moed, Matthijs H.; Koval, Vyacheslav; Renkens, Ivo; van Roosmalen, Markus J.; Arp, Pascal; Karssen, Lennart C.; Coe, Bradley P.; Handsaker, Robert E.; Suchiman, Eka D.; Cuppen, Edwin; Thung, Djie Tjwan; McVey, Mitch; Wendl, Michael C.; Uitterlinden, Andre; van Duijn, Cornelia M.; Swertz, Morris A.; Wijmenga, Cisca; van Ommen, GertJan B.; Slagboom, P. Eline; Boomsma, Dorret I.; Schoenhuth, Alexander; Eichler, Evan E.; de Bakker, Paul I. W.; Ye, Kai; Guryev, Victor

    Small insertions and deletions (indels) and large structural variations (SVs) are major contributors to human genetic diversity and disease. However, mutation rates and characteristics of de novo indels and SVs in the general population have remained largely unexplored. We report 332 validated de

  12. DNA sequence polymorphisms within the bovine guanine nucleotide-binding protein Gs subunit alpha (Gsα)-encoding (GNAS) genomic imprinting domain are associated with performance traits

    Directory of Open Access Journals (Sweden)

    Mullen Michael P

    2011-01-01

    Full Text Available Abstract Background Genes which are epigenetically regulated via genomic imprinting can be potential targets for artificial selection during animal breeding. Indeed, imprinted loci have been shown to underlie some important quantitative traits in domestic mammals, most notably muscle mass and fat deposition. In this candidate gene study, we have identified novel associations between six validated single nucleotide polymorphisms (SNPs) spanning a 97.6 kb region within the bovine guanine nucleotide-binding protein Gs subunit alpha gene (GNAS) domain on bovine chromosome 13 and genetic merit for a range of performance traits in 848 progeny-tested Holstein-Friesian sires. The mammalian GNAS domain consists of a number of reciprocally-imprinted, alternatively-spliced genes which can play a major role in growth, development and disease in mice and humans. Based on the current annotation of the bovine GNAS domain, four of the SNPs analysed (rs43101491, rs43101493, rs43101485 and rs43101486) were located upstream of the GNAS gene, while one SNP (rs41694646) was located in the second intron of the GNAS gene. The final SNP (rs41694656) was located in the first exon of transcripts encoding the putative bovine neuroendocrine-specific protein NESP55, resulting in an aspartic acid-to-asparagine amino acid substitution at amino acid position 192. Results SNP genotype-phenotype association analyses indicate that the single intronic GNAS SNP (rs41694646) is associated (P ≤ 0.05) with a range of performance traits including milk yield, milk protein yield, the content of fat and protein in milk, culled cow carcass weight and progeny carcass conformation, measures of animal body size, direct calving difficulty (i.e. difficulty in calving due to the size of the calf) and gestation length. Association (P ≤ 0.01) with direct calving difficulty (i.e. due to calf size) and maternal calving difficulty (i.e. due to the maternal pelvic width size) was also observed at the rs

  13. DNA sequence polymorphisms within the bovine guanine nucleotide-binding protein Gs subunit alpha (Gsα)-encoding (GNAS) genomic imprinting domain are associated with performance traits

    Science.gov (United States)

    2011-01-01

    Background Genes which are epigenetically regulated via genomic imprinting can be potential targets for artificial selection during animal breeding. Indeed, imprinted loci have been shown to underlie some important quantitative traits in domestic mammals, most notably muscle mass and fat deposition. In this candidate gene study, we have identified novel associations between six validated single nucleotide polymorphisms (SNPs) spanning a 97.6 kb region within the bovine guanine nucleotide-binding protein Gs subunit alpha gene (GNAS) domain on bovine chromosome 13 and genetic merit for a range of performance traits in 848 progeny-tested Holstein-Friesian sires. The mammalian GNAS domain consists of a number of reciprocally-imprinted, alternatively-spliced genes which can play a major role in growth, development and disease in mice and humans. Based on the current annotation of the bovine GNAS domain, four of the SNPs analysed (rs43101491, rs43101493, rs43101485 and rs43101486) were located upstream of the GNAS gene, while one SNP (rs41694646) was located in the second intron of the GNAS gene. The final SNP (rs41694656) was located in the first exon of transcripts encoding the putative bovine neuroendocrine-specific protein NESP55, resulting in an aspartic acid-to-asparagine amino acid substitution at amino acid position 192. Results SNP genotype-phenotype association analyses indicate that the single intronic GNAS SNP (rs41694646) is associated (P ≤ 0.05) with a range of performance traits including milk yield, milk protein yield, the content of fat and protein in milk, culled cow carcass weight and progeny carcass conformation, measures of animal body size, direct calving difficulty (i.e. difficulty in calving due to the size of the calf) and gestation length. Association (P ≤ 0.01) with direct calving difficulty (i.e. due to calf size) and maternal calving difficulty (i.e. due to the maternal pelvic width size) was also observed at the rs43101491 SNP. Following

  14. Visualizing the global secondary structure of a viral RNA genome with cryo-electron microscopy.

    Science.gov (United States)

    Garmann, Rees F; Gopal, Ajaykumar; Athavale, Shreyas S; Knobler, Charles M; Gelbart, William M; Harvey, Stephen C

    2015-05-01

    The lifecycle, and therefore the virulence, of single-stranded (ss)-RNA viruses is regulated not only by their particular protein gene products, but also by the secondary and tertiary structure of their genomes. The secondary structure of the entire genomic RNA of satellite tobacco mosaic virus (STMV) was recently determined by selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE). The SHAPE analysis suggested a single highly extended secondary structure with much less branching than occurs in the ensemble of structures predicted by purely thermodynamic algorithms. Here we examine the solution-equilibrated STMV genome by direct visualization with cryo-electron microscopy (cryo-EM), using an RNA of similar length transcribed from the yeast genome as a control. The cryo-EM data reveal an ensemble of branching patterns that are collectively consistent with the SHAPE-derived secondary structure model. Thus, our results both elucidate the statistical nature of the secondary structure of large ss-RNAs and give visual support for modern RNA structure determination methods. Additionally, this work introduces cryo-EM as a means to distinguish between competing secondary structure models if the models differ significantly in terms of the number and/or length of branches. Furthermore, with the latest advances in cryo-EM technology, we suggest the possibility of developing methods that incorporate restraints from cryo-EM into the next generation of algorithms for the determination of RNA secondary and tertiary structures. © 2015 Garmann et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  15. Integrated view of genome structure and sequence of a single DNA molecule in a nanofluidic device

    DEFF Research Database (Denmark)

    Marie, Rodolphe; Pedersen, Jonas Nyvold; L. V. Bauer, David

    2013-01-01

    We show how a bird’s-eye view of genomic structure can be obtained at ∼1-kb resolution from long (∼2 Mb) DNA molecules extracted from whole chromosomes in a nanofluidic laboratory-on-a-chip. We use an improved single-molecule denaturation mapping approach to detect repetitive elements and known...

  16. An Overview of the Genetic Structure within the Italian Population from Genome-Wide Data

    Science.gov (United States)

    Di Gaetano, Cornelia; Voglino, Floriana; Guarrera, Simonetta; Fiorito, Giovanni; Rosa, Fabio; Di Blasio, Anna Maria; Manzini, Paola; Dianzani, Irma; Betti, Marta; Cusi, Daniele; Frau, Francesca; Barlassina, Cristina; Mirabelli, Dario; Magnani, Corrado; Glorioso, Nicola; Bonassi, Stefano; Piazza, Alberto; Matullo, Giuseppe

    2012-01-01

    Despite the common belief that Europe is reasonably homogeneous at the genetic level, advances in high-throughput genotyping technology have resolved several gradients which define different geographical areas with good precision. When Northern and Southern European groups were considered separately, there were clear genetic distinctions. Intra-country genetic differences were also evident, especially in Finland and, to a lesser extent, within other European populations. Here, we present the first analysis using genome-wide data on 125,799 single nucleotide polymorphisms (SNPs) from 1,014 Italians with wide geographical coverage. Using principal component analysis and model-based individual ancestry analysis, we showed that the current population of Sardinia can be clearly differentiated genetically from mainland Italy and Sicily, and that a certain degree of genetic differentiation is detectable within the current Italian peninsula population. The pairwise FST statistic amounts to approximately 0.001 between Northern and Southern Italy, and to around 0.002 between Northern Italy and Utah residents with Northern and Western European ancestry (CEU). The Italian population also revealed a fine genetic substructure underscored by the genomic inflation (Sardinia vs. Northern Italy = 3.040 and Northern Italy vs. CEU = 1.427), warning against confounding effects of hidden relatedness and population substructure in association studies. PMID:22984441

  17. Genomic Variability of Mycobacterium tuberculosis Strains of the Euro-American Lineage Based on Large Sequence Deletions and 15-Locus MIRU-VNTR Polymorphism

    Science.gov (United States)

    Rindi, Laura; Medici, Chiara; Bimbi, Nicola; Buzzigoli, Andrea; Lari, Nicoletta; Garzelli, Carlo

    2014-01-01

    A sample of 260 Mycobacterium tuberculosis strains assigned to the Euro-American family was studied to identify phylogenetically informative genomic regions of difference (RD). Mutually exclusive deletions of regions RD115, RD122, RD174, RD182, RD183, RD193, RD219, RD726 and RD761 were found in 202 strains; the RDRio deletion was detected exclusively among the RD174-deleted strains. Although certain deletions were found more frequently in certain spoligotype families (i.e., deletion RD115 in T and LAM, RD174 in LAM, RD182 in Haarlem, RD219 in T and RD726 in the “Cameroon” family), the RD-defined sublineages did not specifically match with spoligotype-defined families, thus arguing against the use of spoligotyping for establishing exact phylogenetic relationships between strains. Notably, when tested for katG463/gyrA95 polymorphism, all the RD-defined sublineages belonged to Principal Genotypic Group (PGG) 2, except sublineage RD219 exclusively belonging to PGG3; the 58 Euro-American strains with no deletion were of either PGG2 or 3. A representative sample of 197 isolates was then analyzed by standard 15-locus MIRU-VNTR typing, a suitable approach to independently assess genetic relationships among the strains. Analysis of the MIRU-VNTR typing results by using a minimum spanning tree (MST) and a classical dendrogram showed groupings that were largely concordant with those obtained by RD-based analysis. Isolates of a given RD profile show, in addition to closely related MIRU-VNTR profiles, related spoligotype profiles that can serve as a basis for better spoligotype-based classification. PMID:25197794

  18. Genomic analysis of the hierarchical structure of regulatory networks

    Science.gov (United States)

    Yu, Haiyuan; Gerstein, Mark

    2006-01-01

    A fundamental question in biology is how the cell uses transcription factors (TFs) to coordinate the expression of thousands of genes in response to various stimuli. The relationships between TFs and their target genes can be modeled in terms of directed regulatory networks. These relationships, in turn, can be readily compared with commonplace “chain-of-command” structures in social networks, which have characteristic hierarchical layouts. Here, we develop algorithms for identifying generalized hierarchies (allowing for various loop structures) and use these approaches to illuminate extensive pyramid-shaped hierarchical structures existing in the regulatory networks of representative prokaryotes (Escherichia coli) and eukaryotes (Saccharomyces cerevisiae), with most TFs at the bottom levels and only a few master TFs on top. These masters are situated near the center of the protein–protein interaction network, a different type of network from the regulatory one, and they receive most of the input for the whole regulatory hierarchy through protein interactions. Moreover, they have maximal influence over other genes, in terms of affecting expression-level changes. Surprisingly, however, TFs at the bottom of the regulatory hierarchy are more essential to the viability of the cell. Finally, one might think master TFs achieve their wide influence through directly regulating many targets, but TFs with most direct targets are in the middle of the hierarchy. We find, in fact, that these midlevel TFs are “control bottlenecks” in the hierarchy, and this great degree of control for “middle managers” has parallels in efficient social structures in various corporate and governmental settings. PMID:17003135

  19. Mapping the structure and dynamics of genomics-related MeSH terms complex networks.

    Science.gov (United States)

    Siqueiros-García, Jesús M; Hernández-Lemus, Enrique; García-Herrera, Rodrigo; Robina-Galatas, Andrea

    2014-01-01

    It has been proposed that the history and evolution of scientific ideas may reflect certain aspects of the underlying socio-cognitive frameworks in which science itself is developing. Systematic analyses of the development of scientific knowledge may help us to construct models of the collective dynamics of science. Aiming at scientific rigor, these models should be built upon solid empirical evidence, analyzed with formal tools leading to ever-improving results that support the related conclusions. Along these lines we studied the dynamics and structure of the development of research in genomics as represented by the entire collection of genomics-related scientific papers contained in the PubMed database. The analyzed corpus consisted of more than 49,000 articles published in the years 1987 (first appearance of the term Genomics) to 2011, categorized by means of the Medical Subject Headings (MeSH) content-descriptors. Complex networks were built in which two MeSH terms were connected if they were descriptors of the same article(s). The analysis of such networks revealed a complex structure and dynamics that to a certain extent resembled small-world networks. The evolution of such networks in time reflected interesting phenomena in the historical development of genomic research, including what seems to be a phase-transition in a period marked by the completion of the first draft of the Human Genome Project. We also found that different disciplinary areas have different dynamic evolution patterns in their MeSH connectivity networks. In areas related to science, changes in topology were somewhat fast while retaining a certain core-structure; in the humanities, the evolution was rather slow and the resulting structure was highly redundant; and in technology-related issues, the evolution was very fast and the structure remained tree-like with almost no overlapping terms.
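
    The network construction described in this abstract (two MeSH terms are linked when they describe the same article) can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the function name `cooccurrence_edges` and the toy corpus are hypothetical, and weighting each edge by the number of shared articles is an assumption made here for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(articles):
    """Count, for every unordered pair of terms, how many articles list both."""
    edges = Counter()
    for terms in articles:
        # sorted(set(...)) drops repeated terms and gives each pair a canonical order
        for a, b in combinations(sorted(set(terms)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical toy corpus: each inner list holds the descriptors of one article.
toy_articles = [
    ["Genomics", "Humans", "Proteomics"],
    ["Genomics", "Humans"],
    ["Genomics", "Computational Biology"],
]

edges = cooccurrence_edges(toy_articles)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight}")
```

    Building one such edge table per publication year would give a sequence of networks whose topology can be compared over time, which is the kind of temporal analysis the abstract reports.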

  20. Mapping the structure and dynamics of genomics-related MeSH terms complex networks.

    Directory of Open Access Journals (Sweden)

    Jesús M Siqueiros-García

    Full Text Available It has been proposed that the history and evolution of scientific ideas may reflect certain aspects of the underlying socio-cognitive frameworks in which science itself is developing. Systematic analyses of the development of scientific knowledge may help us to construct models of the collective dynamics of science. Aiming at scientific rigor, these models should be built upon solid empirical evidence, analyzed with formal tools leading to ever-improving results that support the related conclusions. Along these lines we studied the dynamics and structure of the development of research in genomics as represented by the entire collection of genomics-related scientific papers contained in the PubMed database. The analyzed corpus consisted of more than 49,000 articles published in the years 1987 (first appearance of the term Genomics) to 2011, categorized by means of the Medical Subject Headings (MeSH) content-descriptors. Complex networks were built in which two MeSH terms were connected if they were descriptors of the same article(s). The analysis of such networks revealed a complex structure and dynamics that to a certain extent resembled small-world networks. The evolution of such networks in time reflected interesting phenomena in the historical development of genomic research, including what seems to be a phase-transition in a period marked by the completion of the first draft of the Human Genome Project. We also found that different disciplinary areas have different dynamic evolution patterns in their MeSH connectivity networks. In areas related to science, changes in topology were somewhat fast while retaining a certain core-structure; in the humanities, the evolution was rather slow and the resulting structure was highly redundant; and in technology-related issues, the evolution was very fast and the structure remained tree-like with almost no overlapping terms.

  1. Nucleosomes shape DNA polymorphism and divergence.

    Directory of Open Access Journals (Sweden)

    Sasha A Langley

    2014-07-01

    Full Text Available An estimated 80% of genomic DNA in eukaryotes is packaged as nucleosomes, which, together with the remaining interstitial linker regions, generate higher order chromatin structures [1]. Nucleosome sequences isolated from diverse organisms exhibit ∼10 bp periodic variations in AA, TT and GC dinucleotide frequencies. These sequence elements generate intrinsically curved DNA and help establish the histone-DNA interface. We investigated an important unanswered question concerning the interplay between chromatin organization and genome evolution: do the DNA sequence preferences inherent to the highly conserved histone core exert detectable natural selection on genomic divergence and polymorphism? To address this hypothesis, we isolated nucleosomal DNA sequences from Drosophila melanogaster embryos and examined the underlying genomic variation within and between species. We found that divergence along the D. melanogaster lineage is periodic across nucleosome regions with base changes following preferred nucleotides, providing new evidence for systematic evolutionary forces in the generation and maintenance of nucleosome-associated dinucleotide periodicities. Further, Single Nucleotide Polymorphism (SNP) frequency spectra show striking periodicities across nucleosomal regions, paralleling divergence patterns. Preferred alleles occur at higher frequencies in natural populations, consistent with a central role for natural selection. These patterns are stronger for nucleosomes in introns than in intergenic regions, suggesting selection is stronger in transcribed regions where nucleosomes undergo more displacement, remodeling and functional modification. In addition, we observe a large-scale (∼180 bp) periodic enrichment of AA/TT dinucleotides associated with nucleosome occupancy, while GC dinucleotide frequency peaks in linker regions. Divergence and polymorphism data also support a role for natural selection in the generation and maintenance of these

  2. Structural genomic variation in childhood epilepsies with complex phenotypes

    DEFF Research Database (Denmark)

    Helbig, Ingo; Swinkels, Marielle E M; Aten, Emmelien

    2014-01-01

    A genetic contribution to a broad range of epilepsies has been postulated, and particularly copy number variations (CNVs) have emerged as significant genetic risk factors. However, the role of CNVs in patients with epilepsies with complex phenotypes is not known. Therefore, we investigated the role...... of CNVs in patients with unclassified epilepsies and complex phenotypes. A total of 222 patients from three European countries, including patients with structural lesions on magnetic resonance imaging (MRI), dysmorphic features, and multiple congenital anomalies, were clinically evaluated and screened...

  3. Structural genomics reveals EVE as a new ASCH/PUA-related domain

    Science.gov (United States)

    Bertonati, Claudia; Punta, Marco; Fischer, Markus; Yachdav, Guy; Forouhar, Farhad; Zhou, Weihong; Kuzin, Alexander P.; Seetharaman, Jayaraman; Abashidze, Mariam; Ramelot, Theresa A.; Kennedy, Michael A.; Cort, John R.; Belachew, Adam; Hunt, John F.; Tong, Liang; Montelione, Gaetano T.; Rost, Burkhard

    2014-01-01

    Summary We report on several proteins recently solved by structural genomics consortia, in particular by the Northeast Structural Genomics consortium (NESG). The proteins considered in this study differ substantially in their sequences but they share a similar structural core, characterized by a pseudobarrel five-stranded beta sheet. This core corresponds to the PUA domain-like architecture in the SCOP database. By connecting sequence information with structural knowledge, we characterize a new subgroup of these proteins that we propose to be distinctly different from previously described PUA domain-like domains such as PUA proper or ASCH. We refer to these newly defined domains as EVE. Although EVE may have retained the ability of PUA domains to bind RNA, the available experimental and computational data suggest that both the details of its molecular function and its cellular function differ from those of other PUA domain-like domains. This study of EVE and its relatives illustrates how the combination of structure and genomics creates new insights by connecting a cornucopia of structures that map to the same evolutionary potential. Primary sequence information alone would not have been sufficient to reveal these evolutionary links. PMID:19191354

  4. Genomic structure in Europeans dating back at least 36,200 years

    DEFF Research Database (Denmark)

    Seguin-Orlando, Andaine; Korneliussen, Thorfinn Sand; Sikora, Martin

    2014-01-01

    The origin of contemporary Europeans remains contentious. We obtained a genome sequence from Kostenki 14 in European Russia dating from 38,700 to 36,200 years ago, one of the oldest fossils of anatomically modern humans from Europe. We find that Kostenki 14 shares a close ancestry with the 24...... European Neolithic farmers. We find that Kostenki 14 contains more Neandertal DNA that is contained in longer tracts than present Europeans. Our findings reveal the timing of divergence of western Eurasians and East Asians to be more than 36,200 years ago and that European genomic structure today dates...

  5. Gene finding with a hidden Markov model of genome structure and evolution

    DEFF Research Database (Denmark)

    Pedersen, Jakob Skou; Hein, Jotun

    2003-01-01

    the model are linear in alignment length and genome number. The model is applied to the problem of gene finding. The benefit of modelling sequence evolution is demonstrated both in a range of simulations and on a set of orthologous human/mouse gene pairs. AVAILABILITY: Free availability over the Internet...... annotation. The modelling of evolution by the existing comparative gene finders leaves room for improvement. Results: A probabilistic model of both genome structure and evolution is designed. This type of model is called an Evolutionary Hidden Markov Model (EHMM), being composed of an HMM and a set of region...

  6. Family Polymorphism

    DEFF Research Database (Denmark)

    Ernst, Erik

    2001-01-01

    This paper takes polymorphism to the multi-object level. Traditional inheritance, polymorphism, and late binding interact nicely to provide both flexibility and safety: when a method is invoked on an object via a polymorphic reference, late binding ensures that we get the appropriate...
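
    The traditional single-object mechanism that this abstract takes as its starting point (a method call through a polymorphic reference, resolved by late binding) can be illustrated with a short sketch. The classes `Shape`, `Square` and `Rectangle` are hypothetical names chosen for illustration; the sketch does not attempt to show family polymorphism itself, which the paper develops at the multi-object level.

```python
class Shape:
    def area(self):
        raise NotImplementedError

    def describe(self):
        # self.area() is looked up at run time on the object's actual class,
        # so each subclass supplies its own implementation (late binding).
        return f"{type(self).__name__} with area {self.area()}"


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


# The list elements are all treated as Shape, yet each call dispatches
# to the implementation belonging to the concrete class.
shapes = [Square(2), Rectangle(2, 3)]
descriptions = [shape.describe() for shape in shapes]
print(descriptions)
```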

  7. Structural and functional characterization of hBD-1(Ser35), a peptide deduced from a DEFB1 polymorphism.

    Science.gov (United States)

    Circo, Raffaella; Skerlavaj, Barbara; Gennaro, Renato; Amoroso, Antonio; Zanetti, Margherita

    2002-04-26

    beta-Defensins are mammalian antimicrobial peptides that share a unique disulfide-bonding motif of six conserved cysteines. An intragenic polymorphism of the DEFB1 gene that changes a highly conserved Cys to Ser in the peptide coding region has recently been described. The deduced peptide cannot form three disulfide bonds, as one of the cysteines is unpaired. We have determined the cysteine connectivities of a corresponding synthetic hBD-1(Ser35) peptide, investigated the structure by circular dichroism spectroscopy, and assayed the in vitro antimicrobial activity. Despite a different arrangement of the disulfides, hBD-1(Ser35) proved as active as hBD-1 against the microorganisms tested. This activity likely depends on the ability of hBD-1(Ser35) to adopt an amphipathic conformation in a hydrophobic environment, similar to the wild-type peptide, as suggested by CD spectroscopy.

  8. Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Canine Cohort, Canine Intelligence Assessment Regimen, Genome-Wide Single Nucleotide Polymorphism (SNP) Typing, and Unsupervised Classification Algorithm for Genome-Wide Association Data Analysis

    Science.gov (United States)

    2011-09-01


  9. Nucleotide polymorphism in the 5.8S nrDNA gene and internal transcribed spacers in Phakopsora pachyrhizi viewed from structural models.

    Science.gov (United States)

    Freire, Maíra Cristina Menezes; da Silva, Maria Roméria; Zhang, Xuecheng; Almeida, Álvaro Manuel Rodrigues; Stacey, Gary; de Oliveira, Luiz Orlando

    2012-02-01

    The assessment of nucleotide polymorphisms in environmental samples of obligate pathogens requires DNA amplification through the polymerase chain reaction (PCR) and bacterial cloning of PCR products prior to sequencing. The drawback of this strategy is that it can give rise to false polymorphisms owing to DNA polymerase misincorporation during PCR or bacterial cloning. We investigated patterns of nucleotide polymorphism in the internal transcribed spacer (ITS) region for Phakopsora pachyrhizi, an obligate biotrophic fungus that causes the Asian soybean rust. Field-collected samples of P. pachyrhizi were obtained from all major soybean production areas worldwide, including Brazil and the United States. Bacterially cloned PCR products were obtained using a high-fidelity DNA polymerase. A total of 370 ITS sequences were subjected to an array of complementary sequence analyses, which included analyses of secondary structure stability, the pattern of nucleotide polymorphisms, GC content, and the presence of conserved motifs. The sequences exhibited features of functional rRNAs. Overall, polymorphisms took place within less conserved motifs, such as loops and bulges; alternatively, they gave rise to non-canonical G-U pairs within conserved regions of double-stranded helices. We discuss the usefulness of structural analyses to filter out putative 'suspicious' bacterially cloned ITS sequences, thus keeping artificially induced sequence variation to a minimum. Copyright © 2011 Elsevier Inc. All rights reserved.

  10. Initial sequencing and comparative analysis of the mouse genome.

    Science.gov (United States)

    Waterston, Robert H; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R; Brown, Daniel G; Brown, Stephen D; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T; Church, Deanna M; Clamp, Michele; Clee, Christopher; Collins, Francis S; Cook, Lisa L; Copley, Richard R; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D; Deri, Justin; Dermitzakis, Emmanouil T; Dewey, Colin; Dickens, Nicholas J; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M; Eddy, Sean R; Elnitski, Laura; Emes, Richard D; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A; Flicek, Paul; Foley, Karen; Frankel, Wayne N; Fulton, Lucinda A; Fulton, Robert S; Furey, Terrence S; Gage, Diane; Gibbs, Richard A; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A; Green, Eric D; Gregory, Simon; Guigó, Roderic; Guyer, Mark; Hardison, Ross C; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B; Johnson, L Steven; Jones, Matthew; Jones, Thomas A; Joy, Ann; Kamal, Michael; Karlsson, Elinor K; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W James; Kirby, Andrew; Kolbe, Diana L; Korf, Ian; Kucherlapati, Raju S; Kulbokas, Edward J; Kulp, David; Landers, Tom; Leger, J P; Leonard, Steven; Letunic, Ivica; Levine, Rosie; Li, Jia; Li, Ming; Lloyd, Christine; Lucas, Susan; Ma, Bin; Maglott, Donna R; Mardis, Elaine R; Matthews, Lucy; Mauceli, Evan; Mayer, John H; McCarthy, Megan; McCombie, W Richard; McLaren, Stuart; McLay, Kirsten; McPherson, John D; Meldrim, Jim; Meredith, Beverley; Mesirov, Jill P; Miller, Webb; Miner, Tracie L; Mongin, Emmanuel; Montgomery, Kate T; Morgan, Michael; Mott, Richard; Mullikin, James C; Muzny, Donna M; Nash, William E; Nelson, Joanne O; Nhan, Michael N; Nicol, Robert; Ning, Zemin; Nusbaum, Chad; O'Connor, Michael J; Okazaki, Yasushi; Oliver, Karen; Overton-Larty, Emma; Pachter, Lior; Parra, Genís; Pepin, Kymberlie H; Peterson, Jane; Pevzner, Pavel; Plumb, Robert; Pohl, Craig S; Poliakov, Alex; Ponce, Tracy C; Ponting, Chris P; Potter, Simon; Quail, Michael; Reymond, Alexandre; Roe, Bruce A; Roskin, Krishna M; Rubin, Edward M; Rust, Alistair G; Santos, Ralph; Sapojnikov, Victor; Schultz, Brian; Schultz, Jörg; Schwartz, Matthias S; Schwartz, Scott; Scott, Carol; Seaman, Steven; Searle, Steve; Sharpe, Ted; Sheridan, Andrew; Shownkeen, Ratna; Sims, Sarah; Singer, Jonathan B; Slater, Guy; Smit, Arian; Smith, Douglas R; Spencer, Brian; Stabenau, Arne; Stange-Thomann, Nicole; Sugnet, Charles; Suyama, Mikita; Tesler, Glenn; Thompson, Johanna; Torrents, David; Trevaskis, Evanne; Tromp, John; Ucla, Catherine; Ureta-Vidal, Abel; Vinson, Jade P; Von Niederhausern, Andrew C; Wade, Claire M; Wall, Melanie; Weber, Ryan J; Weiss, Robert B; Wendl, Michael C; West, Anthony P; Wetterstrand, Kris; Wheeler, Raymond; Whelan, Simon; Wierzbowski, Jamey; Willey, David; Williams, Sophie; Wilson, Richard K; Winter, Eitan; Worley, Kim C; Wyman, Dudley; Yang, Shan; Yang, Shiaw-Pyng; Zdobnov, Evgeny M; Zody, Michael C; Lander, Eric S

    2002-12-05

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  11. Structural Polymorphism in "Kesterite" Cu2ZnSnS4: Raman Spectroscopy and First-Principles Calculations Analysis.

    Science.gov (United States)

    Dimitrievska, Mirjana; Boero, Federica; Litvinchuk, Alexander P; Delsante, Simona; Borzone, Gabriella; Perez-Rodriguez, Alejandro; Izquierdo-Roca, Victor

    2017-03-20

    This work presents a comprehensive analysis of the structural and vibrational properties of the kesterite Cu2ZnSnS4 (CZTS, I4̅ space group) as well as its polymorphs with the space groups P4̅2c and P4̅2m, from both experimental and theoretical points of view. Multiwavelength Raman scattering measurements performed on bulk CZTS polycrystalline samples were utilized to experimentally determine properties of the most intense Raman modes expected in these crystalline structures according to group theory analysis. The experimental results compare well with the vibrational frequencies that have been computed by first-principles calculations based on density functional theory. Vibrational patterns of the most intense fully symmetric modes corresponding to the P4̅2c structure were compared with the corresponding modes in the I4̅ CZTS structure. The results point to the need to look beyond the standard phases (kesterite and stannite) of CZTS while exploring and explaining the electronic and vibrational properties of these materials, as well as the possibility of using Raman spectroscopy as an effective technique for detecting the presence of different crystallographic modifications within the same material.

  12. Fosmid library end sequencing reveals a rarely known genome structure of marine shrimp Penaeus monodon

    Directory of Open Access Journals (Sweden)

    Chen Ming

    2011-05-01

    Full Text Available Abstract Background The black tiger shrimp (Penaeus monodon) is one of the most important aquaculture species in the world, representing the crustacean lineage which possesses the greatest species diversity among marine invertebrates. Yet, we barely know anything about their genomic structure. To understand the organization and evolution of the P. monodon genome, a fosmid library consisting of 288,000 colonies was constructed, equivalent to 5.3-fold coverage of the 2.17 Gb genome. Approximately 11.1 Mb of fosmid end sequences (FESs) from 20,926 non-redundant reads representing 0.45% of the P. monodon genome were obtained for repetitive and protein-coding sequence analyses. Results We found that microsatellite sequences were highly abundant in the P. monodon genome, comprising 8.3% of the total length. The density and the average length of microsatellites were evidently higher in comparison to those of other taxa. AT-rich microsatellite motifs, especially poly(AT) and poly(AAT), were the most abundant. A high abundance of microsatellite sequences was also found in the transcribed regions. Furthermore, via self-BlastN analysis we identified 103 novel repetitive element families which were categorized into four groups, i.e., 33 WSSV-like repeats, 14 retrotransposons, 5 gene-like repeats, and 51 unannotated repeats. Overall, various types of repeats comprise 51.18% of the P. monodon genome in length. Approximately 7.4% of the FESs contained protein-coding sequences, and the Inhibitor of Apoptosis Protein (IAP) gene and the Innexin 3 gene homologues appear to be present in high abundance in the P. monodon genome. Conclusions The redundancy of various repeat types in the P. monodon genome illustrates its highly repetitive nature. In particular, long and dense microsatellite sequences as well as abundant WSSV-like sequences highlight how the genome organization of penaeid shrimp differs from that of other taxa. These results provide substantial
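
    The coverage figure quoted in this abstract follows from the usual library relation, fold coverage = (number of clones × mean insert size) / genome size. The abstract does not state the insert size, so the sketch below rearranges the relation to show the value implied by the three reported numbers; the function name is hypothetical and the result is a back-calculation, not a figure from the paper.

```python
def implied_mean_insert_bp(n_clones, genome_bp, fold_coverage):
    """Rearranged coverage relation: insert = coverage * genome / clones."""
    return fold_coverage * genome_bp / n_clones


# Values reported in the abstract: 288,000 colonies, 2.17 Gb genome, 5.3-fold coverage.
insert_bp = implied_mean_insert_bp(
    n_clones=288_000, genome_bp=2.17e9, fold_coverage=5.3
)
print(f"Implied mean insert size: {insert_bp / 1000:.1f} kb")
```

    The implied mean insert size is close to 40 kb.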

  13. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    Science.gov (United States)

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years), led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA
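
    The GC3 statistic defined in this abstract (the fraction of cytosine and guanine at the third position of codons) is straightforward to compute for a coding sequence. The sketch below is a minimal illustration, assuming an in-frame sequence whose length is a multiple of three; the function name `gc3` and the example sequence are hypothetical, and only the 0.75286 threshold is taken from the abstract.

```python
GC3_RICH_THRESHOLD = 0.75286  # cut-off for "GC3-rich" quoted in the abstract


def gc3(cds):
    """Fraction of G or C bases at the third position of each codon."""
    seq = cds.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("expected a non-empty, in-frame coding sequence")
    third_positions = seq[2::3]  # every third base, starting at index 2
    return sum(base in "GC" for base in third_positions) / len(third_positions)


# Hypothetical four-codon sequence: ATG GCC AAG TTA (third bases G, C, G, A).
example_cds = "ATGGCCAAGTTA"
value = gc3(example_cds)
print(value, value >= GC3_RICH_THRESHOLD)
```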

  14. Complete Chloroplast Genomes of Papaver rhoeas and Papaver orientale: Molecular Structures, Comparative Analysis, and Phylogenetic Analysis

    Directory of Open Access Journals (Sweden)

    Jianguo Zhou

    2018-02-01

    Full Text Available Papaver rhoeas L. and P. orientale L., which belong to the family Papaveraceae, are used as ornamental and medicinal plants. The chloroplast genome has been used for molecular markers, evolutionary biology, and barcoding identification. In this study, the complete chloroplast genome sequences of P. rhoeas and P. orientale are reported. Results show that the complete chloroplast genomes of P. rhoeas and P. orientale have typical quadripartite structures, which are comprised of circular 152,905 and 152,799-bp-long molecules, respectively. A total of 130 genes were identified in each genome, including 85 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Sequence divergence analysis of four species from Papaveraceae indicated that the most divergent regions are found in the non-coding spacers with minimal differences among three Papaver species. These differences include the ycf1 gene and intergenic regions, such as rpoB-trnC, trnD-trnT, petA-psbJ, psbE-petL, and ccsA-ndhD. These regions are hypervariable regions, which can be used as specific DNA barcodes. This finding suggested that the chloroplast genome could be used as a powerful tool to resolve the phylogenetic positions and relationships of Papaveraceae. These results offer valuable information for future research in the identification of Papaver species and will benefit further investigations of these species.

  15. De novo prediction of human chromosome structures: Epigenetic marking patterns encode genome architecture

    Science.gov (United States)

    Di Pierro, Michele; Cheng, Ryan R.; Lieberman Aiden, Erez; Wolynes, Peter G.; Onuchic, José N.

    2017-01-01

    Inside the cell nucleus, genomes fold into organized structures that are characteristic of cell type. Here, we show that this chromatin architecture can be predicted de novo using epigenetic data derived from chromatin immunoprecipitation-sequencing (ChIP-Seq). We exploit the idea that chromosomes encode a 1D sequence of chromatin structural types. Interactions between these chromatin types determine the 3D structural ensemble of chromosomes through a process similar to phase separation. First, a neural network is used to infer the relation between the epigenetic marks present at a locus, as assayed by ChIP-Seq, and the genomic compartment in which those loci reside, as measured by DNA-DNA proximity ligation (Hi-C). Next, types inferred from this neural network are used as an input to an energy landscape model for chromatin organization [Minimal Chromatin Model (MiChroM)] to generate an ensemble of 3D chromosome conformations at a resolution of 50 kilobases (kb). After training the model, dubbed Maximum Entropy Genomic Annotation from Biomarkers Associated to Structural Ensembles (MEGABASE), on odd-numbered chromosomes, we predict the sequences of chromatin types and the subsequent 3D conformational ensembles for the even chromosomes. We validate these structural ensembles by using ChIP-Seq tracks alone to predict Hi-C maps, as well as distances measured using 3D fluorescence in situ hybridization (FISH) experiments. Both sets of experiments support the hypothesis of phase separation being the driving process behind compartmentalization. These findings strongly suggest that epigenetic marking patterns encode sufficient information to determine the global architecture of chromosomes and that de novo structure prediction for whole genomes may be increasingly possible. PMID:29087948

  16. De novo prediction of human chromosome structures: Epigenetic marking patterns encode genome architecture.

    Science.gov (United States)

    Di Pierro, Michele; Cheng, Ryan R; Lieberman Aiden, Erez; Wolynes, Peter G; Onuchic, José N

    2017-11-14

    Inside the cell nucleus, genomes fold into organized structures that are characteristic of cell type. Here, we show that this chromatin architecture can be predicted de novo using epigenetic data derived from chromatin immunoprecipitation-sequencing (ChIP-Seq). We exploit the idea that chromosomes encode a 1D sequence of chromatin structural types. Interactions between these chromatin types determine the 3D structural ensemble of chromosomes through a process similar to phase separation. First, a neural network is used to infer the relation between the epigenetic marks present at a locus, as assayed by ChIP-Seq, and the genomic compartment in which those loci reside, as measured by DNA-DNA proximity ligation (Hi-C). Next, types inferred from this neural network are used as an input to an energy landscape model for chromatin organization [Minimal Chromatin Model (MiChroM)] to generate an ensemble of 3D chromosome conformations at a resolution of 50 kilobases (kb). After training the model, dubbed Maximum Entropy Genomic Annotation from Biomarkers Associated to Structural Ensembles (MEGABASE), on odd-numbered chromosomes, we predict the sequences of chromatin types and the subsequent 3D conformational ensembles for the even chromosomes. We validate these structural ensembles by using ChIP-Seq tracks alone to predict Hi-C maps, as well as distances measured using 3D fluorescence in situ hybridization (FISH) experiments. Both sets of experiments support the hypothesis of phase separation being the driving process behind compartmentalization. These findings strongly suggest that epigenetic marking patterns encode sufficient information to determine the global architecture of chromosomes and that de novo structure prediction for whole genomes may be increasingly possible. Copyright © 2017 the Author(s). Published by PNAS.

  17. Complete plastid genomes from Ophioglossum californicum, Psilotum nudum, and Equisetum hyemale reveal an ancestral land plant genome structure and resolve the position of Equisetales among monilophytes

    Directory of Open Access Journals (Sweden)

    Grewe Felix

    2013-01-01

    Full Text Available Abstract Background Plastid genome structure and content is remarkably conserved in land plants. This widespread conservation has facilitated taxon-rich phylogenetic analyses that have resolved organismal relationships among many land plant groups. However, the relationships among major fern lineages, especially the placement of Equisetales, remain enigmatic. Results In order to understand the evolution of plastid genomes and to establish phylogenetic relationships among ferns, we sequenced the plastid genomes from three early diverging species: Equisetum hyemale (Equisetales), Ophioglossum californicum (Ophioglossales), and Psilotum nudum (Psilotales). A comparison of fern plastid genomes showed that some lineages have retained inverted repeat (IR) boundaries originating from the common ancestor of land plants, while other lineages have experienced multiple IR changes including expansions and inversions. Genome content has remained stable throughout ferns, except for a few lineage-specific losses of genes and introns. Notably, the losses of the rps16 gene and the rps12i346 intron are shared among Psilotales, Ophioglossales, and Equisetales, while the gain of a mitochondrial atp1 intron is shared between Marattiales and Polypodiopsida. These genomic structural changes support the placement of Equisetales as sister to Ophioglossales + Psilotales and Marattiales as sister to Polypodiopsida. This result is augmented by some molecular phylogenetic analyses that recover the same relationships, whereas others suggest a relationship between Equisetales and Polypodiopsida. Conclusions Although molecular analyses were inconsistent with respect to the position of Marattiales and Equisetales, several genomic structural changes have for the first time provided a clear placement of these lineages within the ferns. These results further demonstrate the power of using rare genomic structural changes in cases where molecular data fail to provide strong phylogenetic

  18. Structural polymorphism in the promoter of pfmrp2 confers Plasmodium falciparum tolerance to quinoline drugs

    Science.gov (United States)

    Mok, Sachel; Liong, Kek-Yee; Lim, Eng-How; Huang, Ximei; Zhu, Lei; Preiser, Peter Rainer; Bozdech, Zbynek

    2014-01-01

    Drug resistance in Plasmodium falciparum remains a challenge for the malaria eradication programmes around the world. With the emergence of artemisinin resistance, the efficacy of the partner drugs in the artemisinin combination therapies (ACT) that include quinoline-based drugs is becoming critical. So far only a few resistance markers have been identified, of which only two transmembrane transporters, namely PfMDR1 (an ATP-binding cassette transporter) and PfCRT (a drug-metabolite transporter), have been experimentally verified. Another P. falciparum transporter, the ATP-binding cassette containing multidrug resistance-associated protein (PfMRP2), represents an additional possible factor of drug resistance in P. falciparum. In this study, we identified a parasite clone that is derived from the 3D7 P. falciparum strain and shows increased resistance to chloroquine, mefloquine and quinine through the trophozoite and schizont stages. We demonstrate that the resistance phenotype is caused by a 4.1 kb deletion in the 5′ upstream region of the pfmrp2 gene that leads to an alteration in the pfmrp2 transcription and thus an increased level of PfMRP2 protein. These results also suggest the importance of putative promoter elements in regulation of gene expression during the P. falciparum intra-erythrocytic developmental cycle and the potential of genetic polymorphisms within these regions to underlie drug resistance. PMID:24372851

  19. Binuclear Copper(I) Borohydride Complex Containing Bridging Bis(diphenylphosphino)methane Ligands: Polymorphic Structures of [(µ2-dppm)2Cu2(η2-BH4)2] Dichloromethane Solvate

    Directory of Open Access Journals (Sweden)

    Natalia V. Belkova

    2017-10-01

    Full Text Available Bis(diphenylphosphino)methane copper(I) tetrahydroborate was synthesized by ligand exchange in bis(triphenylphosphine) copper(I) tetrahydroborate, and characterized by XRD, FTIR, and NMR spectroscopy. According to XRD, the title compound has a dimeric structure, [(μ2-dppm)2Cu2(η2-BH4)2], and crystallizes as a CH2Cl2 solvate in two polymorphic forms (orthorhombic, 1, and monoclinic, 2). The details of molecular geometry and the crystal-packing pattern in the polymorphs were studied. The rare Twisted Boat-Boat conformation of the core Cu2P4C2 cycle in 1 is found to be more stable than the Boat-Boat conformation in 2.

  20. Insight into the strong aggregation-induced emission of low-conjugated racemic C6-unsubstituted tetrahydropyrimidines through crystal-structure-property relationship of polymorphs.

    Science.gov (United States)

    Zhu, Qiuhua; Zhang, Yilin; Nie, Han; Zhao, Zujin; Liu, Shuwen; Wong, Kam Sing; Tang, Ben Zhong

    2015-08-01

    Racemic C6-unsubstituted tetrahydropyrimidines (THPs) are a series of fluorophores with a strong aggregation-induced emission (AIE) effect. However, they do not possess the structural features of conventional AIE compounds. In order to understand their AIE mechanism, here, the influences of the molecular packing mode and the conformation on the optical properties of THPs were investigated using seven crystalline polymorphs of three THPs (1-3). The racemic THPs 1-3 have low-conjugated and highly flexible molecular structures, and hence show practically no emission in different organic solvents. However, the fluorescence quantum yields of their polymorphs are up to 93%, and the maximum excitation (λex) and emission (λem) wavelengths of the polymorphs are long at 409 and 484 nm, respectively. Single-crystal structures and theoretical calculation of the HOMOs and LUMOs based on the molecular conformations of these polymorphs indicate that the polymorphs with the shortest λex and λem values possess a RS-packing mode (R- and S-enantiomers self-assemble as paired anti-parallel lines) and a more twisted conformation without through-space conjugation between the dicarboxylates, but the polymorphs with longer λex and λem values adopt a RR/SS-packing mode (R- and S-enantiomers self-assemble as unpaired zigzag lines) and a less twisted conformation with through-space conjugation between the dicarboxylates. The molecular conformations of 1-3 in all these polymorphs are stereo and more twisted than those in solution. Although 1-3 are poorly conjugated, the radiative rate constants (kr) of their polymorphs are as large as conventional fluorophores (0.41-1.03 × 10^8 s^-1) because of improved electronic conjugation by both through-bond and through-space interactions. Based on the obtained results, it can be deduced that the strong AIE arises not only from the restriction of intramolecular motion but also from enhanced electronic coupling and

  1. Elucidating the role of transcription in shaping the 3D structure of the bacterial genome

    Science.gov (United States)

    Brandao, Hugo B.; Wang, Xindan; Rudner, David Z.; Mirny, Leonid

    Active transcription has been linked to several genome conformation changes in bacteria, including the recruitment of chromosomal DNA to the cell membrane and formation of nucleoid clusters. Using genomic and imaging data as input into mathematical models and polymer simulations, we sought to explore the extent to which bacterial 3D genome structure could be explained by 1D transcription tracks. Using B. subtilis as a model organism, we investigated via polymer simulations the role of loop extrusion and DNA super-coiling on the formation of interaction domains and other fine-scale features that are visible in chromosome conformation capture (Hi-C) data. We then explored the role of the condensin structural maintenance of chromosome complex on the alignment of chromosomal arms. A parameter-free transcription traffic model demonstrated that mean chromosomal arm alignment can be quantitatively explained, and the effects on arm alignment in genomically rearranged strains of B. subtilis were accurately predicted. H.B. acknowledges support from the Natural Sciences and Engineering Research Council of Canada for a PGS-D fellowship.

  2. Genetic Diversity and Population Structure of Toona Ciliata Roem. Based on Sequence-Related Amplified Polymorphism (SRAP) Markers

    OpenAIRE

    Li, Pei; Zhan, Xin; Que, Qingmin; Qu, Wenting; Liu, Mingqian; Ouyang, Kunxi; Li, Juncheng; Deng, Xiaomei; Zhang, Junjie; Liao, Boyong; Pian, Ruiqi; Chen, Xiaoyang

    2015-01-01

    Sequence-related amplified polymorphism (SRAP) markers were used to investigate the genetic diversity among 30 populations of Toona ciliata Roem. sampled from the species’ distribution area in China. To analyze the polymorphism in the SRAP profiles, 1505 primer pairs were screened and 24 selected. A total of 656 SRAP bands ranging from 100 to 1500 bp were acquired; of these, 505 bands (77%) were polymorphic. The polymorphism information content (PIC) values ranged from 0.32 to 0.45, with an av...

  3. Biophysical characterization of recombinant proteins: A key to higher structural genomics success

    Science.gov (United States)

    Vedadi, Masoud; Arrowsmith, Cheryl H.; Allali-Hassani, Abdellah; Senisterra, Guillermo; Wasney, Gregory A.

    2010-01-01

    Hundreds of genomes have been successfully sequenced to date, and the data are publicly available. At the same time, the advances in large-scale expression and purification of recombinant proteins have paved the way for structural genomics efforts. Frequently, however, little is known about newly expressed proteins calling for large-scale protein characterization to better understand their biochemical roles and to enable structure–function relationship studies. In the Structural Genomics Consortium (SGC), we have established a platform to characterize large numbers of purified proteins. This includes screening for ligands, enzyme assays, peptide arrays and peptide displacement in a 384-well format. In this review, we describe this platform in more detail and report on how our approach significantly increases the success rate for structure determination. Coupled with high-resolution X-ray crystallography and structure-guided methods, this platform can also be used toward the development of chemical probes through screening families of proteins against a variety of chemical series and focused chemical libraries. PMID:20466062

  4. Structural features based genome-wide characterization and prediction of nucleosome organization

    Directory of Open Access Journals (Sweden)

    Gan Yanglan

    2012-03-01

    Full Text Available Abstract Background Nucleosome distribution along chromatin dictates genomic DNA accessibility and thus profoundly influences gene expression. However, the underlying mechanism of nucleosome formation remains elusive. Here, taking a structural perspective, we systematically explored nucleosome formation potential of genomic sequences and the effect on chromatin organization and gene expression in S. cerevisiae. Results We analyzed twelve structural features related to flexibility, curvature and energy of DNA sequences. The results showed that some structural features such as DNA denaturation, DNA-bending stiffness, Stacking energy, Z-DNA, Propeller twist and free energy, were highly correlated with in vitro and in vivo nucleosome occupancy. Specifically, they can be classified into two classes, one positively and the other negatively correlated with nucleosome occupancy. These two kinds of structural features facilitated nucleosome binding in centromere regions and repressed nucleosome formation in the promoter regions of protein-coding genes to mediate transcriptional regulation. Based on these analyses, we integrated all twelve structural features in a model to predict more accurately nucleosome occupancy in vivo than the existing methods that mainly depend on sequence compositional features. Furthermore, we developed a novel approach, named DLaNe, that located nucleosomes by detecting peaks of structural profiles, and built a meta predictor to integrate information from different structural features. As a comparison, we also constructed a hidden Markov model (HMM) to locate nucleosomes based on the profiles of these structural features. The result showed that the meta DLaNe and HMM-based method performed better than the existing methods, demonstrating the power of these structural features in predicting nucleosome positions. Conclusions Our analysis revealed that DNA structures significantly contribute to nucleosome organization and influence

  5. Nonclinical and Clinical Enterococcus faecium Strains, but Not Enterococcus faecalis Strains, Have Distinct Structural and Functional Genomic Features

    Science.gov (United States)

    Kim, Eun Bae

    2014-01-01

    Certain strains of Enterococcus faecium and Enterococcus faecalis contribute beneficially to animal health and food production, while others are associated with nosocomial infections. To determine whether there are structural and functional genomic features that are distinct between nonclinical (NC) and clinical (CL) strains of those species, we analyzed the genomes of 31 E. faecium and 38 E. faecalis strains. Hierarchical clustering of 7,017 orthologs found in the E. faecium pangenome revealed that NC strains clustered into two clades and are distinct from CL strains. NC E. faecium genomes are significantly smaller than CL genomes, and this difference was partly explained by significantly fewer mobile genetic elements (ME), virulence factors (VF), and antibiotic resistance (AR) genes. E. faecium ortholog comparisons identified 68 and 153 genes that are enriched for NC and CL strains, respectively. Proximity analysis showed that CL-enriched loci, and not NC-enriched loci, are more frequently colocalized on the genome with ME. In CL genomes, AR genes are also colocalized with ME, and VF are more frequently associated with CL-enriched loci. Genes in 23 functional groups are also differentially enriched between NC and CL E. faecium genomes. In contrast, differences were not observed between NC and CL E. faecalis genomes despite their having larger genomes than E. faecium. Our findings show that unlike E. faecalis, NC and CL E. faecium strains are equipped with distinct structural and functional genomic features indicative of adaptation to different environments. PMID:24141120

  6. Revision of the Crystal Structure of the First Molecular Polymorph in History

    DEFF Research Database (Denmark)

    Johansson, Kristoffer E.; Van De Streek, Jacco

    2016-01-01

    computational crystal structure prediction (CSP) method based on dispersion-corrected density functional theory, we correctly predict the stable form I with the lowest energy among all sampled structures and its polytypic form III with slightly higher energy. From Rietveld refinement of selected CSP models...

  7. Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes.

    Science.gov (United States)

    Dolle, Dirk D; Liu, Zhicheng; Cotten, Matthew; Simpson, Jared T; Iqbal, Zamin; Durbin, Richard; McCarthy, Shane A; Keane, Thomas M

    2017-02-01

    We are rapidly approaching the point where we have sequenced millions of human genomes. There is a pressing need for new data structures to store raw sequencing data and efficient algorithms for population scale analysis. Current reference-based data formats do not fully exploit the redundancy in population sequencing nor take advantage of shared genetic variation. In recent years, the Burrows-Wheeler transform (BWT) and FM-index have been widely employed as a full-text searchable index for read alignment and de novo assembly. We introduce the concept of a population BWT and use it to store and index the sequencing reads of 2705 samples from the 1000 Genomes Project. A key feature is that, as more genomes are added, identical read sequences are increasingly observed, and compression becomes more efficient. We assess the support in the 1000 Genomes read data for every base position of two human reference assembly versions, identifying that 3.2 Mbp with population support was lost in the transition from GRCh37 with 13.7 Mbp added to GRCh38. We show that the vast majority of variant alleles can be uniquely described by overlapping 31-mers and show how rapid and accurate SNP and indel genotyping can be carried out across the genomes in the population BWT. We use the population BWT to carry out nonreference queries to search for the presence of all known viral genomes and discover human T-lymphotropic virus 1 integrations in six samples in a recognized epidemiological distribution. © 2017 Dolle et al.; Published by Cold Spring Harbor Laboratory Press.
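
    As a rough illustration of the Burrows-Wheeler transform mentioned in this abstract, the sketch below builds the transform of a single string by sorting all rotations, and inverts it again. This is a textbook construction, not the population BWT implementation described by the authors; the function names and the `$` sentinel are assumptions, and the quadratic approach is only practical for short inputs.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via sorted rotations (illustrative only)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Recover the original text by repeatedly prepending and re-sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The row that ends with the sentinel is the original text plus sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("GATTACA")
    print(transformed)
    print(inverse_bwt(transformed))
```

    The transform tends to group identical characters into runs, which is why repeated read sequences compress better as more genomes are added.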

  8. Effects of aneuploidy on genome structure, expression, and interphase organization in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    Bruno Huettel

    2008-10-01

    Full Text Available Aneuploidy refers to losses and/or gains of individual chromosomes from the normal chromosome set. The resulting gene dosage imbalance has a noticeable effect on the phenotype, as illustrated by aneuploid syndromes, including Down syndrome in humans, and by human solid tumor cells, which are highly aneuploid. Although the phenotypic manifestations of aneuploidy are usually apparent, information about the underlying alterations in structure, expression, and interphase organization of unbalanced chromosome sets is still sparse. Plants generally tolerate aneuploidy better than animals, and, through colchicine treatment and breeding strategies, it is possible to obtain inbred sibling plants with different numbers of chromosomes. This possibility, combined with the genetic and genomics tools available for Arabidopsis thaliana, provides a powerful means to assess systematically the molecular and cytological consequences of aberrant numbers of specific chromosomes. Here, we report on the generation of Arabidopsis plants in which chromosome 5 is present in triplicate. We compare the global transcript profiles of normal diploids and chromosome 5 trisomics, and assess genome integrity using array comparative genome hybridization. We use live cell imaging to determine the interphase 3D arrangement of transgene-encoded fluorescent tags on chromosome 5 in trisomic and triploid plants. The results indicate that trisomy 5 disrupts gene expression throughout the genome and supports the production and/or retention of truncated copies of chromosome 5. Although trisomy 5 does not grossly distort the interphase arrangement of fluorescent-tagged sites on chromosome 5, it may somewhat enhance associations between transgene alleles. Our analysis reveals the complex genomic changes that can occur in aneuploids and underscores the importance of using multiple experimental approaches to investigate how chromosome numerical changes condition abnormal phenotypes and

  9. Density functional simulations of structure and polymorphism in Ga/Sb films.

    Science.gov (United States)

    Kalikka, J; Akola, J; Jones, R O

    2013-03-20

    Thin films of gallium/antimony alloys are promising candidates for phase change memories requiring rapid crystallization at high crystallization temperatures. Prominent examples are the stoichiometric form GaSb and alloys near the eutectic composition GaSb(7), but little is known about their amorphous structures or the differences between the 'as-deposited' (AD) and 'melt-quenched' (MQ) forms. We have generated these structures using 528-atom density functional/molecular dynamics simulations, and we have studied in detail and compared structural parameters (pair distribution functions, structure factors, coordination numbers, bond and ring size distributions) and electronic properties (densities of states, bond orders) for all structures. There is good agreement with x-ray diffraction data from deposited films of GaSb, and there is evidence for Sb segregation in GaSb(7).

  10. High-throughput SHAPE analysis reveals structures in HIV-1 genomic RNA strongly conserved across distinct biological states.

    Directory of Open Access Journals (Sweden)

    Kevin A Wilkinson

    2008-04-01

    Full Text Available Replication and pathogenesis of the human immunodeficiency virus (HIV) is tightly linked to the structure of its RNA genome, but genome structure in infectious virions is poorly understood. We invent high-throughput SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) technology, which uses many of the same tools as DNA sequencing, to quantify RNA backbone flexibility at single-nucleotide resolution and from which robust structural information can be immediately derived. We analyze the structure of HIV-1 genomic RNA in four biologically instructive states, including the authentic viral genome inside native particles. Remarkably, given the large number of plausible local structures, the first 10% of the HIV-1 genome exists in a single, predominant conformation in all four states. We also discover that noncoding regions functioning in a regulatory role have significantly lower (p-value < 0.0001) SHAPE reactivities, and hence more structure, than do viral coding regions that function as the template for protein synthesis. By directly monitoring protein binding inside virions, we identify the RNA recognition motif for the viral nucleocapsid protein. Seven structurally homologous binding sites occur in a well-defined domain in the genome, consistent with a role in directing specific packaging of genomic RNA into nascent virions. In addition, we identify two distinct motifs that are targets for the duplex destabilizing activity of this same protein. The nucleocapsid protein destabilizes local HIV-1 RNA structure in ways likely to facilitate initial movement both of the retroviral reverse transcriptase from its tRNA primer and of the ribosome in coding regions. Each of the three nucleocapsid interaction motifs falls in a specific genome domain, indicating that local protein interactions can be organized by the long-range architecture of an RNA. High-throughput SHAPE reveals a comprehensive view of HIV-1 RNA genome structure, and further

  11. Whole genome comparison between table and wine grapes reveals a comprehensive catalog of structural variants.

    Science.gov (United States)

    Di Genova, Alex; Almeida, Andrea Miyasaka; Muñoz-Espinoza, Claudia; Vizoso, Paula; Travisany, Dante; Moraga, Carol; Pinto, Manuel; Hinrichsen, Patricio; Orellana, Ariel; Maass, Alejandro

    2014-01-07

    Grapevine (Vitis vinifera L.) is the most important Mediterranean fruit crop, used to produce both wine and spirits as well as table grape and raisins. Wine and table grape cultivars represent two divergent germplasm pools with different origins and domestication history, as well as differential characteristics for berry size, cluster architecture and berry chemical profile, among others. 'Sultanina' plays a pivotal role in modern table grape breeding providing the main source of seedlessness. This cultivar is also one of the most planted for fresh consumption and raisins production. Given its importance, we sequenced it and implemented a novel strategy for the de novo assembly of its highly heterozygous genome. Our approach produced a draft genome of 466 Mb, recovering 82% of the genes present in the grapevine reference genome; in addition, we identified 240 novel genes. A large number of structural variants and SNPs were identified. Among them, 45 (21 SNPs and 24 INDELs) were experimentally confirmed in 'Sultanina' and six SNPs in other 23 table grape varieties. Transposable elements corresponded to ca. 80% of the repetitive sequences involved in structural variants and more than 2,000 genes were affected in their structure by these variants. Some of these genes are likely involved in embryo development, suggesting that they may contribute to seedlessness, a key trait for table grapes. This work produced the first structural variants and SNPs catalog for grapevine, constituting a novel and very powerful tool for genomic studies in this key fruit crop, particularly useful to support marker assisted breeding in table grapes.

  12. Universal Internucleotide Statistics in Full Genomes: A Footprint of the DNA Structure and Packaging?

    OpenAIRE

    Bogachev, Mikhail I.; Kayumov, Airat R.; Bunde, Armin

    2014-01-01

    Uncovering the fundamental laws that govern the complex DNA structural organization remains challenging and is largely based upon reconstructions from the primary nucleotide sequences. Here we investigate the distributions of the internucleotide intervals and their persistence properties in complete genomes of various organisms from Archaea and Bacteria to H. Sapiens aiming to reveal the manifestation of the universal DNA architecture. We find that in all considered organisms the internucleot...
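
    To make the quantity studied in this abstract concrete, the sketch below collects internucleotide intervals from a sequence. It assumes that an interval means the distance between consecutive occurrences of the same nucleotide, which is one plausible reading of the truncated abstract; the function names are hypothetical.

```python
from collections import Counter
from typing import List


def internucleotide_intervals(sequence: str, base: str) -> List[int]:
    """Distances between consecutive occurrences of `base` in `sequence`."""
    target = base.upper()
    positions = [i for i, b in enumerate(sequence.upper()) if b == target]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


def interval_histogram(sequence: str, base: str) -> Counter:
    """Empirical distribution of interval lengths for one nucleotide."""
    return Counter(internucleotide_intervals(sequence, base))


if __name__ == "__main__":
    seq = "ACGTAACCA"
    print(internucleotide_intervals(seq, "A"))
    print(interval_histogram(seq, "C"))
```

    For a whole genome, the same histogram would be computed per chromosome and per nucleotide before comparing distributions across organisms.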

  13. Genome-wide single-nucleotide polymorphism data reveal cryptic species within cryptic freshwater snail species: The case of the Ancylus fluviatilis species complex.

    Science.gov (United States)

    Weiss, Martina; Weigand, Hannah; Weigand, Alexander M; Leese, Florian

    2018-01-01

    DNA barcoding utilizes short standardized DNA sequences to identify species and is increasingly used in biodiversity assessments. The technique has unveiled an unforeseeably high number of morphologically cryptic species. However, if speciation has occurred relatively recently and rapidly, the use of single gene markers, and especially the exclusive use of mitochondrial markers, will presumably fail in delimitating species. Therefore, the true number of biological species might be even higher. One mechanism that can result in rapid speciation is hybridization of different species in combination with polyploidization, that is, allopolyploid speciation. In this study, we analyzed the population genetic structure of the polyploid freshwater snail Ancylus fluviatilis, for which allopolyploidization was postulated as a speciation mechanism. DNA barcoding has already revealed four cryptic species within A. fluviatilis (i.e., A. fluviatilis s. str., Ancylus sp. A-C), but early allozyme data even hint at the presence of additional cryptic lineages in Central Europe. We combined COI sequencing with high-resolution genome-wide SNP data (ddRAD data) to analyze the genetic structure of A. fluviatilis populations in a Central German low mountain range (Sauerland). The ddRAD data results indicate the presence of three cryptic species within A. fluviatilis s. str. occurring in sympatry and even syntopy, whereas mitochondrial sequence data only support the existence of one species, with shared haplotypes between species. Our study hence points to the limitations of DNA barcoding when dealing with organismal groups where speciation is assumed to have occurred rapidly, for example, through the process of allopolyploidization. We therefore emphasize that single marker DNA barcoding can underestimate the true species diversity and argue in strong favor of using genome-wide data for species delimitation in such groups.

  14. Combining functional and structural genomics to sample the essential Burkholderia structome.

    Directory of Open Access Journals (Sweden)

    Loren Baugh

    Full Text Available The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an "ortholog rescue" strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against

  15. Combining functional and structural genomics to sample the essential Burkholderia structome.

    Science.gov (United States)

    Baugh, Loren; Gallagher, Larry A; Patrapuvich, Rapatbhorn; Clifton, Matthew C; Gardberg, Anna S; Edwards, Thomas E; Armour, Brianna; Begley, Darren W; Dieterich, Shellie H; Dranow, David M; Abendroth, Jan; Fairman, James W; Fox, David; Staker, Bart L; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W; Stacy, Robin; Myler, Peter J; Stewart, Lance J; Manoil, Colin; Van Voorhis, Wesley C

    2013-01-01

    The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an "ortholog rescue" strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against infections and diseases

  16. A structural model of the genome packaging process in a membrane-containing double stranded DNA virus.

    Directory of Open Access Journals (Sweden)

    Chuan Hong

    2014-12-01

    Full Text Available Two crucial steps in the virus life cycle are genome encapsidation to form an infective virion and genome exit to infect the next host cell. In most icosahedral double-stranded (ds) DNA viruses, the viral genome enters and exits the capsid through a unique vertex. Internal membrane-containing viruses possess additional complexity as the genome must be translocated through the viral membrane bilayer. Here, we report the structure of the genome packaging complex with a membrane conduit essential for viral genome encapsidation in the tailless icosahedral membrane-containing bacteriophage PRD1. We utilize single particle electron cryo-microscopy (cryo-EM) and symmetry-free image reconstruction to determine structures of PRD1 virion, procapsid, and packaging deficient mutant particles. At the unique vertex of PRD1, the packaging complex replaces the regular 5-fold structure and crosses the lipid bilayer. These structures reveal that the packaging ATPase P9 and the packaging efficiency factor P6 form a dodecameric portal complex external to the membrane moiety, surrounded by ten major capsid protein P3 trimers. The viral transmembrane density at the special vertex is assigned to be a hexamer of heterodimer of proteins P20 and P22. The hexamer functions as a membrane conduit for the DNA and as a nucleating site for the unique vertex assembly. Our structures show a conformational alteration in the lipid membrane after the P9 and P6 are recruited to the virion. The P8-genome complex is then packaged into the procapsid through the unique vertex while the genome terminal protein P8 functions as a valve that closes the channel once the genome is inside. Comparing mature virion, procapsid, and mutant particle structures led us to propose an assembly pathway for the genome packaging apparatus in the PRD1 virion.

  17. A structural model of the genome packaging process in a membrane-containing double stranded DNA virus.

    Science.gov (United States)

    Hong, Chuan; Oksanen, Hanna M; Liu, Xiangan; Jakana, Joanita; Bamford, Dennis H; Chiu, Wah

    2014-12-01

    Two crucial steps in the virus life cycle are genome encapsidation to form an infective virion and genome exit to infect the next host cell. In most icosahedral double-stranded (ds) DNA viruses, the viral genome enters and exits the capsid through a unique vertex. Internal membrane-containing viruses possess additional complexity as the genome must be translocated through the viral membrane bilayer. Here, we report the structure of the genome packaging complex with a membrane conduit essential for viral genome encapsidation in the tailless icosahedral membrane-containing bacteriophage PRD1. We utilize single particle electron cryo-microscopy (cryo-EM) and symmetry-free image reconstruction to determine structures of PRD1 virion, procapsid, and packaging deficient mutant particles. At the unique vertex of PRD1, the packaging complex replaces the regular 5-fold structure and crosses the lipid bilayer. These structures reveal that the packaging ATPase P9 and the packaging efficiency factor P6 form a dodecameric portal complex external to the membrane moiety, surrounded by ten major capsid protein P3 trimers. The viral transmembrane density at the special vertex is assigned to be a hexamer of heterodimer of proteins P20 and P22. The hexamer functions as a membrane conduit for the DNA and as a nucleating site for the unique vertex assembly. Our structures show a conformational alteration in the lipid membrane after the P9 and P6 are recruited to the virion. The P8-genome complex is then packaged into the procapsid through the unique vertex while the genome terminal protein P8 functions as a valve that closes the channel once the genome is inside. Comparing mature virion, procapsid, and mutant particle structures led us to propose an assembly pathway for the genome packaging apparatus in the PRD1 virion.

  18. Genomic Structure of an Economically Important Cyanobacterium, Arthrospira (Spirulina) platensis NIES-39

    Science.gov (United States)

    Fujisawa, Takatomo; Narikawa, Rei; Okamoto, Shinobu; Ehira, Shigeki; Yoshimura, Hidehisa; Suzuki, Iwane; Masuda, Tatsuru; Mochimaru, Mari; Takaichi, Shinichi; Awai, Koichiro; Sekine, Mitsuo; Horikawa, Hiroshi; Yashiro, Isao; Omata, Seiha; Takarada, Hiromi; Katano, Yoko; Kosugi, Hiroki; Tanikawa, Satoshi; Ohmori, Kazuko; Sato, Naoki; Ikeuchi, Masahiko; Fujita, Nobuyuki; Ohmori, Masayuki

    2010-01-01

    A filamentous non-N2-fixing cyanobacterium, Arthrospira (Spirulina) platensis, is an important organism for industrial applications and as a food supply. Almost the complete genome of A. platensis NIES-39 was determined in this study. The genome structure of A. platensis is estimated to be a single, circular chromosome of 6.8 Mb, based on optical mapping. Annotation of this 6.7 Mb sequence yielded 6630 protein-coding genes as well as two sets of rRNA genes and 40 tRNA genes. Of the protein-coding genes, 78% are similar to those of other organisms; the remaining 22% are currently unknown. A total of 612 kb of the genome comprises group II introns, insertion sequences and some repetitive elements. Group I introns are located in a protein-coding region. Abundant restriction-modification systems were determined. Unique features in the gene composition were noted, particularly in a large number of genes for adenylate cyclase and haemolysin-like Ca2+-binding proteins and in chemotaxis proteins. Filament-specific genes were highlighted by comparative genomic analysis. PMID:20203057

  19. Single-cell paired-end genome sequencing reveals structural variation per cell cycle

    Science.gov (United States)

    Voet, Thierry; Kumar, Parveen; Van Loo, Peter; Cooke, Susanna L.; Marshall, John; Lin, Meng-Lay; Zamani Esteki, Masoud; Van der Aa, Niels; Mateiu, Ligia; McBride, David J.; Bignell, Graham R.; McLaren, Stuart; Teague, Jon; Butler, Adam; Raine, Keiran; Stebbings, Lucy A.; Quail, Michael A.; D’Hooghe, Thomas; Moreau, Yves; Futreal, P. Andrew; Stratton, Michael R.; Vermeesch, Joris R.; Campbell, Peter J.

    2013-01-01

    The nature and pace of genome mutation is largely unknown. Because standard methods sequence DNA from populations of cells, the genetic composition of individual cells is lost, de novo mutations in cells are concealed within the bulk signal and per cell cycle mutation rates and mechanisms remain elusive. Although single-cell genome analyses could resolve these problems, such analyses are error-prone because of whole-genome amplification (WGA) artefacts and are limited in the types of DNA mutation that can be discerned. We developed methods for paired-end sequence analysis of single-cell WGA products that enable (i) detecting multiple classes of DNA mutation, (ii) distinguishing DNA copy number changes from allelic WGA-amplification artefacts by the discovery of matching aberrantly mapping read pairs among the surfeit of paired-end WGA and mapping artefacts and (iii) delineating the break points and architecture of structural variants. By applying the methods, we capture DNA copy number changes acquired over one cell cycle in breast cancer cells and in blastomeres derived from a human zygote after in vitro fertilization. Furthermore, we were able to discover and fine-map a heritable inter-chromosomal rearrangement t(1;16)(p36;p12) by sequencing a single blastomere. The methods will expedite applications in basic genome research and provide a stepping stone to novel approaches for clinical genetic diagnosis. PMID:23630320

  20. Discovery of Black Dye Crystal Structure Polymorphs: Implications for Dye Conformational Variation in Dye-Sensitized Solar Cells.

    Science.gov (United States)

    Cole, Jacqueline M; Low, Kian Sing; Gong, Yun

    2015-12-23

    We present the discovery of a new crystal structure polymorph (1) and pseudopolymorph (2) of the Black Dye, one of the world's leading dyes for dye-sensitized solar cells, DSSCs (10.4% device performance efficiency). This reveals that Black Dye molecules can adopt multiple low-energy conformers. This is significant since it challenges existing models of the Black Dye···TiO2 adsorption process that renders a DSSC working electrode; these have assumed a single molecular conformation that refers to the previously reported Black Dye crystal structure (3). The marked structural differences observed between 1, 2, and 3 make the need for modeling multiple conformations more acute. Additionally, the ordered form of the Black Dye (1) provides a more appropriate depiction of its anionic structure, especially regarding its anchoring group and NCS bonding descriptions. The tendency toward NCS ligand isomerism, evidenced via the disordered form 2, has consequences for electron injection and electron recombination in Black Dye embedded DSSC devices. Dyes 2 and 3 differ primarily by the absence or presence of a solvent of crystallization, respectively; solvent environment effects on the dye are thereby elucidated. This discovery of multiple Black Dye conformers from diffraction, with atomic-level definition, complements recently reported nanoscopic evidence for multiple dye conformations existing at a dye···TiO2 interface, for a chemically similar DSSC dye; those results emanated from imaging and spectroscopy, but were unresolved at the submolecular level. Taken together, these findings lead to the general notion that multiple dye conformations should be explicitly considered when modeling dye···TiO2 interfaces in DSSCs, at least for ruthenium-based dye complexes.

  1. Structural genomics: keeping up with expanding knowledge of the protein universe

    Science.gov (United States)

    Grabowski, Marek; Joachimiak, Andrzej; Otwinowski, Zbyszek; Minor, Wladek

    2010-01-01

    Structural characterization of the protein universe is the main mission of Structural Genomics (SG) programs. However, progress in gene sequencing technology, set in motion in the 1990s, has resulted in rapid expansion of protein sequence space — a twelvefold increase in the past seven years. For the SG field, this creates new challenges and necessitates a reassessment of its strategies. Nevertheless, despite the growth of sequence space, at present nearly half of the content of the Swiss-Prot database and over 40% of Pfam protein families can be structurally modeled based on structures determined so far, with SG projects making an increasingly significant contribution. The SG contribution of new Pfam structures nearly doubled from 27.2% in 2003 to 51.6% in 2006. PMID:17587562

  2. TRFolder-W: a web server for telomerase RNA structure prediction in yeast genomes.

    Science.gov (United States)

    Zhang, Dong; Xue, Xingran; Malmberg, Russell L; Cai, Liming

    2012-10-15

    TRFolder-W is a web server capable of predicting core structures of telomerase RNA (TR) in yeast genomes. TRFolder is a command-line Python toolkit for TR-specific structure prediction. We developed a web-version built on the django web framework, leveraging the work done previously, to include enhancements to increase flexibility of usage. To date, there are five core sub-structures commonly found in TR of fungal species, which are the template region, downstream pseudoknot, boundary element, core-closing stem and triple helix. The aim of TRFolder-W is to use the five core structures as fundamental units to predict potential TR genes for yeast, and to provide a user-friendly interface. Moreover, the application of TRFolder-W can be extended to predict the characteristic structure on species other than fungal species. The web server TRFolder-W is available at http://rna-informatics.uga.edu/?f=software&p=TRFolder-w.

  3. StructureFold: genome-wide RNA secondary structure mapping and reconstruction in vivo.

    Science.gov (United States)

    Tang, Yin; Bouvier, Emil; Kwok, Chun Kit; Ding, Yiliang; Nekrutenko, Anton; Bevilacqua, Philip C; Assmann, Sarah M

    2015-08-15

    RNAs fold into complex structures that are integral to the diverse mechanisms underlying RNA regulation of gene expression. Recent development of transcriptome-wide RNA structure profiling through the application of structure-probing enzymes or chemicals combined with high-throughput sequencing has opened a new field that greatly expands the amount of in vitro and in vivo RNA structural information available. The resultant datasets provide the opportunity to investigate RNA structural information on a global scale. However, the analysis of high-throughput RNA structure profiling data requires considerable computational effort and expertise. We present a new platform, StructureFold, that provides an integrated computational solution designed specifically for large-scale RNA structure mapping and reconstruction across any transcriptome. StructureFold automates the processing and analysis of raw high-throughput RNA structure profiling data, allowing the seamless incorporation of wet-bench structural information from chemical probes and/or ribonucleases to restrain RNA secondary structure prediction via the RNAstructure and ViennaRNA package algorithms. StructureFold performs reads mapping and alignment, normalization and reactivity derivation, and RNA structure prediction in a single user-friendly web interface or via local installation. The variation in transcript abundance and length that prevails in living cells and consequently causes variation in the counts of structure-probing events between transcripts is accounted for. Accordingly, StructureFold is applicable to RNA structural profiling data obtained in vivo as well as to in vitro or in silico datasets. StructureFold is deployed via the Galaxy platform and is freely available as a component of Galaxy at https://usegalaxy.org/. Contact: yxt148@psu.edu or sma3@psu.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights

  4. Novel proteases from the genome of the carnivorous plant Drosera capensis: Structural prediction and comparative analysis.

    Science.gov (United States)

    Butts, Carter T; Bierma, Jan C; Martin, Rachel W

    2016-10-01

    In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a "ferment" similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. Proteins 2016; 84:1517-1533. © 2016 Wiley Periodicals, Inc.

  5. The changing face of glucagon fibrillation: Structural polymorphism and conformational imprinting

    DEFF Research Database (Denmark)

    Pedersen, J.S.; Dikov, D.; Flink, J.L.

    2006-01-01

    concentration) appear less thermostable than those formed under more challenging conditions (high temperatures, low glucagon or low salt concentrations). Properties of preformed fibrils used for seeding are inherited in a prion-like manner. Thus, we conclude that the structure of fibrils formed by glucagon...

  6. Single nucleotide polymorphism (SNP) discovery in duplicated genomes: intron-primed exon-crossing (IPEC) as a strategy for avoiding amplification of duplicated loci in Atlantic salmon (Salmo salar) and other salmonid fishes

    Directory of Open Access Journals (Sweden)

    Primmer Craig R

    2006-07-01

    Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs) represent the most abundant type of DNA variation in the vertebrate genome, and their applications as genetic markers in numerous studies of molecular ecology and conservation of natural populations are emerging. Recent large-scale sequencing projects in several fish species have provided a vast amount of data in public databases, which can be utilized in novel SNP discovery in salmonids. However, the suggested duplicated nature of the salmonid genome may hamper SNP characterization if the primers designed in conserved gene regions amplify multiple loci. Results Here we introduce a new intron-primed exon-crossing (IPEC) method in an attempt to overcome this duplication problem, and also evaluate different priming methods for SNP discovery in Atlantic salmon (Salmo salar) and other salmonids. A total of 69 loci with differing priming strategies were screened in S. salar, and 27 of these produced ~13 kb of high-quality sequence data consisting of 19 SNPs or indels (one per 680 bp). The SNP frequency and the overall nucleotide diversity (3.99 × 10⁻⁴) in S. salar were lower than reported in a majority of other organisms, which may suggest a relatively young population history for Atlantic salmon. A subset of primers used in cross-species analyses revealed considerable variation in the SNP frequencies and nucleotide diversities in other salmonids. Conclusion Sequencing success was significantly higher with the new IPEC primers; thus the total number of loci to screen in order to identify one potential polymorphic site was six times less with this new strategy. Given that duplication may hamper SNP discovery in some species, the IPEC method reported here is an alternative way of identifying novel polymorphisms in such cases.
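
    The two summary statistics quoted in this abstract (variant density and nucleotide diversity) can be illustrated with a minimal Python sketch. The function names and the toy alignment are hypothetical and are not taken from the study; the density call only re-derives the approximate "one per 680 bp" figure from the rounded values given above (~13 kb, 19 variants), and the diversity function is the textbook mean pairwise difference per site, which may differ from the estimator the authors used.

```python
from itertools import combinations


def variant_density(total_bp, n_variants):
    """Average number of sequenced base pairs per variant site."""
    return total_bp / n_variants


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi).

    Assumes all sequences are aligned and of equal length (illustrative only).
    """
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


# Rounded figures from the abstract: ~13 kb of sequence, 19 SNPs or indels.
print(round(variant_density(13_000, 19)))  # roughly one variant per 680 bp

# Hypothetical toy alignment, not data from the study.
toy = ["ACGTACGT", "ACGTACGA", "ACGTACGT"]
print(nucleotide_diversity(toy))
```

    Because the abstract rounds the sequence length to ~13 kb, the computed density only approximates the published value.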

  7. Complete sequence and structure of the mitochondrial genome of the human tapeworm, Taenia asiatica (Platyhelminthes; Cestoda).

    Science.gov (United States)

    Jeon, H K; Lee, K H; Kim, K H; Hwang, U W; Eom, K S

    2005-06-01

    The complete Taenia asiatica mitochondrial genome was amplified by long extension polymerase chain reaction (long PCR) to yield overlapping fragments that were then completely sequenced. The whole mitochondrial genome was 13 703 bp long and contained 12 protein-encoding, 2 ribosomal RNA (small and large subunits), 22 transfer RNA genes and a short non-coding region. Thus, its gene contents are like those typically found in metazoan animal mitochondrial genomes (apart from the absence of atp8). All the genes were transcribed from the same strand. The 3' end 34 bp region of nad4L overlapped with the 5' end portion of nad4. The tRNA genes were 61-69 bp long, and the secondary structures of 18 tRNAs had typical clover-leaf shapes with paired DHU arms. However, trnC, trnS1, trnS2 and trnR had unpaired DHU arms that were 7-12 bp in length. The tRNAs that transferred serine lacked a DHU arm, as is also observed in a number of parasitic platyhelminths and metazoans. However, the trematode trnRs have paired DHU arms. The T. asiatica mtDNA non-coding region was like that in other cestodes since it was composed of a short non-coding region of 72 nucleotides and a long non-coding region of 176 nucleotides separated by a trnL1/, trnS2/, trnL2/, trnR/, nad5 gene cluster. The sequences of the cox1 genes between T. asiatica and T. saginata differ by 4.6%, while the T. asiatica cob gene differs by 4.1% and 12.9% from the cob genes of T. saginata and T. solium, respectively. In conclusion, the T. asiatica mitochondrial genome should provide a resource for comparative mitochondrial genomics and systematic studies of parasitic cestodes.
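
    Percent differences between genes, such as the cox1 and cob comparisons reported above, are commonly computed as an uncorrected pairwise distance over an alignment. The sketch below is a minimal illustration under that assumption; the abstract does not state how the authors computed their values, and the function name and example sequences are hypothetical.

```python
def percent_difference(seq_a, seq_b):
    """Uncorrected percent difference between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    Assumes the inputs are already aligned to the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * differences / compared


# Hypothetical toy alignment: 1 mismatch in 10 compared columns.
print(percent_difference("ACGTACGTAC", "ACGTACGTAT"))
```

    Skipping gapped columns is one of several conventions; counting gaps as differences would give larger values.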

  8. The Diversity, Structure, and Function of Heritable Adaptive Immunity Sequences in the Aedes aegypti Genome.

    Science.gov (United States)

    Whitfield, Zachary J; Dolan, Patrick T; Kunitomi, Mark; Tassetto, Michel; Seetin, Matthew G; Oh, Steve; Heiner, Cheryl; Paxinos, Ellen; Andino, Raul

    2017-11-20

    The Aedes aegypti mosquito transmits arboviruses, including dengue, chikungunya, and Zika virus. Understanding the mechanisms underlying mosquito immunity could provide new tools to control arbovirus spread. Insects exploit two different RNAi pathways to combat viral and transposon infection: short interfering RNAs (siRNAs) and PIWI-interacting RNAs (piRNAs) [1, 2]. Endogenous viral elements (EVEs) are sequences from non-retroviral viruses that are inserted into the mosquito genome and can act as templates for the production of piRNAs [3, 4]. EVEs therefore represent a record of past infections and a reservoir of potential immune memory [5]. The large-scale organization of EVEs has been difficult to resolve with short-read sequencing because they tend to integrate into repetitive regions of the genome. To define the diversity, organization, and function of EVEs, we took advantage of the contiguity associated with long-read sequencing to generate a high-quality assembly of the Ae. aegypti-derived Aag2 cell line genome, an important and widely used model system. We show EVEs are acquired through recombination with specific classes of long terminal repeat (LTR) retrotransposons and organize into large loci (>50 kbp) characterized by high LTR density. These EVE-containing loci have increased density of piRNAs compared to similar regions without EVEs. Furthermore, we detected EVE-derived piRNAs consistent with a targeted processing of persistently infecting virus genomes. We propose that comparisons of EVEs across mosquito populations may explain differences in vector competence, and further study of the structure and function of these elements in the genome of mosquitoes may lead to epidemiological interventions. Copyright © 2017 Elsevier Ltd. All rights reserved.

  9. Aphis Glycines Virus 2, a Novel Insect Virus with a Unique Genome Structure

    Directory of Open Access Journals (Sweden)

    Sijun Liu

    2016-11-01

    Full Text Available The invasive soybean aphid, Aphis glycines, is a major pest in soybeans, resulting in substantial economic loss. We analyzed the A. glycines transcriptome to identify sequences derived from viruses of A. glycines. We identified sequences derived from a novel virus named Aphis glycines virus 2 (ApGlV2). The assembled virus genome sequence was confirmed by reverse transcription polymerase chain reaction (RT-PCR) and Sanger sequencing, conserved domains were characterized, and distribution and transmission examined. This virus has a positive sense, single-stranded RNA genome of ~4850 nt that encodes three proteins. The RNA-dependent RNA polymerase (RdRp) of ApGlV2 is a permuted RdRp similar to those of some tetraviruses, while the capsid protein is structurally similar to the capsid proteins of plant sobemoviruses. ApGlV2 also encodes a larger minor capsid protein, which is translated by a readthrough mechanism. ApGlV2 appears to be widespread in A. glycines populations and to persistently infect aphids with a 100% vertical transmission rate. ApGlV2 is susceptible to the antiviral RNA interference (RNAi) pathway. This virus, with its unique genome structure with both plant- and insect-virus characteristics, is of particular interest from an evolutionary standpoint.

  10. Structural polymorphism of human islet amyloid polypeptide (hIAPP) oligomers highlights the importance of interfacial residue interactions.

    Science.gov (United States)

    Zhao, Jun; Yu, Xiang; Liang, Guizhao; Zheng, Jie

    2011-01-10

    The 37-residue human islet amyloid polypeptide (hIAPP or amylin) is a main component of amyloid plaques found in the pancreas of ∼90% of type II diabetes patients. It is reported that hIAPP oligomers, rather than mature fibrils, are the major toxic species responsible for pancreatic islet β-cell dysfunction and even cell death, but molecular structures of these oligomers remain elusive. In this work, on the basis of recent solid-state NMR and mass-per-length (MPL) data, we model a series of hIAPP oligomers with different β-layers (one, two, and three layers), symmetries (symmetric and asymmetric), and associated interfaces using molecular dynamics simulations. Three distinct interfaces formed by C-terminal β-sheet and C-terminal β-sheet (CC), N-terminal β-sheet and N-terminal β-sheet (NN), and C-terminal β-sheet and N-terminal β-sheet (CN) are identified to drive multiple cross-β-layers laterally associated together to form different amyloid organizations via different intermolecular interactions, in which the CC interface is dominated by polar interactions, the NN interface is dominated by hydrophobic interactions, and the CN interface is dominated by mixed polar and hydrophobic interactions. Overall, the structural stability of the proposed hIAPP oligomers is a result of a delicate balance between maximization of favorable peptide-peptide interactions at the interfaces and optimization of solvation energy with globular structure. Different hIAPP oligomeric models indicate a general and intrinsic nature of amyloid polymorphism, driven by different interfacial side-chain interactions. The proposed models are compatible with recent experimental data in overall size, cross-section area, and molecular weight. A general hIAPP aggregation mechanism is proposed on the basis of our simulated models and experimental data.

  11. Genomic diversity and affinities in population groups of North West India: an analysis of Alu insertion and a single nucleotide polymorphism.

    Science.gov (United States)

    Saini, J S; Kumar, A; Matharoo, K; Sokhi, J; Badaruddoza; Bhanwer, A J S

    2012-12-15

    The North West region of India is extremely important to understand the peopling of India, as it acted as a corridor to the foreign invaders from Eurasia and Central Asia. A series of these invasions along with multiple migrations led to intermixture of variable populations, strongly contributing to genetic variations. The present investigation was designed to explore the genetic diversities and affinities among the five major ethnic groups from North West India; Brahmin, Jat Sikh, Bania, Rajput and Gujjar. A total of 327 individuals of the abovementioned ethnic groups were analyzed for 4 Alu insertion marker loci (ACE, PV92, APO and D1) and a Single Nucleotide Polymorphism (SNP) rs2234693 in the intronic region of the ESR1 gene. Statistical analysis was performed to interpret the genetic structure and diversity of the population groups. Genotypes for ACE, APO, ESR1 and PV92 loci were found to be in Hardy-Weinberg equilibrium in all the ethnic groups, while significant departures were observed at the D1 locus in every investigated population after Bonferroni's correction. The average heterozygosity for all the loci in these ethnic groups was fairly substantial ranging from 0.3927 ± 0.1877 to 0.4333 ± 0.1416. Inbreeding coefficient indicated an overall 10% decrease in heterozygosity in these North West Indian populations. The gene differentiation among the populations was observed to be of the order of 0.013. Genetic distance estimates revealed that Gujjars were close to Banias and Jat Sikhs were close to Rajputs. Overall the study favored the recent division of the populations of North West India into largely endogamous groups. It was observed that the populations of North West India represent a more or less homogenous genetic entity, owing to their common ancestral history as well as geographical proximity. Copyright © 2012 Elsevier B.V. All rights reserved.
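
    The Hardy-Weinberg and heterozygosity statistics mentioned in this abstract can be sketched for a single biallelic locus (such as an Alu insertion/deletion marker) as follows. This is a minimal illustration, not the authors' procedure: the function name and genotype counts are hypothetical, the test is a plain 1-degree-of-freedom chi-square goodness-of-fit, and a Bonferroni correction would simply compare each p-value against alpha divided by the number of tests performed.

```python
import math


def hwe_summary(n_aa, n_ab, n_bb):
    """Chi-square Hardy-Weinberg test for one biallelic locus.

    Returns (chi2, p_value, expected_heterozygosity, inbreeding_coefficient).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    h_obs = n_ab / n
    h_exp = 2 * p * q
    f = 1.0 - h_obs / h_exp if h_exp > 0 else 0.0
    return chi2, p_value, h_exp, f


# Hypothetical genotype counts, not data from the study.
chi2, p_value, h_exp, f = hwe_summary(40, 20, 40)
print(chi2, p_value, h_exp, f)

# Bonferroni-adjusted threshold for, e.g., 5 loci tested at alpha = 0.05.
print(0.05 / 5)
```

    A positive inbreeding coefficient in this sketch corresponds to a heterozygote deficit relative to Hardy-Weinberg expectations, which is the sense in which the abstract reports a decrease in heterozygosity.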

  12. Polymorphism of the thrombostasin gene in the horn fly (Haematobia irritans) revealed in a cDNA library and in genomic DNA.

    Science.gov (United States)

    Zhang, D; Cupp, M S; Cupp, E W

    2001-10-01

    Thrombostasin (TS) is a newly described thrombin-inhibiting protein isolated from the saliva of the horn fly (Haematobia irritans), a blood-sucking ectoparasite of cattle. This report provides a detailed characterization of the TS gene and the first analysis of the allelic complexity of a gene for an anti-hemostatic protein from a blood-feeding insect. Multiple point mutations at fixed positions in the TS gene were identified in a cDNA library prepared from mRNA isolated from horn fly salivary glands. When translated, the variant mRNAs would specify five biochemically active peptides that differ in molecular weight, isoelectric point and predicted secondary structure. Allelic variation with the same mutation pattern was revealed in the genomes of individual flies collected in the field and sampled from a long-standing laboratory colony. Approximately 60% of flies examined carried heterozygous alleles, including five additional alleles not found in the cDNA library. Comparative analysis of the allelic mutations and the predicted effects on secondary structures of the active proteins produced suggest that the TS gene may be undergoing evolutionary selection.

  13. Extensive loss of translational genes in the structurally dynamic mitochondrial genome of the angiosperm Silene latifolia

    Directory of Open Access Journals (Sweden)

    Sloan Daniel B

    2010-09-01

    Full Text Available Abstract. Background: Mitochondrial gene loss and functional transfer to the nucleus is an ongoing process in many lineages of plants, resulting in substantial variation across species in mitochondrial gene content. The Caryophyllaceae represents one lineage that has experienced a particularly high rate of mitochondrial gene loss relative to other angiosperms. Results: In this study, we report the first complete mitochondrial genome sequence from a member of this family, Silene latifolia. The genome can be mapped as a 253,413 bp circle, but its structure is complicated by a large repeated region that is present in 6 copies. Active recombination among these copies produces a suite of alternative genome configurations that appear to be at or near "recombinational equilibrium". The genome contains the fewest genes of any angiosperm mitochondrial genome sequenced to date, with intact copies of only 25 of the 41 protein genes inferred to be present in the common ancestor of angiosperms. As observed more broadly in angiosperms, ribosomal proteins have been especially prone to gene loss in the S. latifolia lineage. The genome has also experienced a major reduction in tRNA gene content, including loss of functional tRNAs of both native and chloroplast origin. Even assuming expanded wobble-pairing rules, the mitochondrial genome can support translation of only 17 of the 61 sense codons, which code for only 9 of the 20 amino acids. In addition, genes encoding 18S and, especially, 5S rRNA exhibit exceptional sequence divergence relative to other plants. Divergence in one region of 18S rRNA appears to be the result of a gene conversion event, in which recombination with a homologous gene of chloroplast origin led to the complete replacement of a helix in this ribosomal RNA. Conclusions: These findings suggest a markedly expanded role for nuclear gene products in the translation of mitochondrial genes in S. latifolia and raise the possibility of altered

  14. Maintenance of genome stability in plants: repairing DNA double strand breaks and chromatin structure stability

    Directory of Open Access Journals (Sweden)

    Sujit Roy

    2014-09-01

    Full Text Available Plant cells are subject to high levels of DNA damage resulting from plants' obligatory dependence on sunlight and the associated exposure to environmental stresses such as solar UV radiation, high soil salinity, drought, chilling injury and other air and soil pollutants including heavy metals and metabolic byproducts from endogenous processes. The irreversible DNA damage generated by environmental and genotoxic stresses affects plant growth and development, reproduction and crop productivity. Thus, to maintain genome stability, plants have developed an extensive array of mechanisms for the detection and repair of DNA damage. This review focuses on recent advances in our understanding of the mechanisms regulating plant genome stability in the context of double strand break repair and chromatin structure maintenance.

  15. Structure and mechanism of the ATPase that powers viral genome packaging.

    Science.gov (United States)

    Hilbert, Brendan J; Hayes, Janelle A; Stone, Nicholas P; Duffy, Caroline M; Sankaran, Banumathi; Kelch, Brian A

    2015-07-21

    Many viruses package their genomes into procapsids using an ATPase machine that is among the most powerful known biological motors. However, how this motor couples ATP hydrolysis to DNA translocation is still unknown. Here, we introduce a model system with unique properties for studying motor structure and mechanism. We describe crystal structures of the packaging motor ATPase domain that exhibit nucleotide-dependent conformational changes involving a large rotation of an entire subdomain. We also identify the arginine finger residue that catalyzes ATP hydrolysis in a neighboring motor subunit, illustrating that previous models for motor structure need revision. Our findings allow us to derive a structural model for the motor ring, which we validate using small-angle X-ray scattering and comparisons with previously published data. We illustrate the model's predictive power by identifying the motor's DNA-binding and assembly motifs. Finally, we integrate our results to propose a mechanistic model for DNA translocation by this molecular machine.

  16. Characterizing the population structure and genetic diversity of maize breeding germplasm in Southwest China using genome-wide SNP markers.

    Science.gov (United States)

    Zhang, Xiao; Zhang, Hua; Li, Lujiang; Lan, Hai; Ren, Zhiyong; Liu, Dan; Wu, Ling; Liu, Hailan; Jaqueth, Jennifer; Li, Bailin; Pan, Guangtang; Gao, Shibin

    2016-08-31

    Maize breeding germplasm used in Southwest China has high complexity because of the diverse ecological features of this area. In this study, the population structure, genetic diversity, and linkage disequilibrium decay distance of 362 important inbred lines collected from the breeding program of Southwest China were characterized using the MaizeSNP50 BeadChip with 56,110 single nucleotide polymorphisms (SNPs). With respect to population structure, two (Tropical and Temperate), three (Tropical, Stiff Stalk and non-Stiff Stalk), four [Tropical, group A germplasm derived from modern U.S. hybrids (PA), group B germplasm derived from modern U.S. hybrids (PB) and Reid] and six (Tropical, PB, Reid, Iowa Stiff Stalk Synthetic, PA and North) subgroups were identified. With increasing K value, the Temperate group showed pronounced hierarchical structure with division into further subgroups. The genetic diversity of each group was also estimated, and the Tropical group was more diverse than the Temperate group. Seven low-genetic-diversity regions and one high-genetic-diversity region were collectively identified in the Temperate group, the Tropical group, and the entire panel. SNPs with significant variation in allele frequency between the Tropical and Temperate groups were also evaluated. Among them, a region located at 130 Mb on Chromosome 2 showed the highest genetic diversity, including both number of SNPs with significant variation and the ratio of significant SNPs to total SNPs. Linkage disequilibrium decay distance in the Temperate group was greater (2.5-3 Mb) than that in the entire panel (0.5-0.75 Mb) and the Tropical group (0.25-0.5 Mb). A large region at 30-120 Mb of Chromosome 7 was concluded to be a region conserved during the breeding process by comparison between S37, which was considered a representative tropical line in Southwest China, and its 30 most similar derived lines. For the panel covered most of widely used inbred lines in Southwest China, this work

  17. In Situ Polymorphic Alteration of Filler Structures for Biomimetic Mechanically Adaptive Elastomer Nanocomposites.

    Science.gov (United States)

    Natarajan, Tamil Selvan; Okamoto, Shigeru; Stöckelhuber, Klaus Werner; Wießner, Sven; Reuter, Uta; Fischer, Dieter; Ghosh, Anik Kumar; Heinrich, Gert; Das, Amit

    2018-04-30

    A mechanically adaptable elastomer composite is prepared with reversible soft-stiff properties that can be easily controlled. By the exploitation of different morphological structures of calcium sulfate, which acts as the active filler in a soft elastomer matrix, the magnitude of filler reinforcement can be reversibly altered, which will be reflected in changes of the final stiffness of the material. The higher stiffness, in other words, the higher modulus of the composites, is realized by the in situ development of fine nanostructured calcium sulfate dihydrate crystals, which are formed during exposure to water and, further, these highly reinforcing crystals can be transformed to a nonreinforcing hemihydrate mesocrystalline structure by simply heating the system in a controlled way. The Young's modulus of the developed material can be reversibly altered from ∼6 to ∼17 MPa, and the dynamic stiffness (storage modulus at room temperature and 10 Hz frequency) alters its value in the order of 1000%. As the transformation is related to the presence of water molecules in the crystallites, a hydrophilic elastomer matrix was selected, which is a blend of two hydrophilic polymers, namely, epichlorohydrin-ethylene oxide-allyl glycidyl ether terpolymer and a terpolymer of ethylene oxide-propylene oxide-allyl glycidyl ether. For the first time, this method also provides a route to regulate the morphology and structure of calcium sulfate nanocrystals in a confined ambient of cross-linked polymer chains.

  18. Polymorph identification and crystal structure determination by a combined crystal structure prediction and transmission electron microscopy approach.

    Science.gov (United States)

    Eddleston, Mark D; Hejczyk, Katarzyna E; Bithell, Erica G; Day, Graeme M; Jones, William

    2013-06-10

    Electron diffraction offers advantages over X-ray based methods for crystal structure determination because it can be applied to sub-micron sized crystallites, and picogram quantities of material. For molecular organic species, however, crystal structure determination with electron diffraction is hindered by rapid crystal deterioration in the electron beam, limiting the amount of diffraction data that can be collected, and by the effect of dynamical scattering on reflection intensities. Automated electron diffraction tomography provides one possible solution. We demonstrate here, however, an alternative approach in which a set of putative crystal structures of the compound of interest is generated by crystal structure prediction methods and electron diffraction is used to determine which of these putative structures is experimentally observed. This approach enables the advantages of electron diffraction to be exploited, while avoiding the need to obtain large amounts of diffraction data or accurate reflection intensities. We demonstrate the application of the methodology to the pharmaceutical compounds paracetamol, scyllo-inositol and theophylline. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. New high-pressure polymorph of In2S3 with defect Th3P4-type structure

    Science.gov (United States)

    Lai, Xiaojing; Zhu, Feng; Wu, Ye; Huang, Rong; Wu, Xiang; Zhang, Qian; Yang, Ke; Qin, Shan

    2014-02-01

    The high-pressure behavior of β-In2S3 (I41/amd, Z = 16) has been studied by in situ synchrotron radiation X-ray diffraction combined with a diamond anvil cell up to 71.7 GPa. Three pressure-induced phase transitions are evidenced: at ~6.6 GPa and ~11.1 GPa at room temperature, and at 35.6 GPa after high-temperature annealing using a portable laser heating system. The new polymorph of In2S3 at 35.6 GPa is assigned to the denser cubic defect Th3P4 structure (I-43d, Z = 5.333), whose unit-cell parameters are a = 7.557(1) Å and V = 431.6(2) Å³. The Th3P4-type phase is stable at least up to 71.7 GPa and cannot be preserved at ambient pressure. The pressure-volume relationship is well described by the second-order Birch-Murnaghan equation of state, which yields B0 = 63(3) GPa and B0′ = 4 (fixed) for the β-In2S3 phase and B0 = 87(3) GPa and B0′ = 4 (fixed) for the defect Th3P4-type phase.
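
    The equation of state named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' fitting procedure: the function name is hypothetical, and the zero-pressure volume `V0_HYPOTHETICAL` is an invented placeholder because the abstract does not report V0. With B0′ fixed at 4, the third-order Birch-Murnaghan expression reduces to the second-order form P = (3/2) B0 [(V0/V)^(7/3) − (V0/V)^(5/3)]. The sketch also checks that the cubic cell volume follows from the reported lattice parameter.

```python
def birch_murnaghan_pressure(v, v0, b0, b0_prime=4.0):
    """Pressure (same units as b0) at volume v from the Birch-Murnaghan EOS.

    With b0_prime == 4 the correction factor is 1 and the expression is the
    second-order form; other values give the third-order form.
    """
    x = v0 / v
    second_order = 1.5 * b0 * (x ** (7.0 / 3.0) - x ** (5.0 / 3.0))
    correction = 1.0 + 0.75 * (b0_prime - 4.0) * (x ** (2.0 / 3.0) - 1.0)
    return second_order * correction


# Cubic cell: V = a^3, using a = 7.557 Angstrom from the abstract
a = 7.557
cell_volume = a ** 3
print(round(cell_volume, 1))  # 431.6, consistent with the reported V

# V0 below is a made-up placeholder (Angstrom^3), NOT a value from the study;
# B0 = 87 GPa is the bulk modulus reported for the defect Th3P4-type phase.
V0_HYPOTHETICAL = 500.0
for v in (500.0, 475.0, 450.0):
    print(v, round(birch_murnaghan_pressure(v, V0_HYPOTHETICAL, 87.0), 2))
```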

  20. Genetic structure is correlated with phenotypic divergence rather than geographic isolation in the highly polymorphic strawberry poison-dart frog.

    Science.gov (United States)

    Wang, Ian J; Summers, Kyle

    2010-02-01

    Phenotypic and genetic divergence can be influenced by a variety of factors, including sexual and natural selection, genetic drift and geographic isolation. Investigating the roles of these factors in natural systems can provide insight into the relative influences of allopatric and ecological modes of biological diversification in nature. The strawberry poison frog, Dendrobates pumilio, presents an excellent opportunity for this kind of research, displaying a diverse array of colour morphs and inhabiting a heterogeneous landscape that includes oceanic islands, fragmented rainforest patches and wide expanses of suitable habitat. In this study, we use 15 highly polymorphic microsatellite loci to estimate population structure and gene flow among populations from across the range of D. pumilio and a causal modelling framework to statistically test 12 hypotheses regarding the geographic and phenotypic variables that explain genetic differentiation within this system. Our results demonstrate that the genetic distance between populations is most strongly associated with differences in dorsal coloration. Previous experimental studies have shown that phenotypic differences can result in sexual and natural selection against non-native phenotypes, and our results now show that these forces lead to genetic isolation between different colour morphs in the wild, presenting a potential case of incipient speciation through selection.

  1. Computational Analysis of Damaging Single-Nucleotide Polymorphisms and Their Structural and Functional Impact on the Insulin Receptor

    Directory of Open Access Journals (Sweden)

    Zabed Mahmud

    2016-01-01

    Full Text Available Single-nucleotide polymorphisms (SNPs) associated with complex disorders can create, destroy, or modify protein coding sites. Single amino acid substitutions in the insulin receptor (INSR) are the most common forms of genetic variations that account for various diseases like Donohue syndrome (leprechaunism), Rabson-Mendenhall syndrome, and type A insulin resistance. We analyzed the deleterious nonsynonymous SNPs (nsSNPs) in the INSR gene based on different computational methods. Analysis of INSR was initiated with PROVEAN followed by PolyPhen and I-Mutant servers to investigate the effects of 57 nsSNPs retrieved from the database of SNPs (dbSNP). A total of 18 mutations that were found to exert damaging effects on the INSR protein structure and function were chosen for further analysis. Among these mutations, our computational analysis suggested that 13 nsSNPs decreased protein stability and might have resulted in loss of function. Therefore, the probability of their involvement in disease predisposition increases. In the absence of adequate prior reports on the possible deleterious effects of nsSNPs, we have systematically analyzed and characterized the functional variants in the coding region that can alter the expression and function of the INSR gene. In silico characterization of nsSNPs affecting INSR gene function can aid in better understanding of genetic differences in disease susceptibility.

  2. Crystal structure of a new monoclinic polymorph of N-(4-methylphenyl)-3-nitropyridin-2-amine.

    Science.gov (United States)

    Aznan, Aina Mardia Akhmad; Abdullah, Zanariah; Lee, Vannajan Sanghiran; Tiekink, Edward R T

    2014-08-01

    The title compound, C12H11N3O2, is a second monoclinic polymorph (P21, with Z' = 4) of the previously reported monoclinic (P21/c, with Z' = 2) form [Akhmad Aznan et al. (2010). Acta Cryst. E66, o2400]. Four independent molecules comprise the asymmetric unit, which have the common features of a syn disposition of the pyridine N atom and the toluene ring, and an intramolecular amine-nitro N-H⋯O hydrogen bond. The differences between molecules relate to the dihedral angles between the rings, which range from 2.92 (19) to 26.24 (19)°. The geometry-optimized structure [B3LYP level of theory and 6-311 g+(d,p) basis set] has the same features except that the entire molecule is planar. In the crystal, the three-dimensional architecture is consolidated by a combination of C-H⋯O, C-H⋯π, nitro-N-O⋯π and π-π interactions [inter-centroid distances = 3.649 (2)-3.916 (2) Å].

  3. Fast and accurate search for non-coding RNA pseudoknot structures in genomes.

    Science.gov (United States)

    Huang, Zhibin; Wu, Yong; Robertson, Joseph; Feng, Liang; Malmberg, Russell L; Cai, Liming

    2008-10-15

    Searching genomes for non-coding RNAs (ncRNAs) by their secondary structure has become an important goal for bioinformatics. For pseudoknot-free structures, ncRNA search can be effective based on the covariance model and CYK-type dynamic programming. However, the computational difficulty in aligning an RNA sequence to a pseudoknot has prohibited fast and accurate search of arbitrary RNA structures. Our previous work introduced a graph model for RNA pseudoknots and proposed to solve the structure-sequence alignment by graph optimization. Given k candidate regions in the target sequence for each of the n stems in the structure, we could compute a best alignment in time O(k^t n) based upon a tree width t decomposition of the structure graph. However, to implement this method in programs that can routinely perform fast yet accurate RNA pseudoknot searches, we need novel heuristics to ensure that, without degrading the accuracy, only a small number of stem candidates need to be examined and a tree decomposition of a small tree width can always be found for the structure graph. The current work builds on the previous one with newly developed preprocessing algorithms to reduce the values for parameters k and t and to implement the search method into a practical program, called RNATOPS, for RNA pseudoknot search. In particular, we introduce techniques, based on probabilistic profiling and distance penalty functions, which can identify for every stem just a small number k (e.g. k algorithm that can yield tree decomposition of small tree width t (e.g. t search prokaryotic and eukaryotic genomes for specific RNA structures of medium to large sizes, including pseudoknots, with high sensitivity and high specificity, and in a reasonable amount of time.

  4. Structural genomic alterations in primary mediastinal large B-cell lymphoma.

    Science.gov (United States)

    Twa, David D W; Steidl, Christian

    2015-01-01

    Primary mediastinal large B-cell lymphoma (PMBCL) is an aggressive non-Hodgkin lymphoma that displays phenotypic and genotypic similarity to Hodgkin lymphoma and diffuse large B-cell lymphoma. Studies using genome-wide discovery tools have revealed specific, recurrent structural aberrations as critical somatic events in the pathogenesis of PMBCL. These structural alterations prominently include transcript and protein altering rearrangements and copy number variations of the programmed death ligands 1 (CD274) and 2 (PDCD1LG2), CIITA, JAK2 and REL. Importantly, evidence is emerging that these acquired structural genomic changes, in synergy with other somatic alterations, contribute to PMBCL pathogenesis by influencing tumor microenvironment interactions that favor malignant B-cell growth. The means by which these rearrangements arise are not well understood. However, analysis of breakpoint junctions at base-pair resolution provides preliminary insight into putative rearrangement mechanisms. As the field also anticipates predictive value and therapeutic targeting of structural changes involving programmed death ligands and JAK2, a review of therapies that will likely shape future lymphoma treatment is needed.

  5. Probing Retroviral and Retrotransposon Genome Structures: The “SHAPE” of Things to Come

    Directory of Open Access Journals (Sweden)

    Joanna Sztuba-Solinska

    2012-01-01

    Full Text Available Understanding the nuances of RNA structure as they pertain to biological function remains a formidable challenge for retrovirus research and development of RNA-based therapeutics, an area of particular importance with respect to combating HIV infection. Although a variety of chemical and enzymatic RNA probing techniques have been successfully employed for more than 30 years, they primarily interrogate small (100–500 nt) RNAs that have been removed from their biological context, potentially eliminating long-range tertiary interactions (such as kissing loops and pseudoknots) that may play a critical regulatory role. Selective 2′ hydroxyl acylation analyzed by primer extension (SHAPE), pioneered recently by Merino and colleagues, represents a facile, user-friendly technology capable of interrogating RNA structure with a single reagent and, combined with automated capillary electrophoresis, can analyze an entire 10,000-nucleotide RNA genome in a matter of weeks. Despite these obvious advantages, SHAPE essentially provides a nucleotide “connectivity map,” conversion of which into a 3-D structure requires a variety of complementary approaches. This paper summarizes contributions from SHAPE towards our understanding of the structure of retroviral genomes, modifications to the technology that have been developed to address some of its limitations, and future challenges.

  6. Damming the genomic data flood using a comprehensive analysis and storage data structure.

    Science.gov (United States)

    Bouffard, Marc; Phillips, Michael S; Brown, Andrew M K; Marsh, Sharon; Tardif, Jean-Claude; van Rooij, Tibor

    2010-01-01

    Data generation, driven by rapid advances in genomic technologies, is fast outpacing our analysis capabilities. Faced with this flood of data, more hardware and software resources are added to accommodate data sets whose structure has not specifically been designed for analysis. This leads to unnecessarily lengthy processing times and excessive data handling and storage costs. Current efforts to address this have centered on developing new indexing schemas and analysis algorithms, whereas the root of the problem lies in the format of the data itself. We have developed a new data structure for storing and analyzing genotype and phenotype data. By leveraging data normalization techniques, database management system capabilities and the use of a novel multi-table, multidimensional database structure we have eliminated the following: (i) unnecessarily large data set size due to high levels of redundancy, (ii) sequential access to these data sets and (iii) common bottlenecks in analysis times. The resulting novel data structure horizontally divides the data to circumvent traditional problems associated with the use of databases for very large genomic data sets. The resulting data set required 86% less disk space and performed analytical calculations 6248 times faster compared to a standard approach without any loss of information. Database URL: http://castor.pharmacogenomics.ca.

  7. Novel insights through the integration of structural and functional genomics data with protein networks.

    Science.gov (United States)

    Clarke, Declan; Bhardwaj, Nitin; Gerstein, Mark B

    2012-09-01

    In recent years, major advances in genomics, proteomics, macromolecular structure determination, and the computational resources capable of processing and disseminating the large volumes of data generated by each have played major roles in advancing a more systems-oriented appreciation of biological organization. One product of systems biology has been the delineation of graph models for describing genome-wide protein-protein interaction networks. The network organization and topology which emerges in such models may be used to address fundamental questions in an array of cellular processes, as well as biological features intrinsic to the constituent proteins (or "nodes") themselves. However, graph models alone constitute an abstraction which neglects the underlying biological and physical reality that the network's nodes and edges are highly heterogeneous entities. Here, we explore some of the advantages of introducing a protein structural dimension to such models, as the marriage of conventional network representations with macromolecular structural data helps to place static node and edge constructs in a biologically more meaningful context. We emphasize that 3D protein structures constitute a valuable conceptual and predictive framework by discussing examples of the insights provided, such as enabling in silico predictions of protein-protein interactions, providing rational and compelling classification schemes for network elements, as well as revealing interesting intrinsic differences between distinct node types, such as disorder and evolutionary features, which may then be rationalized in light of their respective functions within networks. Copyright © 2012 Elsevier Inc. All rights reserved.

  8. Role of sequence and structural polymorphism on the mechanical properties of amyloid fibrils.

    Directory of Open Access Journals (Sweden)

    Gwonchan Yoon

    Full Text Available Amyloid fibrils, which play a critical role in disease expression, have recently been found to exhibit excellent mechanical properties, such as an elastic modulus on the order of 10 GPa, which is comparable to that of other mechanical proteins such as microtubules, actin filaments, and spider silk. These remarkable mechanical properties of amyloid fibrils are correlated with their functional role in disease expression. This suggests the importance of understanding how these mechanical properties originate through a self-assembly process that may depend on the amino acid sequence. However, the sequence-structure-property relationship of amyloid fibrils is not yet fully understood. In this work, we characterize the mechanical properties of human islet amyloid polypeptide (hIAPP) fibrils with respect to their molecular structures as well as their amino acid sequence by using all-atom explicit water molecular dynamics (MD) simulation. The simulation result suggests that the remarkable bending rigidity of amyloid fibrils can be achieved through a specific self-aggregation pattern such as antiparallel stacking of β strands (peptide chains). Moreover, we have shown that a single point mutation of an hIAPP chain constituting an hIAPP fibril significantly affects the thermodynamic stability of hIAPP fibrils formed by parallel stacking of peptide chains, and that a single point mutation results in a significant change in the bending rigidity of hIAPP fibrils formed by antiparallel stacking of β strands. This clearly elucidates the role of amino acid sequence in not only the equilibrium conformations of amyloid fibrils but also their mechanical properties. Our study sheds light on sequence-structure-property relationships of amyloid fibrils, which suggests that the mechanical properties of amyloid fibrils are encoded in their sequence-dependent molecular architecture.

  9. Genome-wide assessment of population structure and genetic diversity and development of a core germplasm set for sweet potato based on specific length amplified fragment (SLAF) sequencing.

    Science.gov (United States)

    Su, Wenjin; Wang, Lianjun; Lei, Jian; Chai, Shasha; Liu, Yi; Yang, Yuanyuan; Yang, Xinsun; Jiao, Chunhai

    2017-01-01

    Sweet potato, Ipomoea batatas (L.) Lam., is an important food crop that is cultivated worldwide. However, no genome-wide assessment of the genetic diversity of sweet potato has been reported to date. In the present study, the population structure and genetic diversity of 197 sweet potato accessions, most of which were from China, were assessed using 62,363 SNPs. A model-based structure analysis divided the accessions into three groups: group 1, group 2 and group 3. The genetic relationships among the accessions were evaluated using a phylogenetic tree, which clustered all the accessions into three major groups. A principal component analysis (PCA) showed that the accessions were distributed according to their population structure. The mean genetic distance among accessions ranged from 0.290 for group 1 to 0.311 for group 3, and the mean polymorphic information content (PIC) ranged from 0.232 for group 1 to 0.251 for group 3. The mean minor allele frequency (MAF) ranged from 0.207 for group 1 to 0.222 for group 3. Analysis of molecular variance (AMOVA) showed that the maximum diversity was within accessions (89.569%). Using CoreHunter software, a core set of 39 accessions was obtained, which accounted for approximately 19.8% of the total collection. The core germplasm set of sweet potato developed will be a valuable resource for future sweet potato improvement strategies.
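
    The per-marker statistics named in this abstract (MAF and PIC) can be illustrated with a short sketch. It assumes the commonly used definition PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j²; the abstract does not state which formula or software the authors used, and the function names and allele frequencies below are hypothetical. The last lines only check the arithmetic behind the reported core-set proportion.

```python
from itertools import combinations


def minor_allele_frequency(p):
    """MAF of a biallelic marker given the frequency p of one allele."""
    return min(p, 1.0 - p)


def pic(freqs):
    """Polymorphic information content from a list of allele frequencies."""
    homozygosity = sum(f * f for f in freqs)
    cross_terms = sum(2.0 * (fi * fi) * (fj * fj)
                      for fi, fj in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_terms


# Hypothetical biallelic SNP with allele frequencies 0.8 / 0.2
p = 0.8
print(minor_allele_frequency(p))
print(pic([p, 1.0 - p]))

# For a biallelic marker PIC is largest when both alleles are equally common
print(pic([0.5, 0.5]))  # 0.375

# Core-set proportion reported in the abstract: 39 of 197 accessions
print(round(39 / 197 * 100, 1))  # 19.8
```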

  10. Population Structure and Genomic Breed Composition in an Angus–Brahman Crossbred Cattle Population

    Directory of Open Access Journals (Sweden)

    Mesfin Gobena

    2018-03-01

    Full Text Available Crossbreeding is a common strategy used in tropical and subtropical regions to enhance beef production, and having accurate knowledge of breed composition is essential for the success of a crossbreeding program. Although pedigree records have been traditionally used to obtain the breed composition of crossbred cattle, the accuracy of pedigree-based breed composition can be reduced by inaccurate and/or incomplete records and Mendelian sampling. Breed composition estimation from genomic data has multiple advantages including higher accuracy without being affected by missing, incomplete, or inaccurate records and the ability to be used as independent authentication of breed in breed-labeled beef products. The present study was conducted with 676 Angus–Brahman crossbred cattle with genotype and pedigree information to evaluate the feasibility and accuracy of using genomic data to determine breed composition. We used genomic data in parametric and non-parametric methods to detect population structure due to differences in breed composition while accounting for the confounding effect of close familial relationships. By applying principal component analysis (PCA) and the maximum likelihood method of ADMIXTURE to genomic data, it was possible to successfully characterize population structure resulting from heterogeneous breed ancestry, while accounting for close familial relationships. PCA results offered additional insight into the different hierarchies of genetic variation structuring. The first principal component was strongly correlated with Angus–Brahman proportions, and the second represented variation within animals that have a relatively more extended Brangus lineage, indicating the presence of a distinct pattern of genetic variation in these cattle. Although there was strong agreement between breed proportions estimated from pedigree and genetic information, there were significant discrepancies between these two methods for certain animals

  11. The effect of molecular mass on the polymorphism and crystalline structure of isotactic polypropylene

    Directory of Open Access Journals (Sweden)

    2010-02-01

    Full Text Available This study is devoted to the investigation of the effect of molecular mass on the α-, β- and γ-crystallization tendency of isotactic polypropylene (iPP). The crystalline structure was studied by wide angle X-ray scattering (WAXS) and by polarised light microscopy (PLM). The melting and crystallization characteristics were determined by differential scanning calorimetry (DSC). The results indicate clearly that iPP with low molecular mass crystallizes essentially in the α-modification. However, it crystallizes in the β-form in the presence of a highly efficient and selective β-nucleating agent. The α- and β-modifications form in a wide molecular mass range. Decreasing molecular mass results in increased structural instability in both α- and β-modifications and consequently an enhanced inclination to recrystallization during heating. The formation of the γ-modification could not be observed, although some literature sources report that the γ-form develops in iPP with low molecular mass.

  12. The Global Cancer Genomics Consortium: interfacing genomics and cancer medicine.

    Science.gov (United States)

    2012-08-01

    The Global Cancer Genomics Consortium (GCGC) is an international collaborative platform that amalgamates cancer biologists, cutting-edge genomics, and high-throughput expertise with medical oncologists and surgical oncologists; they address the most important translational questions that are central to cancer research and treatment. The annual GCGC symposium was held at the Advanced Centre for Treatment Research and Education in Cancer, Mumbai, India, from November 9 to 11, 2011. The symposium showcased international next-generation sequencing efforts that explore cancer-specific transcriptomic changes, single-nucleotide polymorphism, and copy number variations in various types of cancers, as well as the structural genomics approach to develop new therapeutic targets and chemical probes. From the spectrum of studies presented at the symposium, it is evident that the translation of emerging cancer genomics knowledge into clinical applications can only be achieved through the integration of multidisciplinary expertise. In summary, the GCGC symposium provided practical knowledge on structural and cancer genomics approaches, as well as an exclusive platform for focused cancer genomics endeavors. ©2012 AACR.

  13. Whole genome PCR scanning reveals the syntenic genome structure of toxigenic Vibrio cholerae strains in the O1/O139 population.

    Directory of Open Access Journals (Sweden)

    Bo Pang

    Full Text Available Vibrio cholerae is commonly found in estuarine water systems. Toxigenic O1 and O139 V. cholerae strains have caused cholera epidemics and pandemics, whereas the nontoxigenic strains within these serogroups only occasionally lead to disease. To understand the differences in the genome and clonality between the toxigenic and nontoxigenic strains of V. cholerae serogroups O1 and O139, we employed a whole genome PCR scanning (WGPScanning) method, an rrn operon-mediated fragment rearrangement analysis and comparative genomic hybridization (CGH) to analyze the genome structure of different strains. WGPScanning in conjunction with CGH revealed that the genomic contents of the toxigenic strains were conserved, except for a few indels located mainly in mobile elements. Minor nucleotide variation in orthologous genes appeared to be the major difference between the toxigenic strains. rrn operon-mediated rearrangements were infrequent in El Tor toxigenic strains tested using I-CeuI digested pulsed-field gel electrophoresis (PFGE) analysis and PCR analysis based on flanking sequence of rrn operons. Using these methods, we found that the genomic structures of toxigenic El Tor and O139 strains were syntenic. The nontoxigenic strains exhibited more extensive sequence variations, but toxin coregulated pilus positive (TCP+) strains had a similar structure. TCP+ nontoxigenic strains could be subdivided into multiple lineages according to the TCP type, suggesting the existence of complex intermediates in the evolution of toxigenic strains. The data indicate that toxigenic O1 El Tor and O139 strains were derived from a single lineage of intermediates from complex clones in the environment. The nontoxigenic strains with non-El Tor type TCP may yet evolve into new epidemic clones after attaining toxigenic attributes.

  14. Population genetic structure of clinical and environmental isolates of Blastomyces dermatitidis based on 27 polymorphic microsatellite markers

    Science.gov (United States)

    Meece, Jennifer K.; Anderson, Jennifer L.; Fisher, Matthew C.; Henk, Daniel A.; Sloss, Brian L.; Reed, Kurt D.

    2011-01-01

    Blastomyces dermatitidis, a thermally dimorphic fungus, is the etiologic agent of North American blastomycosis. Clinical presentation is varied, ranging from silent infections to fulminant respiratory disease and dissemination to skin and other sites. Exploration of the population genetic structure of B. dermatitidis would improve our knowledge regarding variation in virulence phenotypes, geographic distribution, and difference in host specificity. The objective of this study was to develop and test a panel of microsatellite markers to delineate the population genetic structure within a group of clinical and environmental isolates of B. dermatitidis. We developed 27 microsatellite markers and genotyped B. dermatitidis isolates from various hosts and environmental sources (n=112). Assembly of a neighbor-joining tree of allele-sharing distance revealed two genetically distinct groups, separated by a deep node. Bayesian admixture analysis showed that two populations were statistically supported. Principal coordinate analysis also reinforced support for two genetic groups, with the primary axis explaining 61.41% of the genetic variability. Group 1 isolates average 1.8 alleles/locus, whereas group 2 isolates are highly polymorphic, averaging 8.2 alleles/locus. In this data set, alleles at three loci are unshared between the two groups and appear diagnostic. The mating type of individual isolates was determined by PCR. Both mating type-specific genes, the HMG and α-box domains, were represented in each of the genetic groups, with slightly more isolates having the HMG allele. One interpretation of this study is that the species currently designated B. dermatitidis includes a cryptic subspecies or perhaps a separate species.

  15. Oxytocin receptor polymorphism and childhood social experiences shape adult personality, brain structure and neural correlates of mentalizing.

    Science.gov (United States)

    Schneider-Hassloff, H; Straube, B; Jansen, A; Nuscheler, B; Wemken, G; Witt, S H; Rietschel, M; Kircher, T

    2016-07-01

    The oxytocin system is involved in human social behavior and social cognition such as attachment, emotion recognition and mentalizing (i.e. the ability to represent mental states of oneself and others). It is shaped by social experiences in early life, especially by parent-infant interactions. The single nucleotide polymorphism rs53576 in the oxytocin receptor (OXTR) gene has been linked to social behavioral phenotypes. In 195 adult healthy subjects we investigated the interaction of OXTR rs53576 and childhood attachment security (CAS) on the personality traits "adult attachment style" and "alexithymia" (i.e. emotional self-awareness), on brain structure (voxel-based morphometry) and neural activation (fMRI) during an interactive mentalizing paradigm (prisoner's dilemma game; subgroup: n=163). We found that in GG-homozygotes, but not in A-allele carriers, insecure childhood attachment is - in adulthood - associated with a) higher attachment-related anxiety and alexithymia, b) higher brain gray matter volume of left amygdala and lower volumes in right superior parietal lobule (SPL), left temporal pole (TP), and bilateral frontal regions, and c) higher mentalizing-related neural activity in bilateral TP and precunei, and right middle and superior frontal gyri. Interaction effects of genotype and CAS on brain volume and/or function were associated with individual differences in alexithymia and attachment-related anxiety. Interactive effects were in part sexually dimorphic. The interaction of OXTR genotype and CAS modulates adult personality as well as brain structure and function of areas implicated in salience processing and mentalizing. Rs53576 GG-homozygotes are partially more susceptible to childhood attachment experiences than A-allele carriers. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. Polymorphic design of DNA origami structures through mechanical control of modular components.

    Science.gov (United States)

    Lee, Chanseok; Lee, Jae Young; Kim, Do-Nyun

    2017-12-12

    Scaffolded DNA origami enables the bottom-up fabrication of diverse DNA nanostructures by designing hundreds of staple strands, comprised of complementary sequences to the specific binding locations of a scaffold strand. Despite its exceptionally high design flexibility, poor reusability of staples has been one of the major hurdles to fabricate assorted DNA constructs in an effective way. Here we provide a rational module-based design approach to create distinct bent shapes with controllable geometries and flexibilities from a single, reference set of staples. By revising the staple connectivity within the desired module, we can control the location, stiffness, and included angle of hinges precisely, enabling the construction of dozens of single- or multiple-hinge structures with the replacement of staple strands up to 12.8% only. Our design approach, combined with computational shape prediction and analysis, can provide a versatile and cost-effective procedure in the design of DNA origami shapes with stiffness-tunable units.

  17. The changing face of glucagon fibrillation: Structural polymorphism and conformational imprinting

    DEFF Research Database (Denmark)

    Pedersen, J.S.; Dikov, D.; Flink, J.L.

    2006-01-01

    We have established a time-resolved fluorescence assay to study fibrillation of the 29 residue peptide hormone glucagon under a variety of different conditions in a high-throughput format. Fibrils formed at pH 2.5 differ in fibrillation kinetics, morphology, thioflavin T staining and FTIR....../CD spectra depending on salts, glucagon concentration and fibrillation temperature. Apparent fibrillar stability correlates with spectral and kinetic properties; generally, fibrils formed under conditions favourable for rapid fibrillation (ambient temperatures, high glucagon concentration or high salt...... concentration) appear less thermostable than those formed under more challenging conditions (high temperatures, low glucagon or low salt concentrations). Properties of preformed fibrils used for seeding are inherited in a prion-like manner. Thus, we conclude that the structure of fibrils formed by glucagon...

  18. New in protein structure and function annotation: hotspots, single nucleotide polymorphisms and the 'Deep Web'.

    Science.gov (United States)

    Bromberg, Yana; Yachdav, Guy; Ofran, Yanay; Schneider, Reinhard; Rost, Burkhard

    2009-05-01

    The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins; homology- and network-based inferences annotate some aspect of function for a similar fraction of the proteome. For most known protein sequences, however, there is detailed knowledge about neither their function nor their structure. Comprehensive efforts towards the expert curation of sequence annotations have failed to meet the demand of the rapidly increasing number of available sequences. Only the automated prediction of protein function in the absence of homology can close the gap between available sequences and annotations in the foreseeable future. This review focuses on two novel methods for automated annotation, and briefly presents an outlook on how modern web software may revolutionize the field of protein sequence annotation. First, predictions of protein binding sites and functional hotspots, and the evolution of these into the most successful type of prediction of protein function from sequence will be discussed. Second, a new tool, comprehensive in silico mutagenesis, which contributes important novel predictions of function and at the same time prepares for the onset of the next sequencing revolution, will be described. While these two new sub-fields of protein prediction represent the breakthroughs that have been achieved methodologically, it will then be argued that a different development might further change the way biomedical researchers benefit from annotations: modern web software can connect the worldwide web in any browser with the 'Deep Web' (ie, proprietary data resources). The availability of this direct connection, and the resulting access to a wealth of data, may impact drug discovery and development more than any existing method that contributes to protein annotation.

  19. Aldosterone synthetase gene (CYP11B2) polymorphism and structural parameters of the left ventricle in patients with coronary heart disease, postinfarction cardiosclerosis

    Directory of Open Access Journals (Sweden)

    M. N. Dolzhenko

    2017-12-01

    Full Text Available Purpose of the work – to investigate the possible contribution of aldosterone synthetase gene (CYP11B2) polymorphism to the disease course and structural parameters of the LV in patients with coronary heart disease, postinfarction cardiosclerosis. Materials and methods. General clinical examination of 100 patients with postinfarction cardiosclerosis was done at the Cardiology Department of P. L. Shupyk NMAPE. Genetic testing was performed by real-time polymerase chain reaction at the Bogomolets Institute of Physiology, Kyiv, Ukraine. Exclusion criteria were hemodynamically significant valvular heart diseases, chronic obstructive pulmonary diseases, permanent or temporary heart pacing, acute heart failure and implanted cardioverter-defibrillator, permanent atrial fibrillation. Statistical analysis of the results was performed using Microsoft Excel and the statistical program SPSS (version 20, US). The results obtained are presented as M ± σ. Results. Stenosis of the left main coronary artery was observed in 25.9 % of cases in the subgroup of the TT variant. In the TC subgroup of the aldosterone synthase gene polymorphism the incidence of left main coronary artery lesion was 13.9 %. There was not a single case of left main coronary artery lesion in the CC subgroup, a difference of borderline statistical significance in comparison with the subgroup of the TT variant of the polymorphism (P = 0.048). In the analysis of clinical data the most marked manifestations of angina pectoris were in the TT and TC subgroups – 73.3 % and 72.7 %, respectively, compared with the CC subgroup – 40 %, a statistically significant difference for both subgroups (P1.2 = 0.95, P1.3 = 0.039, P2.3 = 0.029). In the analysis of LV morphological characteristics the smallest LV mass was found in the CC subgroup of the polymorphism variant (190.5 ± 52.1 g), compared with the LV mass values in the TT subgroup (231.00 ± 55.21 g, P = 0.03) and the TC subgroup (197.421 ± 63.15 g, P > 0.05). A

  20. The contribution of co-transcriptional RNA:DNA hybrid structures to DNA damage and genome instability.

    Science.gov (United States)

    Hamperl, Stephan; Cimprich, Karlene A

    2014-07-01

    Accurate DNA replication and DNA repair are crucial for the maintenance of genome stability, and it is generally accepted that failure of these processes is a major source of DNA damage in cells. Intriguingly, recent evidence suggests that DNA damage is more likely to occur at genomic loci with high transcriptional activity. Furthermore, loss of certain RNA processing factors in eukaryotic cells is associated with increased formation of co-transcriptional RNA:DNA hybrid structures known as R-loops, resulting in double-strand breaks (DSBs) and DNA damage. However, the molecular mechanisms by which R-loop structures ultimately lead to DNA breaks and genome instability is not well understood. In this review, we summarize the current knowledge about the formation, recognition and processing of RNA:DNA hybrids, and discuss possible mechanisms by which these structures contribute to DNA damage and genome instability in the cell. Copyright © 2014 Elsevier B.V. All rights reserved.

  1. Evidence That the Vectorial Competence of Phlebotomine Sand Flies for Different Species of Leishmania is Controlled by Structural Polymorphisms in the Surface Lipophosphoglycan

    Science.gov (United States)

    1994-09-01

    phlebotomine sand flies for different species of Leishmania is controlled by structural polymorphisms in the surface lipophosphoglycan PAULO F P. PIMENTAi... sand flies were reared and maintained in the were retained only in flies infected with L. major (90%), Department of Entomology. Walter Reed Army...Institute of compared with L. donovani IS (16%), L. donovani Mongi Research. Three- to 5-day-old female sand flies were fed 0 L. amazonensis t 9%), and L

  2. Structural Insight Into the Role of Mutual Polymorphism and Conservatism in the Contact Zone of the NFR5–K1 Heterodimer With the Nod Factor

    Directory of Open Access Journals (Sweden)

    A. A. Igolkina

    2018-04-01

    Full Text Available Sandwich-like docking configurations of the heterodimeric complex of NFR5 and K1 Vicia sativa receptor-like kinases together with the putative ligand, Nod factor (NF) of Rhizobium leguminosarum bv. viciae, were modeled and two of the most probable configurations were assessed through the analysis of the mutual polymorphisms and conservatism. We carried out this analysis based on the hypothesis that in a contact zone of two docked components (proteins or ligands) the population polymorphism or conservatism is mutual, i.e., the variation in one component has a reflected variation in the other component. The population material of 30 wild-growing V. sativa (leaf pieces) was collected from a large field (uncultivated for the past 25 years) and pooled; from this pool, 100 randomly selected cloned fragments of the NFR5 gene and 100 of the K1 gene were sequenced by the Sanger method. Congruence between population trees of NFR5 and K1 haplotypes allowed us to select two respective haplotypes, build their 3D structures, and perform protein–protein docking. In a separate simulation, the protein-ligand docking between NFR5 and NF was carried out. We merged the results of the two docking experiments and extracted NFR5–NF–K1 complexes, in which NF was located within the cavity between two receptors. Molecular dynamics simulations indicated two out of six complexes as stable. Regions of mutual polymorphism in the contact zone of one complex overlapped with known NF structural variations produced by R. leguminosarum bv. viciae. A total of 74% of the contact zone of another complex contained mutually polymorphic and conservative areas. Common traits of the obtained two stable structures allowed us to hypothesize the functional role of the three-domain structure of plant LysM-RLKs in their heteromers.

  3. DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo.

    Science.gov (United States)

    Zubradt, Meghan; Gupta, Paromita; Persad, Sitara; Lambowitz, Alan M; Weissman, Jonathan S; Rouskin, Silvi

    2017-01-01

    Coupling of structure-specific in vivo chemical modification to next-generation sequencing is transforming RNA secondary structure studies in living cells. The dominant strategy for detecting in vivo chemical modifications uses reverse transcriptase truncation products, which introduce biases and necessitate population-average assessments of RNA structure. Here we present dimethyl sulfate (DMS) mutational profiling with sequencing (DMS-MaPseq), which encodes DMS modifications as mismatches using a thermostable group II intron reverse transcriptase. DMS-MaPseq yields a high signal-to-noise ratio, can report multiple structural features per molecule, and allows both genome-wide studies and focused in vivo investigations of even low-abundance RNAs. We apply DMS-MaPseq for the first analysis of RNA structure within an animal tissue and to identify a functional structure involved in noncanonical translation initiation. Additionally, we use DMS-MaPseq to compare the in vivo structure of pre-mRNAs with their mature isoforms. These applications illustrate DMS-MaPseq's capacity to dramatically expand in vivo analysis of RNA structure.
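
    The mutational-profiling readout described in this record can be illustrated with a minimal sketch. It assumes ungapped reads already aligned to the reference at the same offset, and it counts mismatches only at A and C positions, the bases that DMS modifies when they are unpaired. The function name `mismatch_rates` and the toy reads are hypothetical; this is not the authors' pipeline.

```python
def mismatch_rates(reference, reads):
    """Per-position mismatch fraction at A/C positions; None elsewhere.

    Assumes every read is aligned to position 0 of the reference with no
    gaps (a simplification made for this sketch only).
    """
    rates = []
    for i, ref_base in enumerate(reference):
        if ref_base not in "AC":
            # Only A and C are treated as informative in this sketch.
            rates.append(None)
            continue
        # Bases observed at this position, ignoring uncalled bases.
        covered = [r[i] for r in reads if i < len(r) and r[i] != "N"]
        if not covered:
            rates.append(None)
            continue
        mismatches = sum(1 for base in covered if base != ref_base)
        rates.append(mismatches / len(covered))
    return rates


# Toy data, invented for illustration.
ref = "ACGA"
reads = ["ACGA", "TCGA", "ACGT", "ACGA"]
rates = mismatch_rates(ref, reads)
```

    A higher mismatch fraction at a position would be read as evidence that the base is more often unpaired; a real analysis would also need alignment, quality filtering and background correction.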

  4. Primary structure of the human follistatin precursor and its genomic organization

    International Nuclear Information System (INIS)

    Shimasaki, Shunichi; Koga, Makoto; Esch, F.

    1988-01-01

    Follistatin is a single-chain gonadal protein that specifically inhibits follicle-stimulating hormone release. By use of the recently characterized porcine follistatin cDNA as a probe to screen a human testis cDNA library and a genomic library, the structure of the complete human follistatin precursor as well as its genomic organization have been determined. Three of eight cDNA clones that were sequenced predicted a precursor with 344 amino acids, whereas the remaining five cDNA clones encoded a 317 amino acid precursor, resulting from alternative splicing of the precursor mRNA. Mature follistatins contain four contiguous domains that are encoded by precisely separated exons; three of the domains are highly similar to each other, as well as to human epidermal growth factor and human pancreatic secretory trypsin inhibitor. The genomic organization of the human follistatin is similar to that of the human epidermal growth factor gene and thus supports the notion of exon shuffling during evolution

  5. The genomic structure of human BTK, the defective gene in X-linked agammaglobulinemia

    Energy Technology Data Exchange (ETDEWEB)

    Rohrer, J.; Parolini, O. [St. Jude Children's Research Hospital, Memphis, TN (United States)]; Conley, M.E. [St. Jude Children's Research Hospital, Memphis, TN (United States); Univ. of Tennessee College of Medicine, Memphis, TN (United States)]; Belmont, J.W. [Baylor College of Medicine, Houston, TX (United States)]

    1994-12-31

    It has recently been demonstrated that mutations in the gene for Bruton's tyrosine kinase (BTK) are responsible for X-linked agammaglobulinemia. Southern blot analysis and sequencing of cDNA were used to document deletions, insertions, and single base pair substitutions. To facilitate analysis of BTK regulation and to permit the development of assays that could be used to screen genomic DNA for mutations in BTK, the authors determined the genomic organization of this gene. Subcloning of a cosmid and a yeast artificial chromosome showed that BTK is divided into 19 exons spanning 37 kilobases of genomic DNA. Analysis of the region 5′ to the first untranslated exon revealed no consensus TATAA or CAAT boxes; however, three retinoic acid binding sites were identified in this region. Comparison of the structure of BTK with that of other nonreceptor tyrosine kinases, including SRC, FES, and CSK, demonstrated a lack of conservation of exon borders. Information obtained in this study will contribute to understanding of the evolution of nonreceptor tyrosine kinases. It will also be useful in diagnostic studies, including carrier detection, and in studies directed towards gene therapy or gene replacement. 29 refs., 2 figs., 2 tabs.

  6. Ultra-deep sequencing reveals the subclonal structure and genomic evolution of oral squamous cell carcinoma

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    Background: Oral squamous cell carcinoma (OSCC), a subgroup of head and neck squamous cell carcinoma (HNSCC), is primarily caused by alcohol consumption and tobacco use. Recent DNA sequencing studies suggest that HNSCC are very heterogeneous between patients; however the intra-patient subclonal...... structure remains unexplored due to lack of sampling multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with ultra-deep targeted sequencing on biopsies from 5 stage IV...... of unprecedented high resolution enabling clear detection of subclonal structure and observation of otherwise undetectable mutations. Furthermore, we demonstrate that OSCC show a high degree of inter-patient heterogeneity but a low degree of intra-patient/tumor heterogeneity. However, some OSCC cancers contain...

  7. Update on the Pfam5000 Strategy for Selection of Structural Genomics Targets

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2005-06-27

    Structural Genomics is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy that is medically and biologically relevant, of good financial value, and tractable. In 2003, we presented the "Pfam5000" strategy, which involves selecting the 5,000 most important families from the Pfam database as sources for targets. In this update, we show that although both the Pfam database and the number of sequenced genomes have increased in size, the expected benefits of the Pfam5000 strategy have not changed substantially. Solving the structures of proteins from the 5,000 largest Pfam families would allow accurate fold assignment for approximately 65 percent of all prokaryotic proteins (covering 54 percent of residues) and 63 percent of eukaryotic proteins (42 percent of residues). Fewer than 2,300 of the largest families on this list remain to be solved, making the project feasible in the next five years given the expected throughput to be achieved in the production phase of the Protein Structure Initiative.
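
    The coverage figures quoted in this record rest on a simple idea: pick the largest families, then ask what fraction of proteins contain at least one domain from a picked family. The sketch below shows that calculation on toy data. The names `top_families` and `coverage`, the family identifiers and the protein-to-family mapping are all hypothetical, and the study's own criteria for "accurate fold assignment" are not modeled here.

```python
from collections import Counter


def top_families(protein_families, n):
    """Return the n families that occur in the most proteins."""
    counts = Counter(
        fam for fams in protein_families.values() for fam in set(fams)
    )
    return {fam for fam, _ in counts.most_common(n)}


def coverage(protein_families, selected):
    """Fraction of proteins with at least one domain in a selected family."""
    if not protein_families:
        return 0.0
    hits = sum(
        1 for fams in protein_families.values() if selected & set(fams)
    )
    return hits / len(protein_families)


# Toy mapping from protein to the families of its domains (invented data).
toy = {
    "p1": ["PF_A", "PF_B"],
    "p2": ["PF_A"],
    "p3": ["PF_C"],
    "p4": [],
    "p5": ["PF_A", "PF_C"],
}

selected = top_families(toy, 1)
fraction = coverage(toy, selected)
```

    Residue-level coverage, which the record reports separately, would weight each protein by the number of residues falling inside the selected domains instead of counting proteins.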

  8. Genomic epidemiology and population structure of Neisseria gonorrhoeae from remote highly endemic Western Australian populations.

    Science.gov (United States)

    Al Suwayyid, Barakat A; Coombs, Geoffrey W; Speers, David J; Pearson, Julie; Wise, Michael J; Kahler, Charlene M

    2018-02-27

    Neisseria gonorrhoeae causes gonorrhoea, the second most commonly notified sexually transmitted infection in Australia. One of the highest notification rates of gonorrhoea is found in the remote regions of Western Australia (WA). Unlike isolates from the major Australian population centres, the remote community isolates have low rates of antimicrobial resistance (AMR). Population structure and whole-genome comparison of 59 isolates from the Western Australian N. gonorrhoeae collection were used to investigate relatedness of isolates cultured in the metropolitan and remote areas. Core genome phylogeny, multilocus sequence typing (MLST), N. gonorrhoeae multi-antigen sequence typing (NG-MAST) and N. gonorrhoeae sequence typing for antimicrobial resistance (NG-STAR), in addition to hierarchical clustering of sequences, were used to characterize the isolates. Population structure analysis of the 59 isolates together with 72 isolates from an international collection revealed six population groups, suggesting that N. gonorrhoeae is a weakly clonal species. Two distinct population groups, Aus1 and Aus2, represented 63% of WA isolates and were mostly composed of the remote community isolates that carried no chromosomal AMR genotypes. In contrast, the Western Australian metropolitan isolates were frequently multi-drug resistant and belonged to population groups found in the international database, suggesting international transmission of the isolates. Our study suggests that the population structure of N. gonorrhoeae is distinct between the communities in remote and metropolitan WA. Given the high rate of AMR in metropolitan regions, ongoing surveillance is essential to ensure the enduring efficacy of the empiric gonorrhoea treatment in remote WA.

  9. Solution structure of Atg8 reveals conformational polymorphism of the N-terminal domain

    Energy Technology Data Exchange (ETDEWEB)

    Schwarten, Melanie, E-mail: m.schwarten@fz-juelich.de [Institut fuer Strukturbiologie und Biophysik, ISB-3, Forschungszentrum Juelich, 52425 Juelich (Germany); Institut fuer Physikalische Biologie und BMFZ, Heinrich-Heine-Universitaet Duesseldorf, 40225 Duesseldorf (Germany); Stoldt, Matthias, E-mail: m.stoldt@fz-juelich.de [Institut fuer Strukturbiologie und Biophysik, ISB-3, Forschungszentrum Juelich, 52425 Juelich (Germany); Institut fuer Physikalische Biologie und BMFZ, Heinrich-Heine-Universitaet Duesseldorf, 40225 Duesseldorf (Germany); Mohrlueder, Jeannine, E-mail: j.mohrlueder@fz-juelich.de [Institut fuer Strukturbiologie und Biophysik, ISB-3, Forschungszentrum Juelich, 52425 Juelich (Germany); Willbold, Dieter, E-mail: dieter.willbold@uni-duesseldorf.de [Institut fuer Strukturbiologie und Biophysik, ISB-3, Forschungszentrum Juelich, 52425 Juelich (Germany); Institut fuer Physikalische Biologie und BMFZ, Heinrich-Heine-Universitaet Duesseldorf, 40225 Duesseldorf (Germany)

    2010-05-07

    During autophagy a crescent-shaped membrane is formed, which engulfs the material that is to be degraded. This membrane grows further until its edges fuse to form the double-membrane-covered autophagosome. Atg8 is a protein that is required for this initial step of autophagy. Therefore, a multistage conjugation process of newly synthesized Atg8 to phosphatidylethanolamine is of critical importance. Here we present the high resolution structure of unprocessed Atg8 determined by nuclear magnetic resonance spectroscopy. Its C-terminal subdomain shows a well-defined ubiquitin-like fold with slightly elevated mobility in the pico- to nanosecond timescale as determined by heteronuclear NOE data. In comparison to unprocessed Atg8, cleaved Atg8^G116 shows decreased mobility. The N-terminal domain adopts different conformations within the micro- to millisecond timescale. The possible biological relevance of the differences in dynamic behaviour between both subdomains as well as between the cleaved and uncleaved forms is discussed.

  10. Genetic structure and polymorphism analysis of Xinjiang Hui ethnic minority based on 21 STRs.

    Science.gov (United States)

    Lan, Qiong; Chen, Jiangang; Guo, Yuxin; Xie, Tong; Fang, Yating; Jin, Xiaoye; Cui, Wei; Zhou, Yongsong; Zhu, Bofeng

    2018-04-01

    In the present study, we calculated the allelic frequencies and forensic descriptive parameters of Hui ethnic minority on the basis of 21 short tandem repeat (STR) loci aiming at understanding population structure better and enriching population genetic database. Bloodstain samples of 506 unrelated healthy Hui individuals in Xinjiang Uygur Autonomous Region were collected. Altogether 268 alleles were observed and the allelic frequencies ranged from 0.0010 to 0.5306. The combined power of discrimination and the cumulative probability of exclusion of the 21 STR loci in Hui ethnic minority were 0.9999999999999999999999998697 and 0.9999999968, respectively. Population data obtained manifested that the panel of 21 STR loci could provide robust genetic information for individual identification and paternity testing involved in forensic applications for Huis of Xinjiang Region. Furthermore, the present results of interpopulation differentiations, phylogenetic trees and principal component analysis which were conducted based on the overlapping 16 STR loci revealed that Hui group was genetically close to Xibe ethnic group and Han populations from different regions.
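
    Combined forensic parameters such as the two reported in this record are conventionally obtained from per-locus values under the assumption that loci are independent: the combined value is one minus the product of the per-locus complements. The sketch below shows that arithmetic. The per-locus numbers are invented for illustration and are not taken from the study, and the function names are hypothetical.

```python
from math import prod


def combined_complement(per_locus):
    """Product of (1 - p_i): the probability that no locus discriminates."""
    return prod(1.0 - p for p in per_locus)


def combined_power(per_locus):
    """Combined probability across independent loci: 1 - prod(1 - p_i)."""
    return 1.0 - combined_complement(per_locus)


# Hypothetical per-locus values (NOT from the study).
pd_values = [0.90, 0.95, 0.80]  # power of discrimination per locus
pe_values = [0.50, 0.60, 0.40]  # probability of exclusion per locus

cpd = combined_power(pd_values)
cpe = combined_power(pe_values)
```

    With a panel of 21 loci the combined power of discrimination sits so close to 1 that double-precision floats round it to exactly 1.0, so in practice it may be preferable to report `combined_complement` directly or to use Python's `decimal` module.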

  11. Structural and functional insights of β-glucosidases identified from the genome of Aspergillus fumigatus

    Science.gov (United States)

    Dodda, Subba Reddy; Aich, Aparajita; Sarkar, Nibedita; Jain, Piyush; Jain, Sneha; Mondal, Sudipa; Aikat, Kaustav; Mukhopadhyay, Sudit S.

    2018-03-01

    Thermostable, glucose-tolerant β-glucosidases from Aspergillus species have attracted worldwide interest for their potential in industrial applications and bioethanol production. A strain of Aspergillus fumigatus (AfNITDGPKA3) identified by our laboratory from straw retting ground showed higher cellulase activity, specifically β-glucosidase activity, compared to other contemporary strains. Though A. fumigatus has been known for high cellulase activity, detailed identification and characterization of the cellulase genes from its genome is yet to be done. In this work we have analyzed the cellulase genes from the genome sequence database of Aspergillus fumigatus (Af293). Genome analysis suggests that two cellobiohydrolase, eleven endoglucanase and seventeen β-glucosidase genes are present. The β-glucosidase genes belong to either the Glycohydro1 (GH1 or Bgl1) or the Glycohydro3 (GH3 or Bgl3) family. The sequence similarity suggests that Bgl1 and Bgl3 of A. fumigatus are phylogenetically close to those of A. fisheri and A. oryzae. The modelled structure of Bgl1 predicts a (β/α)8 barrel type structure with a deep and narrow active site, whereas Bgl3 shows an (α/β)8 barrel and (α/β)6 sandwich structure with a shallow and open active site. Docking results suggest that amino acids Glu544, Glu466, Trp408, Trp567, Tyr44, Tyr222, Tyr770, Asp844, Asp537, Asn212 and Asn217 of Bgl3 and Asp224, Asn242, Glu440, Glu445, Tyr367, Tyr365, Thr994, Trp435 and Trp446 of Bgl1 are involved in the hydrolysis. Binding affinity analyses suggest that the Bgl3 and Bgl1 enzymes are more active on substrates like 4-methylumbelliferyl glycoside (MUG) and p-nitrophenyl-β-D-1,4-glucopyranoside (pNPG) than on cellobiose. Further docking with glucose suggests that Bgl1 is more glucose tolerant than Bgl3. Analysis of the Aspergillus fumigatus genome may help to identify a β-glucosidase enzyme with better properties, and the structural information may help to develop an engineered recombinant enzyme.

  12. Combined array-comparative genomic hybridization and single-nucleotide polymorphism-loss of heterozygosity analysis reveals complex changes and multiple forms of chromosomal instability in colorectal cancers

    DEFF Research Database (Denmark)

    Gaasenbeek, Michelle; Howarth, Kimberley; Rowan, Andrew J

    2006-01-01

    (CGH) for copy number changes and single-nucleotide polymorphism (SNP) microarrays for allelic loss (LOH). Many array-based CGH changes were not found by LOH because they did not cause true reduction-to-homozygosity. Conversely, many regions of SNP-LOH occurred in the absence of copy number change...

  13. Genomic profiling of thousands of candidate polymorphisms predicts risk of relapse in 778 Danish and German childhood acute lymphoblastic leukemia patients

    DEFF Research Database (Denmark)

    Wesolowska, Agata; Borst, L.; Dalgaard, Marlene Danner

    2015-01-01

    Childhood acute lymphoblastic leukemia survival approaches 90%. New strategies are needed to identify the 10–15% who evade cure. We applied targeted, sequencing-based genotyping of 25 000 to 34 000 preselected potentially clinically relevant single-nucleotide polymorphisms (SNPs) to identify host...

  14. A forest-based feature screening approach for large-scale genome data with complex structures.

    Science.gov (United States)

    Wang, Gang; Fu, Guifang; Corcoran, Christopher

    2015-12-23

    Genome-wide association studies (GWAS) interrogate whole genomes at large scale to characterize the complex genetic architecture of biomedical traits. When the number of SNPs increases dramatically to half a million but the sample size is still limited to thousands, traditional p-value-based statistical approaches suffer from unprecedented limitations. Feature screening has proved to be an effective and powerful approach to handling ultrahigh-dimensional data statistically, yet it has not received much attention in GWAS. Feature screening reduces the feature space from millions to hundreds by removing non-informative noise. However, the univariate measures used to rank features are mainly based on individual effects and do not consider mutual interactions with other features. In this article, we explore the performance of a random forest (RF) based feature screening procedure to emphasize the SNPs that have complex effects on a continuous phenotype. Both simulation and real data analysis are conducted to examine the power of the forest-based feature screening. We compare it with five other popular feature screening approaches via simulation and conclude that RF can serve as a decent feature screening tool to accommodate complex genetic effects such as nonlinear, interactive, correlative, and joint effects. Unlike the traditional p-value-based Manhattan plot, we use the Permutation Variable Importance Measure (PVIM) to display the relative significance and believe that it will provide as much useful information as the traditional plot. Most complex traits are found to be regulated by epistatic and polygenic variants. The forest-based feature screening is proven to be an efficient, easily implemented, and accurate approach to coping with whole-genome data with complex structures. Our explorations should add to the growing body of work on feature screening and help it better serve the demands of contemporary genome data.
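
    The idea behind a permutation variable importance measure can be illustrated without a full random forest: fit any model, then record how much its prediction error grows when one feature column is shuffled. The sketch below is an assumption-laden illustration, not the authors' procedure. A simple lookup-table regressor stands in for the random forest, the genotypes and phenotype are simulated, and all function names are hypothetical. The phenotype depends only on an interaction between the first two SNPs, which is the kind of effect that per-SNP univariate measures tend to miss.

```python
import random
from collections import defaultdict

def fit_table(X, y):
    """Stand-in model (not a random forest): mean phenotype per genotype combination."""
    sums, counts = defaultdict(float), defaultdict(int)
    for row, target in zip(X, y):
        sums[tuple(row)] += target
        counts[tuple(row)] += 1
    overall = sum(y) / len(y)
    means = {key: sums[key] / counts[key] for key in sums}
    # Unseen genotype combinations fall back to the overall mean.
    return lambda row: means.get(tuple(row), overall)

def mse(model, X, y):
    """Mean squared prediction error of the model on (X, y)."""
    return sum((model(r) - t) ** 2 for r, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, rng, repeats=5):
    """Average increase in MSE after shuffling each feature column in turn."""
    base = mse(model, X, y)
    scores = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(repeats):
            column = [r[j] for r in X]
            rng.shuffle(column)
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, column)]
            total += mse(model, permuted, y) - base
        scores.append(total / repeats)
    return scores

rng = random.Random(0)
# Simulated genotypes coded 0/1/2 for three SNPs; SNP 2 is pure noise.
X = [[rng.randint(0, 2) for _ in range(3)] for _ in range(600)]
# Phenotype driven purely by an interaction between SNP 0 and SNP 1.
y = [float((r[0] > 0) != (r[1] > 0)) + rng.gauss(0, 0.1) for r in X]

model = fit_table(X, y)
pvim = permutation_importance(model, X, y, rng)
print(pvim)
```

    Shuffling either interacting SNP degrades the predictions substantially, while shuffling the noise SNP barely changes them, so the importance scores separate the informative features from the uninformative one.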

  15. Mitochondrial Genome Sequences and Structures Aid in the Resolution of Piroplasmida phylogeny

    Science.gov (United States)

    Marr, Henry S.; Tarigo, Jaime L.; Cohn, Leah A.; Bird, David M.; Scholl, Elizabeth H.; Levy, Michael G.; Wiegmann, Brian M.; Birkenheuer, Adam J.

    2016-01-01

    The taxonomy of the order Piroplasmida, which includes a number of clinically and economically relevant organisms, is a hotly debated topic amongst parasitologists. Three genera (Babesia, Theileria, and Cytauxzoon) are recognized based on parasite life cycle characteristics, but molecular phylogenetic analyses of 18S sequences have suggested the presence of five or more distinct Piroplasmida lineages. Despite these important advancements, a few studies have been unable to define the taxonomic relationships of some organisms (e.g. C. felis and T. equi) with respect to other Piroplasmida. Additional evidence from mitochondrial genome sequences and synteny should aid in the inference of Piroplasmida phylogeny and resolution of taxonomic uncertainties. In this study, we have amplified, sequenced, and annotated seven previously uncharacterized mitochondrial genomes (Babesia canis, Babesia vogeli, Babesia rossi, Babesia sp. Coco, Babesia conradae, Babesia microti-like sp., and Cytauxzoon felis) and identified additional ribosomal fragments in ten previously characterized mitochondrial genomes. Phylogenetic analysis of concatenated mitochondrial and 18S sequences as well as cox1 amino acid sequence identified five distinct Piroplasmida groups, each of which possesses a unique mitochondrial genome structure. Specifically, our results confirm the existence of four previously identified clades (B. microti group, Babesia sensu stricto, Theileria equi, and a Babesia sensu lato group that includes B. conradae) while supporting the integration of Theileria and Cytauxzoon species into a single fifth taxon. Although known biological characteristics of Piroplasmida corroborate the proposed phylogeny, more investigation into parasite life cycles is warranted to further understand the evolution of the Piroplasmida.
Our results provide an evolutionary framework for comparative biology of these important animal and human pathogens and help focus renewed efforts toward understanding the

  16. Mitochondrial Genome Sequences and Structures Aid in the Resolution of Piroplasmida phylogeny.

    Directory of Open Access Journals (Sweden)

    Megan E Schreeg

    Full Text Available The taxonomy of the order Piroplasmida, which includes a number of clinically and economically relevant organisms, is a hotly debated topic amongst parasitologists. Three genera (Babesia, Theileria, and Cytauxzoon) are recognized based on parasite life cycle characteristics, but molecular phylogenetic analyses of 18S sequences have suggested the presence of five or more distinct Piroplasmida lineages. Despite these important advancements, a few studies have been unable to define the taxonomic relationships of some organisms (e.g. C. felis and T. equi) with respect to other Piroplasmida. Additional evidence from mitochondrial genome sequences and synteny should aid in the inference of Piroplasmida phylogeny and resolution of taxonomic uncertainties. In this study, we have amplified, sequenced, and annotated seven previously uncharacterized mitochondrial genomes (Babesia canis, Babesia vogeli, Babesia rossi, Babesia sp. Coco, Babesia conradae, Babesia microti-like sp., and Cytauxzoon felis) and identified additional ribosomal fragments in ten previously characterized mitochondrial genomes. Phylogenetic analysis of concatenated mitochondrial and 18S sequences as well as cox1 amino acid sequence identified five distinct Piroplasmida groups, each of which possesses a unique mitochondrial genome structure. Specifically, our results confirm the existence of four previously identified clades (B. microti group, Babesia sensu stricto, Theileria equi, and a Babesia sensu lato group that includes B. conradae) while supporting the integration of Theileria and Cytauxzoon species into a single fifth taxon. Although known biological characteristics of Piroplasmida corroborate the proposed phylogeny, more investigation into parasite life cycles is warranted to further understand the evolution of the Piroplasmida.
Our results provide an evolutionary framework for comparative biology of these important animal and human pathogens and help focus renewed efforts toward

  17. The population genomic landscape of human genetic structure, admixture history and local adaptation in Peninsular Malaysia.

    Science.gov (United States)

    Deng, Lian; Hoh, Boon Peng; Lu, Dongsheng; Fu, Ruiqing; Phipps, Maude E; Li, Shilin; Nur-Shafawati, Ab Rajab; Hatin, Wan Isa; Ismail, Endom; Mokhtar, Siti Shuhada; Jin, Li; Zilfalil, Bin Alwi; Marshall, Christian R; Scherer, Stephen W; Al-Mulla, Fahd; Xu, Shuhua

    2014-09-01

    Peninsular Malaysia is a strategic region which might have played an important role in the initial peopling and subsequent human migrations in Asia. However, the genetic diversity and history of human populations--especially indigenous populations--inhabiting this area remain poorly understood. Here, we conducted a genome-wide study using over 900,000 single nucleotide polymorphisms (SNPs) in four major Malaysian ethnic groups (MEGs; Malay, Proto-Malay, Senoi and Negrito), and made comparisons of 17 world-wide populations. Our data revealed that Peninsular Malaysia has greater genetic diversity corresponding to its role as a contact zone of both early and recent human migrations in Asia. However, each single Orang Asli (indigenous) group was less diverse with a smaller effective population size (N(e)) than a European or an East Asian population, indicating a substantial isolation of some duration for these groups. All four MEGs were genetically more similar to Asian populations than to other continental groups, and the divergence time between MEGs and East Asian populations (12,000--6,000 years ago) was also much shorter than that between East Asians and Europeans. Thus, Malaysian Orang Asli groups, despite their significantly different features, may share a common origin with the other Asian groups. Nevertheless, we identified traces of recent gene flow from non-Asians to MEGs. Finally, natural selection signatures were detected in a batch of genes associated with immune response, human height, skin pigmentation, hair and facial morphology and blood pressure in MEGs. Notable examples include SYN3 which is associated with human height in all Orang Asli groups, a height-related gene (PNPT1) and two blood pressure-related genes (CDH13 and PAX5) in Negritos. We conclude that a long isolation period, subsequent gene flow and local adaptations have jointly shaped the genetic architectures of MEGs, and this study provides insight into the peopling and human migration

  18. Target Selection and Deselection at the Berkeley Structural Genomics Center

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Kim, Sung-Hou; Brenner, Steven E.

    2005-03-22

    At the Berkeley Structural Genomics Center (BSGC), our goal is to obtain a near-complete structural complement of proteins in the minimal organisms Mycoplasma genitalium and M. pneumoniae, two closely related pathogens. Current targets for structure determination have been selected in six major stages, starting with those predicted to be most tractable to high throughput study and likely to yield new structural information. We report on the process used to select these proteins, as well as our target deselection procedure. Target deselection reduces experimental effort by eliminating targets similar to those recently solved by the structural biology community or other centers. We measure the impact of the 69 structures solved at the BSGC as of July 2004 on structure prediction coverage of the M. pneumoniae and M. genitalium proteomes. The number of Mycoplasma proteins for which the fold could first be reliably assigned based on structures solved at the BSGC (24 M. pneumoniae and 21 M. genitalium) is approximately 25 percent of the total resulting from work at all structural genomics centers and the worldwide structural biology community (94 M. pneumoniae and 86 M. genitalium) during the same period. As the number of structures contributed by the BSGC during that period is less than 1 percent of the total worldwide output, the benefits of a focused target selection strategy are apparent. If the structures of all current targets were solved, the percentage of M. pneumoniae proteins for which folds could be reliably assigned would increase from approximately 57 percent (391 of 687) at present to around 80 percent (550 of 687), and the percentage of the proteome that could be accurately modeled would increase from around 37 percent (254 of 687) to about 64 percent (438 of 687). In M. genitalium, the percentage of the proteome that could be structurally annotated based on structures of our remaining targets would rise from 72 percent (348 of 486) to around 76 percent (371 of 486), with the
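
    The coverage percentages quoted in this abstract follow directly from the protein counts given in parentheses. As a quick arithmetic check, the short sketch below recomputes them. The dictionary labels are descriptive shorthand introduced here, not terms from the record.

```python
def percent(part, total):
    """Return part/total expressed as a percentage."""
    return 100.0 * part / total

# (count, proteome size) pairs as quoted in the abstract.
coverage = {
    "M. pneumoniae, folds assigned, present": (391, 687),
    "M. pneumoniae, folds assigned, all targets solved": (550, 687),
    "M. pneumoniae, accurately modeled, present": (254, 687),
    "M. pneumoniae, accurately modeled, all targets solved": (438, 687),
    "M. genitalium, structurally annotated, present": (348, 486),
    "M. genitalium, structurally annotated, remaining targets solved": (371, 486),
}

for label, (part, total) in coverage.items():
    print(f"{label}: {percent(part, total):.1f}%")

# Share of first-time fold assignments attributable to the BSGC.
bsgc_share_pneumoniae = percent(24, 94)
bsgc_share_genitalium = percent(21, 86)
print(f"{bsgc_share_pneumoniae:.1f}% {bsgc_share_genitalium:.1f}%")
```

    Rounded to whole numbers, the results agree with the figures in the text (57, 80, 37, 64, 72 and 76 percent, and roughly 25 percent for the BSGC share).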

  19. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime

    2015-11-18

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.
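
    For readers unfamiliar with the VCF download format mentioned above, the sketch below shows how the fixed leading columns of a VCF data line (CHROM, POS, ID, REF, ALT) can be read and how single-nucleotide variants can be told apart from indels. It is a minimal illustration only: the two example records are invented, the helper names are hypothetical, and nothing here reflects the actual contents or layout of OryzaGenome files beyond the generic VCF column order.

```python
def parse_vcf_line(line):
    """Parse the first five tab-separated columns of a VCF data line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, variant_id, ref, alt = fields[:5]
    return {
        "chrom": chrom,
        "pos": int(pos),        # 1-based position
        "id": variant_id,
        "ref": ref,
        "alt": alt.split(","),  # ALT may list several alleles
    }

def is_snp(record):
    """True when REF and every ALT allele are single bases."""
    bases = "ACGT"
    return record["ref"] in bases and len(record["ref"]) == 1 and all(
        len(allele) == 1 and allele in bases for allele in record["alt"]
    )

# Invented example content: meta line, column header, one SNP, one deletion.
vcf_text = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr01\t1203\t.\tA\tG\t50\tPASS\t.\n"
    "chr01\t2210\t.\tAT\tA\t50\tPASS\t.\n"
)

# Header lines start with '#'; everything else is a variant record.
records = [
    parse_vcf_line(line)
    for line in vcf_text.splitlines()
    if line and not line.startswith("#")
]
snps = [record for record in records if is_snp(record)]
print(len(records), len(snps))
```

    For real files a dedicated VCF parser is usually preferable, since the format also carries INFO fields, per-sample genotype columns and symbolic alleles that this sketch ignores.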

  20. Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping

    DEFF Research Database (Denmark)

    Zhan, Bujie; Fadista, João; Thomsen, Bo

    2011-01-01

    sequence of a single Holstein Friesian bull with data from single nucleotide polymorphism (SNP) and comparative genomic hybridization (CGH) array technologies to determine a comprehensive spectrum of genomic variation. The performance of resequencing SNP detection was assessed by combining SNPs that were...... of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions Our results provide high resolution mapping of diverse classes of genomic variation...... in an individual bovine genome and demonstrate that structural variation surpasses sequence variation as the main component of genomic variability. Better accuracy of SNP detection was achieved with little loss of sensitivity when algorithms that implemented mapping quality were used. IBD regions were found...

  1. Structure, proteome and genome of Sinorhizobium meliloti phage ΦM5: A virus with LUZ24-like morphology and a highly mosaic genome.

    Science.gov (United States)

    Johnson, Matthew C; Sena-Velez, Marta; Washburn, Brian K; Platt, Georgia N; Lu, Stephen; Brewer, Tess E; Lynn, Jason S; Stroupe, M Elizabeth; Jones, Kathryn M

    2017-12-01

    Bacteriophages of nitrogen-fixing rhizobial bacteria are revealing a wealth of novel structures, diverse enzyme combinations and genomic features. Here we report the cryo-EM structure of the phage capsid at 4.9-5.7 Å resolution, the phage particle proteome, and the genome of the Sinorhizobium meliloti-infecting Podovirus ΦM5. This is the first structure of a phage with a capsid and capsid-associated structural proteins related to those of the LUZ24-like viruses that infect Pseudomonas aeruginosa. Like many other Podoviruses, ΦM5 is a T=7 icosahedron with a smooth capsid and short, relatively featureless tail. Nonetheless, this group is phylogenetically quite distinct from Podoviruses of the well-characterized T7, P22, and epsilon 15 supergroups. Structurally, a distinct bridge of density that appears unique to ΦM5 reaches down the body of the coat protein to the extended loop that interacts with the next monomer in a hexamer, perhaps stabilizing the mature capsid. Further, the predicted tail fibers of ΦM5 are quite different from those of enteric bacteria phages, but have domains in common with other rhizophages. Genomically, ΦM5 is highly mosaic. The ΦM5 genome is 44,005 bp with 357 bp direct terminal repeats (DTRs) and 58 unique ORFs. Surprisingly, the capsid structural module, the tail module, the DNA-packaging terminase, the DNA replication module and the integrase each appear to be from a different lineage. One of the most unusual features of ΦM5 is its terminase, whose large subunit is quite different from previously described short-DTR-generating packaging machines and does not fit into any of the established phylogenetic groups. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  2. Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates.

    Directory of Open Access Journals (Sweden)

    Bo Yuan

    2015-12-01

    Full Text Available Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases; about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual's susceptibility to acquiring disease-associated alleles.

  3. Structures of mono-unsaturated triacylglycerols. IV. The highest melting β'-2 polymorphs of trans-mono-unsaturated triacylglycerols and related saturated TAGs and their polymorphic stability

    NARCIS (Netherlands)

    van Mechelen, J.B.; Peschar, R.; Schenk, H.

    2008-01-01

    The β1'-2 crystal structures of a series of mixed-chain saturated and trans-mono-unsaturated triacylglycerols containing palmitoyl, stearoyl and elaidoyl acyl chains have been solved from high-resolution powder diffraction data, from synchrotron as well as laboratory X-ray sources. The structures

  4. Modeling structure of G protein-coupled receptors in human genome

    KAUST Repository

    Zhang, Yang

    2016-01-26

    G protein-coupled receptors (or GPCRs) are integral transmembrane proteins responsible for various cellular signal transductions. Human GPCR proteins are encoded by 5% of human genes but account for the targets of 40% of the FDA-approved drugs. Due to difficulties in crystallization, experimental structure determination remains extremely difficult for human GPCRs, which has been a major barrier in modern structure-based drug discovery. We proposed a new hybrid protocol, GPCR-I-TASSER, to construct GPCR structure models by integrating experimental mutagenesis data with ab initio transmembrane-helix assembly simulations, assisted by the predicted transmembrane-helix interaction networks. The method was tested in recent community-wide GPCRDock experiments and constructed models with a root mean square deviation of 1.26 Å for Dopamine-3 and 2.08 Å for Chemokine-4 receptors in the transmembrane domain regions, which were significantly closer to the native structures than the best templates available in the PDB. GPCR-I-TASSER has been applied to model all 1,026 putative GPCRs in the human genome, of which 923 are found to have correct folds based on the confidence score analysis and mutagenesis data comparison. The successfully modeled GPCRs contain many pharmaceutically important families that do not have previously solved structures, including Trace amine, Prostanoids, Releasing hormones, Melanocortins, Vasopressin and Neuropeptide Y receptors. All the human GPCR models have been made publicly available through the GPCR-HGmod database at http://zhanglab.ccmb.med.umich.edu/GPCR-HGmod/. The results demonstrate new progress on genome-wide structure modeling of transmembrane proteins, which should have a useful impact on the effort of GPCR-targeted drug discovery.

  5. Assessment of Genetic Heterogeneity in Structured Plant Populations Using Multivariate Whole-Genome Regression Models.

    Science.gov (United States)

    Lehermeier, Christina; Schön, Chris-Carolin; de Los Campos, Gustavo

    2015-09-01

    Plant breeding populations exhibit varying levels of structure and admixture; these features are likely to induce heterogeneity of marker effects across subpopulations. Traditionally, structure has been dealt with as a potential confounder, and various methods exist to "correct" for population stratification. However, these methods induce a mean correction that does not account for heterogeneity of marker effects. The animal breeding literature offers a few recent studies that consider modeling genetic heterogeneity in multibreed data, using multivariate models. However, these methods have received little attention in plant breeding where population structure can have different forms. In this article we address the problem of analyzing data from heterogeneous plant breeding populations, using three approaches: (a) a model that ignores population structure [A-genome-based best linear unbiased prediction (A-GBLUP)], (b) a stratified (i.e., within-group) analysis (W-GBLUP), and (c) a multivariate approach that uses multigroup data and accounts for heterogeneity (MG-GBLUP). The performance of the three models was assessed on three different data sets: a diversity panel of rice (Oryza sativa), a maize (Zea mays L.) half-sib panel, and a wheat (Triticum aestivum L.) data set that originated from plant breeding programs. The estimated genomic correlations between subpopulations varied from null to moderate, depending on the genetic distance between subpopulations and traits. Our assessment of prediction accuracy features cases where ignoring population structure leads to a parsimonious more powerful model as well as others where the multivariate and stratified approaches have higher predictive power. In general, the multivariate approach appeared slightly more robust than either the A- or the W-GBLUP. Copyright © 2015 by the Genetics Society of America.

  6. Insertion sequence element single nucleotide polymorphism typing provides insights into the population structure and evolution of Mycobacterium ulcerans across Africa.

    Science.gov (United States)

    Vandelannoote, Koen; Jordaens, Kurt; Bomans, Pieter; Leirs, Herwig; Durnez, Lies; Affolabi, Dissou; Sopoh, Ghislain; Aguiar, Julia; Phanzu, Delphin Mavinga; Kibadi, Kapay; Eyangoh, Sara; Manou, Louis Bayonne; Phillips, Richard Odame; Adjei, Ohene; Ablordey, Anthony; Rigouts, Leen; Portaels, Françoise; Eddyani, Miriam; de Jong, Bouke C

    2014-02-01

    Buruli ulcer is an indolent, slowly progressing necrotizing disease of the skin caused by infection with Mycobacterium ulcerans. In the present study, we applied a redesigned technique to a vast panel of M. ulcerans disease isolates and clinical samples originating from multiple African disease foci in order to (i) gain fundamental insights into the population structure and evolutionary history of the pathogen and (ii) disentangle the phylogeographic relationships within the genetically conserved cluster of African M. ulcerans. Our analyses identified 23 different African insertion sequence element single nucleotide polymorphism (ISE-SNP) types that dominate in different areas where Buruli ulcer is endemic. These ISE-SNP types appear to be the initial stages of clonal diversification from a common, possibly ancestral ISE-SNP type. ISE-SNP types were found unevenly distributed over the greater West African hydrological drainage basins. Our findings suggest that geographical barriers bordering the basins to some extent prevented bacterial gene flow between basins and that this resulted in independent focal transmission clusters associated with the hydrological drainage areas. Different phylogenetic methods yielded two well-supported sister clades within the African ISE-SNP types. The ISE-SNP types from the "pan-African clade" were found to be widespread throughout Africa, while the ISE-SNP types of the "Gabonese/Cameroonian clade" were much rarer and found in a more restricted area, which suggested that the latter clade evolved more recently. Additionally, the Gabonese/Cameroonian clade was found to form a strongly supported monophyletic group with Papua New Guinean ISE-SNP type 8, which is unrelated to other Southeast Asian ISE-SNP types.

  7. ViVar: a comprehensive platform for the analysis and visualization of structural genomic variation.

    Directory of Open Access Journals (Sweden)

    Tom Sante

    Full Text Available Structural genomic variations play an important role in human disease and phenotypic diversity. With the rise of high-throughput sequencing tools, mate-pair/paired-end/single-read sequencing has become an important technique for the detection and exploration of structural variation. Several analysis tools exist to handle different parts and aspects of such sequencing-based structural variation analysis pipelines. A comprehensive analysis platform to handle all steps, from processing the sequencing data to the discovery and visualization of structural variants, is missing. The ViVar platform is built to handle the discovery of structural variants, from depth-of-coverage analysis and aberrant read pair clustering to split read analysis. ViVar provides you with powerful visualization options, enables easy reporting of results and offers better usability and data management. The platform facilitates the processing, analysis and visualization of structural variation based on massively parallel sequencing data, enabling the rapid identification of disease loci or genes. ViVar allows you to scale your analysis with your work load over multiple (cloud) servers, has user access control to keep