WorldWideScience

Sample records for analysis incorporating genomic

  1. An approach to incorporate linkage disequilibrium structure into genomic association analysis

    Institute of Scientific and Technical Information of China (English)

    Fengyu Zhang; Diane Wagener

    2008-01-01

    In this study, we propose using principal component analysis (PCA) and a regression model to incorporate linkage disequilibrium (LD) into genomic association data analysis. To accommodate LD in genomic data and reduce multiple testing, we suggest performing PCA and extracting the principal component scores to capture the variation in the genomic data, after which regression analysis is used to assess the association of the disease with the principal component scores. An empirical analysis shows that the genotype-based correlation matrix and the haplotype-based LD matrix produce similar PCA results. The principal component score appears to be more powerful in detecting genetic association because it is quantitatively measured and may capture the effect of multiple loci.
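
    The two-step idea in this abstract (PCA on correlated markers, then regression on the component scores) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are simulated, the function names `pc_scores` and `regress_on_scores` are hypothetical, and ordinary least squares on a quantitative trait stands in for whatever regression model the paper actually used.

```python
import numpy as np

def pc_scores(genotypes, n_components=1):
    """Principal component scores from the genotype correlation matrix."""
    g = np.asarray(genotypes, dtype=float)
    # Standardize each SNP column so PCA acts on the correlation matrix.
    z = (g - g.mean(axis=0)) / g.std(axis=0)
    # SVD of the standardized matrix: rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[:n_components].T

def regress_on_scores(scores, phenotype):
    """Least-squares fit of phenotype on PC scores: [intercept, slopes...]."""
    x = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(x, np.asarray(phenotype, dtype=float), rcond=None)
    return coef

# Simulated toy data (assumption): three SNPs in LD through a shared latent factor.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=n)
genotypes = np.column_stack([
    np.clip(np.round(1 + latent + rng.normal(scale=0.5, size=n)), 0, 2)
    for _ in range(3)
])
phenotype = 0.8 * latent + rng.normal(scale=0.5, size=n)

scores = pc_scores(genotypes)            # one score per individual
coef = regress_on_scores(scores, phenotype)
print("intercept and slope on PC1:", coef)
```

    A single test on the first component replaces three correlated per-SNP tests, which is the multiple-testing reduction the abstract describes. The sign of a principal component is arbitrary, so only the magnitude of the slope is meaningful here.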

  2. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics

  3. Incorporating Genetics and Genomics into Nursing Practice: A Demonstration

    OpenAIRE

    Susan D. Fisher; Mary Lou Clawson; Elizabeth L. Pestka; Laura M. Junglen

    2008-01-01

    This article describes how nurses who previously had not focused on genetic and genomic care realized this knowledge was needed to provide optimal care to their patients and evolved their practice to include essential nursing genetic and genomic competencies. It describes the strategies used to gain the genetic and genomic competencies needed to care for hereditary hemorrhagic telangiectasia (HHT) patients, illustrates genetic and genomic competencies in practice, and delineates nursing’s con...

  4. Incorporating Experience Curves in Appliance Standards Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Garbesi, Karina; Chan, Peter; Greenblatt, Jeffery; Kantner, Colleen; Lekov, Alex; Meyers, Stephen; Rosenquist, Gregory; Buskirk, Robert Van; Yang, Hung-Chia; Desroches, Louis-Benoit

    2011-10-31

    The technical analyses in support of U.S. energy conservation standards for residential appliances and commercial equipment have typically assumed that manufacturing costs and retail prices remain constant during the projected 30-year analysis period. There is, however, considerable evidence that this assumption does not reflect real market prices. Costs and prices generally fall in relation to cumulative production, a phenomenon known as experience and modeled by a fairly robust empirical experience curve. Using price data from the Bureau of Labor Statistics, and shipment data obtained as part of the standards analysis process, we present U.S. experience curves for room air conditioners, clothes dryers, central air conditioners, furnaces, and refrigerators and freezers. These allow us to develop more representative appliance price projections than the assumption-based approach of constant prices. These experience curves were incorporated into recent energy conservation standards for these products. The impact on the national modeling can be significant, often increasing the net present value of potential standard levels in the analysis. In some cases a previously cost-negative potential standard level demonstrates a benefit when incorporating experience. These results imply that past energy conservation standards analyses may have undervalued the economic benefits of potential standard levels.
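
    The relationship between price and cumulative production described above is commonly written as a power law, P(X) = P0 * (X / X0)^(-b), with a learning rate of 1 - 2^(-b) per doubling of cumulative production. The sketch below assumes that form; the abstract does not state the exact parameterization the authors used, and the data and function names are hypothetical.

```python
import math

def experience_price(p0, x0, x, b):
    """Price at cumulative production x, given price p0 at x0 and exponent b."""
    return p0 * (x / x0) ** (-b)

def learning_rate(b):
    """Fractional price drop for each doubling of cumulative production."""
    return 1.0 - 2.0 ** (-b)

def fit_experience_exponent(cum_production, prices):
    """Estimate b as the negated least-squares slope of log(price) on log(production)."""
    lx = [math.log(v) for v in cum_production]
    lp = [math.log(v) for v in prices]
    mx = sum(lx) / len(lx)
    mp = sum(lp) / len(lp)
    slope = (sum((a - mx) * (c - mp) for a, c in zip(lx, lp))
             / sum((a - mx) ** 2 for a in lx))
    return -slope

# Hypothetical data: the price falls 20% with every doubling of cumulative output.
cum_production = [1.0, 2.0, 4.0, 8.0]
prices = [100.0, 80.0, 64.0, 51.2]

b = fit_experience_exponent(cum_production, prices)
projected = experience_price(prices[-1], cum_production[-1], 16.0, b)
print(f"b = {b:.4f}, learning rate = {learning_rate(b):.2%}, next price = {projected:.2f}")
```

    A constant-price assumption corresponds to b = 0; any positive b lowers projected equipment prices over the analysis period, which is why including experience raises the net present value in the analyses described.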

  5. Incorporating experience curves in appliance standards analysis

    International Nuclear Information System (INIS)

    There exists considerable evidence that manufacturing costs and consumer prices of residential appliances have decreased in real terms over the last several decades. This phenomenon is generally attributable to manufacturing efficiency gained with cumulative experience producing a certain good, and is modeled by an empirical experience curve. The technical analyses conducted in support of U.S. energy conservation standards for residential appliances and commercial equipment have, until recently, assumed that manufacturing costs and retail prices remain constant during the projected 30-year analysis period. This assumption does not reflect real market price dynamics. Using price data from the Bureau of Labor Statistics, we present U.S. experience curves for room air conditioners, clothes dryers, central air conditioners, furnaces, and refrigerators and freezers. These experience curves were incorporated into recent energy conservation standards analyses for these products. Including experience curves increases the national consumer net present value of potential standard levels. In some cases a potential standard level exhibits a net benefit when considering experience, whereas without experience it exhibits a net cost. These results highlight the importance of modeling more representative market prices. - Highlights: ► Past appliance standards analyses have assumed constant equipment prices. ► There is considerable evidence of consistent real price declines. ► We incorporate experience curves for several large appliances into the analysis. ► The revised analyses demonstrate larger net present values of potential standards. ► The results imply that past standards analyses may have undervalued benefits.

  6. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2003-01-01

    This thesis describes a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene transfer, even to the extent that we can use the gene content to make genome phylogenies. Usi

  7. MBGD update 2015: microbial genome database for flexible ortholog analysis utilizing a diverse set of genomic data.

    Science.gov (United States)

    Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

    2015-01-01

    The microbial genome database for comparative analysis (MBGD) (available at http://mbgd.genome.ad.jp/) is a comprehensive ortholog database for flexible comparative analysis of microbial genomes, where the users are allowed to create an ortholog table among any specified set of organisms. Because of the rapid increase in microbial genome data owing to the next-generation sequencing technology, it becomes increasingly challenging to maintain high-quality orthology relationships while allowing the users to incorporate the latest genomic data available into an analysis. Because many of the recently accumulating genomic data are draft genome sequences for which some complete genome sequences of the same or closely related species are available, MBGD now stores draft genome data and allows the users to incorporate them into a user-specific ortholog database using the MyMBGD functionality. In this function, draft genome data are incorporated into an existing ortholog table created only from the complete genome data in an incremental manner to prevent low-quality draft data from affecting clustering results. In addition, to provide high-quality orthology relationships, the standard ortholog table containing all the representative genomes, which is first created by the rapid classification program DomClust, is now refined using DomRefine, a recently developed program for improving domain-level clustering using multiple sequence alignment information. PMID:25398900

  8. GWAMA: software for genome-wide association meta-analysis

    Directory of Open Access Journals (Sweden)

    Mägi Reedik

    2010-05-01

    Background: Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way to improve power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical software analysis packages incorporate routines for meta-analysis, they are ill equipped to meet the challenges of the scale and complexity of data generated in genome-wide association studies. Results: We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error trapping facilities, and provides a range of meta-analysis summary statistics. The software is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. Conclusions: The GWAMA (Genome-Wide Association Meta-Analysis) software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. Software with source files, documentation and example data files is freely available online at http://www.well.ox.ac.uk/GWAMA.

  9. Paired de Bruijn Graphs: A Novel Approach for Incorporating Mate Pair Information into Genome Assemblers

    OpenAIRE

    Medvedev, Paul; Pham, Son; Chaisson, Mark; Tesler, Glenn; Pevzner, Pavel

    2011-01-01

    The recent proliferation of next generation sequencing with short reads has enabled many new experimental opportunities but, at the same time, has raised formidable computational challenges in genome assembly. One of the key advances that has led to an improvement in contig lengths has been mate pairs, which facilitate the assembly of repeating regions. Mate pairs have been algorithmically incorporated into most next generation assemblers as various heuristic post-processing steps to correct ...

  10. Genetic association analysis of complex diseases incorporating intermediate phenotype information.

    Directory of Open Access Journals (Sweden)

    Yafang Li

    Genetic researchers often collect disease-related quantitative traits in addition to disease status because they are interested in understanding the pathophysiology of disease processes. In genome-wide association (GWA) studies, these quantitative phenotypes may be relevant to disease development and serve as intermediate phenotypes, or they could be behavioral or other risk factors that predict disease risk. Statistical tests combining both disease status and quantitative risk factors should be more powerful than case-control studies, as the former incorporate more information about the disease. In this paper, we proposed a modified inverse-variance weighted meta-analysis method to combine disease status and quantitative intermediate phenotype information. The simulation results showed that when an intermediate phenotype was available, the inverse-variance weighted method had more power than did a case-control study of complex diseases, especially in identifying susceptibility loci having minor effects. We further applied this modified meta-analysis to a study of imputed lung cancer genotypes with smoking data in 1154 cases and 1137 matched controls. The most significant SNPs came from the CHRNA3-CHRNA5-CHRNB4 region on chromosome 15q24-25.1, which has been replicated in many other studies. Our results confirm that this CHRNA region is associated with both lung cancer development and smoking behavior. We also detected three significant SNPs (rs1800469, rs1982072, and rs2241714) in the promoter region of the TGFB1 gene on chromosome 19 (p = 1.46×10^-5, 1.18×10^-5, and 6.57×10^-6, respectively). The SNP rs1800469 is reported to be associated with chronic obstructive pulmonary disease and lung cancer in cigarette smokers. The present study is the first GWA study to replicate this result. Signals in the 3q26 region were also identified in the meta-analysis. We demonstrate the intermediate phenotype can potentially enhance the power of complex
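
    For orientation, the standard fixed-effect inverse-variance weighted combination that such methods build on can be sketched as below. This is the textbook estimator, not the modified method proposed in the paper, whose details are not given in the abstract; the effect sizes are hypothetical.

```python
import math

def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance weighted combination of effect estimates.

    Returns (pooled beta, pooled standard error, z statistic, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]      # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Hypothetical per-analysis estimates for one SNP (effect size, standard error).
beta, se, z, p = inverse_variance_meta([0.2, 0.4], [0.1, 0.1])
print(f"beta = {beta:.3f}, se = {se:.4f}, z = {z:.3f}, p = {p:.2e}")
```

    Because the pooled standard error shrinks as more estimates are combined, the pooled test can reach significance where each input alone would not, which is the power gain the abstract refers to.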

  11. Whole mitochondrial genome sequencing of domestic horses reveals incorporation of extensive wild horse diversity during domestication

    Directory of Open Access Journals (Sweden)

    Lippold Sebastian

    2011-11-01

    Background: DNA target enrichment by micro-array capture combined with high throughput sequencing technologies provides the possibility to obtain large amounts of sequence data (e.g. whole mitochondrial DNA genomes) from multiple individuals at relatively low costs. Previously, whole mitochondrial genome data for domestic horses (Equus caballus) were limited to only a few specimens, and only short parts of the mtDNA genome (especially the hypervariable region) were investigated for larger sample sets. Results: In this study we investigated whole mitochondrial genomes of 59 domestic horses from 44 breeds and a single Przewalski horse (Equus przewalski) using a recently described multiplex micro-array capture approach. We found 473 variable positions within the domestic horses, 292 of which are parsimony-informative, providing a well resolved phylogenetic tree. Our divergence time estimate suggests that the mitochondrial genomes of modern horse breeds shared a common ancestor around 93,000 years ago and no later than 38,000 years ago. A Bayesian skyline plot (BSP) reveals a significant population expansion beginning 6,000-8,000 years ago with an ongoing exponential growth until the present, similar to other domestic animal species. Our data further suggest that a large sample of wild horse diversity was incorporated into the domestic population; specifically, at least 46 of the mtDNA lineages observed in domestic horses (73%) already existed before the beginning of domestication about 5,000 years ago. Conclusions: Our study provides a window into the maternal origins of extant domestic horses and confirms that modern domestic breeds present a wide sample of the mtDNA diversity found in ancestral, now extinct, wild horse populations. The data obtained allow us to detect a population expansion event coinciding with the beginning of domestication and to estimate both the minimum number of female horses incorporated into the domestic gene pool and the

  12. Coronavirus Genomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Kwok-Yung Yuen

    2010-08-01

    The drastic increase in the number of coronaviruses discovered and coronavirus genomes being sequenced has given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb) among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid) and downstream of the nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera exist: Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci placed the time of the most recent common ancestor of human/civet SARS-related coronavirus at 1999-2002, with an estimated substitution rate of 4×10^-4 to 2×10^-2 substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV), between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV) type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1). Codon usage bias in coronaviruses was observed, with HCoV-HKU1 showing the most extreme bias, and cytosine deamination and selection of CpG suppressed clones are the two major independent biological forces that shape such codon usage bias in coronaviruses.

  13. Classifying Genomic Sequences by Sequence Feature Analysis

    Institute of Scientific and Technical Information of China (English)

    Zhi-Hua Liu; Dian Jiao; Xiao Sun

    2005-01-01

    Traditional sequence analysis depends on sequence alignment. In this study, we analyzed various functional regions of the human genome based on sequence features, including word frequency, dinucleotide relative abundance, and base-base correlation. We analyzed human chromosome 22 and classified the upstream, exon, intron, downstream, and intergenic regions by principal component analysis and discriminant analysis of these features. The results show that the functional regions of the genome can be classified based on sequence features and discriminant analysis.
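
    Two of the alignment-free features named in this abstract, word frequency and dinucleotide relative abundance, can be computed as in the sketch below. It assumes the usual definition rho_XY = f_XY / (f_X * f_Y) on a single strand; the abstract does not say whether the authors used this or a strand-symmetrized variant, and the function names are hypothetical.

```python
from collections import Counter

def word_frequencies(seq, k):
    """Relative frequency of every overlapping k-letter word in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def dinucleotide_relative_abundance(seq):
    """rho_XY = f_XY / (f_X * f_Y) for each dinucleotide observed in seq."""
    mono = word_frequencies(seq, 1)
    di = word_frequencies(seq, 2)
    return {xy: f / (mono[xy[0]] * mono[xy[1]]) for xy, f in di.items()}

# Toy sequence (assumption); real input would be a genomic region.
rho = dinucleotide_relative_abundance("ACGTACGT")
for dinucleotide, value in sorted(rho.items()):
    print(dinucleotide, round(value, 3))
```

    Values near 1 mean a dinucleotide occurs about as often as its base composition predicts; the resulting vector per region is the kind of feature that can then be passed to principal component or discriminant analysis.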

  14. The Cancer Genome Atlas ovarian cancer analysis

    Science.gov (United States)

    An analysis of genomic changes in ovarian cancer has provided the most comprehensive and integrated view of cancer genes for any cancer type to date. Ovarian serous adenocarcinoma tumors from 500 patients were examined by The Cancer Genome Atlas (TCGA) Re

  15. Structural and functional analysis of rice genome

    Indian Academy of Sciences (India)

    Akhilesh K. Tyagi; Jitendra P. Khurana; Paramjit Khurana; Saurabh Raghuvanshi; Anupama Gaur; Anita Kapur; Vikrant Gupta; Dibyendu Kumar; V. Ravi; Shubha Vij; Parul Khurana; Sulabha Sharma

    2004-04-01

    Rice is an excellent system for plant genomics as it represents a modest size genome of 430 Mb. It feeds more than half the population of the world. Draft sequences of the rice genome, derived by whole-genome shotgun approach at relatively low coverage (4–6 X), were published and the International Rice Genome Sequencing Project (IRGSP) declared high quality (>10 X), genetically anchored, phase 2 level sequence in 2002. In addition, phase 3 level finished sequence of chromosomes 1, 4 and 10 (out of 12 chromosomes of rice) has already been reported by scientists from IRGSP consortium. Various estimates of genes in rice place the number at > 50,000. Already, over 28,000 full-length cDNAs have been sequenced, most of which map to genetically anchored genome sequence. Such information is very useful in revealing novel features of macro- and micro-level synteny of rice genome with other cereals. Microarray analysis is unraveling the identity of rice genes expressing in temporal and spatial manner and should help target candidate genes useful for improving traits of agronomic importance. Simultaneously, functional analysis of rice genome has been initiated by marker-based characterization of useful genes and employing functional knock-outs created by mutation or gene tagging. Integration of this enormous information is expected to catalyze tremendous activity on basic and applied aspects of rice genomics.

  16. Genome sequence and analysis of Lactobacillus helveticus

    Directory of Open Access Journals (Sweden)

    Paola Cremonesi

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of L. helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genomes of five L. helveticus strains were sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential, including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles, including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones.

  17. Incorporating Basic Optical Microscopy in the Instrumental Analysis Laboratory

    Science.gov (United States)

    Flowers, Paul A.

    2011-01-01

    A simple and versatile approach to incorporating basic optical microscopy in the undergraduate instrumental analysis laboratory is described. Attaching a miniature CCD spectrometer to the video port of a standard compound microscope yields a visible microspectrophotometer suitable for student investigations of fundamental spectrometry concepts,…

  18. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    Next generation sequencing (NGS) has revolutionized the field of genomics and its wide range of applications has resulted in the genome-wide analysis of hundreds of species and the development of thousands of computational tools. This thesis represents my work on NGS analysis of four species, Lotus...... japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts: genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... agricultural and biological importance. Its capacity to form symbiotic relationships with rhizobia and mycorrhizal fungi has fascinated researchers for years. Lotus has a small genome of approximately 470 Mb and a short life cycle of 2 to 3 months, which has made Lotus a model legume plant for many molecular...

  19. Whole genome analysis of a Vietnamese trio

    Indian Academy of Sciences (India)

    Dang Thanh Hai; Nguyen Dai Thanh; Pham Thi Minh Trang; Le Si Quang; Phan Thi Thu Hang; Dang Cao Cuong; Hoang Kim Phuc; Nguyen Huu Duc; Do Duc Dong; Bui Quang Minh; Pham Bao Son; Le Sy Vinh

    2015-03-01

    We here present the first whole genome analysis of an anonymous Kinh Vietnamese (KHV) trio whose genomes were deeply sequenced to 30-fold average coverage. The resulting short reads covered 99.91% of the human reference genome (GRCh37d5). We identified 4,719,412 SNPs and 827,385 short indels that satisfied the Mendelian inheritance law. Among them, 109,914 (2.3%) SNPs and 59,119 (7.1%) short indels were novel. We also detected 30,171 structural variants, of which 27,604 (91.5%) were large indels. There were 6,681 large indels in the range 0.1–100 kbp occurring in the child genome that were also confirmed in either the father or mother genome. We compared these large indels against the DGV database and found that 1,499 (22.44%) were KHV specific. De novo assembly of high-quality unmapped reads yielded 789 contigs with length ≥ 300 bp. There were 235 contigs from the child genome, of which 199 (84.7%) were significantly matched with at least one contig from the father or mother genome. Blasting these 199 contigs against other alternative human genomes revealed 4 novel contigs. The novel variants identified in our study demonstrate the necessity of conducting more genome-wide studies not only for Kinh but also for other ethnic groups in Vietnam.

  20. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-25

    Since the human genome draft sequence was first made public in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples of large-scale studies of human genome variations: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 Genomes Data (2,504 individuals) (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). If we can integrate all three data sets into a single volume of data, we should be able to conduct a more detailed analysis of human genome variations for a total of 4,861 individuals (= 1,417 + 940 + 2,504 individuals). In fact, we successfully integrated these three data sets using information on the reference human genome sequence, and we conducted the big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, which was not possible through analysis of each of the three data sets separately. Here, we report the outcome of this kind of big data analysis and discuss the evolutionary significance of human genomic variations. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.

  1. Mathematical Analysis of Genomic Evolution

    Directory of Open Access Journals (Sweden)

    Cedric Green

    2011-01-01

    Changes in nucleotide sequences, or mutations, accumulate from generation to generation in the genomes of all living organisms. The mutations can be advantageous, deleterious, or neutral. The goal of this project is to determine the number of advantageous mutations it takes to get human (Homo sapiens) DNA from the DNA of genetically distinct organisms. We do this by collecting the genomic data of such organisms, and estimating the number of mutations it takes to transform yeast (Saccharomyces cerevisiae) DNA to the DNA of a human. We calculate the typical number of mutations occurring annually through the organism's average life span and the average mutation rate. This allows us to determine the total number of mutations as well as the probability of advantageous mutations. Not surprisingly, this probability proves to be fairly small. A more precise estimate can be determined by accounting for the differences in the chromosomal structure and phenomena like horizontal gene transfer.

  2. A Distance Measure for Genome Phylogenetic Analysis

    Science.gov (United States)

    Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

    Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures, such as evolutionary edit distance of gene order and gene content, are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group, whose genomes are known to contain many horizontally transferred genes.

  3. Genomic analysis of Fusarium verticillioides.

    Science.gov (United States)

    Brown, D W; Butchko, R A E; Proctor, R H

    2008-09-01

    Fusarium verticillioides (teleomorph Gibberella moniliformis) can be either an endophyte of maize, causing no visible disease, or a pathogen causing disease of ears, stalks, roots and seedlings. At any stage, this fungus can synthesize fumonisins, a family of mycotoxins structurally similar to the sphingolipid sphinganine. Ingestion of fumonisin-contaminated maize has been associated with a number of animal diseases, including cancer in rodents; exposure has been correlated with human oesophageal cancer in some regions of the world, and some evidence suggests that fumonisins are a risk factor for neural tube defects. A primary goal of the authors' laboratory is to eliminate fumonisin contamination of maize and maize products. Understanding how and why these toxins are made and the F. verticillioides-maize disease process will allow one to develop novel strategies to limit tissue destruction (rot) and fumonisin production. To meet this goal, genomic sequence data, expressed sequence tags (ESTs) and microarrays are being used to identify F. verticillioides genes involved in the biosynthesis of toxins and plant pathogenesis. This paper describes the current status of F. verticillioides genomic resources and three approaches being used to mine microarray data from a wild-type strain cultured in liquid fumonisin production medium for 12, 24, 48, 72, 96 and 120 h. Taken together, these approaches demonstrate the power of microarray technology to provide information on different biological processes. PMID:19238625

  4. Incorporating prior knowledge to facilitate discoveries in a genome-wide association study on age-related macular degeneration

    Directory of Open Access Journals (Sweden)

    Lee Wen-Chung

    2010-01-01

    Background: Substantial genotyping data produced by current high-throughput technologies have brought opportunities and difficulties. With the number of single-nucleotide polymorphisms (SNPs) going into millions comes the harsh challenge of multiple-testing adjustment. However, even with the false discovery rate (FDR) control approach, a genome-wide association study (GWAS) may still fall short of discovering any true positive gene, particularly when it has a relatively small sample size. Findings: To counteract such a harsh multiple-testing penalty, in this report, we incorporate findings from previous linkage and association studies to re-analyze a GWAS on age-related macular degeneration. While previous Bonferroni correction and the traditional FDR approach detected only one significant SNP (rs380390), here we have been able to detect seven significant SNPs with an easy-to-implement prioritized subset analysis (PSA) with the overall FDR controlled at 0.05. These include SNPs within three genes: CFH, CFHR4, and SGCD. Conclusions: Based on the success of this example, we advocate using the simple method of PSA to facilitate discoveries in future GWASs.
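
    A rough sketch of how a prioritized subset analysis might be coded is given below. It assumes, since the abstract does not spell out the procedure, that PSA amounts to running the Benjamini-Hochberg step-up procedure separately within the prioritized SNPs and within the remaining SNPs at the same level. The function names and the toy p-values in the usage note are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value passes.
        if pvalues[idx] <= rank * alpha / m:
            largest_rank = rank
    return set(order[:largest_rank])


def prioritized_subset_analysis(pvalues, prioritized, alpha=0.05):
    """Apply BH separately to prioritized and non-prioritized SNPs.

    Assumption: this two-subset split is one reading of PSA, not a
    verified reimplementation of the published method.
    """
    prioritized = set(prioritized)
    inside = [i for i in range(len(pvalues)) if i in prioritized]
    outside = [i for i in range(len(pvalues)) if i not in prioritized]
    hits = set()
    for group in (inside, outside):
        rejected = benjamini_hochberg([pvalues[i] for i in group], alpha)
        hits |= {group[j] for j in rejected}
    return hits
```

    With the toy p-values [0.001, 0.004, 0.02, 0.03, 0.2, 0.5, 0.6, 0.7, 0.8, 0.9], plain BH at 0.05 rejects only the first two tests, whereas prioritizing indices 2 and 3 lets the sketch reject all of the first four. This mirrors the gain in discoveries the abstract describes, on made-up numbers.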

  5. Incorporation of advanced accident analysis methodology into safety analysis reports

    International Nuclear Information System (INIS)

    The IAEA Safety Guide on Safety Assessment and Verification defines that the aim of the safety analysis should be by means of appropriate analytical tools to establish and confirm the design basis for the items important to safety, and to ensure that the overall plant design is capable of meeting the prescribed and acceptable limits for radiation doses and releases for each plant condition category. Practical guidance on how to perform accident analyses of nuclear power plants (NPPs) is provided by the IAEA Safety Report on Accident Analysis for Nuclear Power Plants. The safety analyses are performed both in the form of deterministic and probabilistic analyses for NPPs. It is customary to refer to deterministic safety analyses as accident analyses. This report discusses the aspects of using the advanced accident analysis methods to carry out accident analyses in order to introduce them into the Safety Analysis Reports (SARs). In relation to the SAR, purposes of deterministic safety analysis can be further specified as (1) to demonstrate compliance with specific regulatory acceptance criteria; (2) to complement other analyses and evaluations in defining a complete set of design and operating requirements; (3) to identify and quantify limiting safety system set points and limiting conditions for operation to be used in the NPP limits and conditions; (4) to justify appropriateness of the technical solutions employed in the fulfillment of predetermined safety requirements. The essential parts of accident analyses are performed by applying sophisticated computer code packages, which have been specifically developed for this purpose. These code packages include mainly thermal-hydraulic system codes and reactor dynamics codes meant for the transient and accident analyses. There are also specific codes such as those for the containment thermal-hydraulics, for the radiological consequences and for severe accident analyses. 
    In some cases, codes of a more general nature such

  6. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Giltae Song

    The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.

  7. Joint Analysis of Functional Genomic Data and Genome-wide Association Studies of 18 Human Traits

    OpenAIRE

    Pickrell, Joseph K.

    2014-01-01

    Annotations of gene structures and regulatory elements can inform genome-wide association studies (GWASs). However, choosing the relevant annotations for interpreting an association study of a given trait remains challenging. I describe a statistical model that uses association statistics computed across the genome to identify classes of genomic elements that are enriched with or depleted of loci influencing a trait. The model naturally incorporates multiple types of annotations. I applied th...

  8. Comparative genome analysis of Basidiomycete fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor

    2013-08-07

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48 percent of basidiomycete proteins are unique to the phylum with nearly half of those (22 percent) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  9. e-Fungi: a data resource for comparative analysis of fungal genomes

    Directory of Open Access Journals (Sweden)

    Hubbard Simon J

    2007-11-01

    Background: The number of sequenced fungal genomes is ever increasing, with about 200 genomes already fully sequenced or in progress. Only a small percentage of those genomes have been comprehensively studied, for example using techniques from functional genomics. Comparative analysis has proven to be a useful strategy for enhancing our understanding of evolutionary biology and of the less well understood genomes. However, the data required for these analyses tend to be distributed in various heterogeneous data sources, making systematic comparative studies a cumbersome task. Furthermore, comparative analyses benefit from close integration of derived data sets that cluster genes or organisms in a way that eases the expression of requests that clarify points of similarity or difference between species. Description: To support systematic comparative analyses of fungal genomes we have developed the e-Fungi database, which integrates a variety of data for more than 30 fungal genomes. Publicly available genome data, functional annotations, and pathway information have been integrated into a single data repository and complemented with results of comparative analyses, such as MCL and OrthoMCL cluster analysis, and predictions of signaling proteins and the sub-cellular localisation of proteins. To access the data, a library of analysis tasks is available through a web interface. The analysis tasks are motivated by recent comparative genomics studies, and aim to support the study of evolutionary biology as well as community efforts for improving the annotation of genomes. Web services for each query are also available, enabling the tasks to be incorporated into workflows. Conclusion: The e-Fungi database provides fungal biologists with a resource for comparative studies of a large range of fungal genomes. Its analysis library supports the comparative study of genome data, functional annotation, and results of large scale analyses over all the

  10. User-level sentiment analysis incorporating social networks

    CERN Document Server

    Tan, Chenhao; Tang, Jie; Jiang, Long; Zhou, Ming; Li, Ping

    2011-01-01

    We show that information about social relationships can be used to improve user-level sentiment analysis. The main motivation behind our approach is that users that are somehow "connected" may be more likely to hold similar opinions; therefore, relationship information can complement what we can extract about a user's viewpoints from their utterances. Employing Twitter as a source for our experimental data, and working within a semi-supervised framework, we propose models that are induced either from the Twitter follower/followee network or from the network in Twitter formed by users referring to each other using "@" mentions. Our transductive learning results reveal that incorporating social-network information can indeed lead to statistically significant sentiment-classification improvements over the performance of an approach based on Support Vector Machines having access only to textual features.

  11. Physiological genomics analysis for Alzheimer's disease.

    Science.gov (United States)

    Wiwanitkit, Viroj

    2013-01-01

    Alzheimer's disease is a common form of dementia that occurs in all countries around the world. This neurological disorder affects millions of people and has become an important concern in modern neurology. There is much research on the pathogenesis of Alzheimer's disease. Although the disease has been recognized for a long time, it is still not clear whether it is a genetic disorder. Physiological genomics is a new approach that is useful for tracing function to genes within the human genome and can be applied to questions about the underlying pathobiology of complex diseases. Physiogenomics can support a systemic approach to the study of pathophysiology, and genomics might provide useful information to better understand the pathogenesis of Alzheimer's disease. Recent advances in genomics techniques make it possible to trace the underlying genomics of disease. In this work, a physiological genomics analysis for Alzheimer's disease was performed using a standard published technique. According to this work, there are 20 identified physiogenomics relationships on several chromosomes. Considering the results, the HADH2 gene on chromosome X, APBA1 gene on chromosome 9, AGER gene on chromosome 6, GSK3B gene on chromosome 3, CDKHR1 gene on chromosome 17, APPBP1 gene on chromosome 16, APBA2 gene on chromosome 15, GAL gene on chromosome 11, and APLP2 gene on chromosome 11 have the highest physiogenomics score (9.26), while the CASP3 gene on chromosome 4 and the SNCA gene on chromosome 4 have the lowest physiogenomics score (7.44). The results from this study confirm that Alzheimer's disease has a polygenomic origin. PMID:23661967

  12. A factor analysis model for functional genomics

    OpenAIRE

    Shioda Romy; Kustra Rafal; Zhu Mu

    2006-01-01

    Abstract Background Expression array data are used to predict biological functions of uncharacterized genes by comparing their expression profiles to those of characterized genes. While biologically plausible, this is both statistically and computationally challenging. Typical approaches are computationally expensive and ignore correlations among expression profiles and functional categories. Results We propose a factor analysis model (FAM) for functional genomics and give a two-step algorith...

  13. Typical Genomic Framework on Disease Analysis

    OpenAIRE

    J. Stanly Thomas; Dr.N Rajkumar

    2015-01-01

    The major and challenging role of the doctor in human life is to predict as well as diagnose the disease that has infected the human body. This typical genomic framework on disease analysis algorithm is designed to store and derive each and every gene characteristic, such as shape, weight, location and normal growth culture. Whenever a disease report is fed into this data mining algorithm, it triggers the similarity test built upon the data mining classification rules. A gene is usually co...

  14. [Computational genome analysis of three marine algoviruses].

    Science.gov (United States)

    Stepanova, O A; Boĭko, A L; Shcherbatenko, I S

    2013-01-01

    Computational analysis of the genomic sequences of three new marine algoviruses, Tetraselmis viridis virus (TvV-S20 and TvV-SI1 strains) and Dunaliella viridis virus (DvV-SI2 strain), was conducted. Both considerable similarity and essential distinctions between the studied strains and the most studied marine algoviruses of the Phycodnaviridae family were revealed. Our data show that the tested strains are new viruses with the following features: they alone were isolated from the marine eukaryotic microalgae T. viridis and D. viridis; the coding sequences (CDSs) of their genomes are localized mainly on one of the DNA strands and form several clusters with short intergenic spaces; there are considerable variations in genome structure within viruses and their strains; viral genomic DNA has a high GC-content (55.5 - 67.4%); their genes contain no well-known optimal contexts of translation start codons or contexts of terminal codon read-through; the vast majority of viral genes and proteins do not have any matches in gene banks. PMID:24479317

  15. Incorporating heifer feed efficiency in the Australian selection index using genomic selection.

    Science.gov (United States)

    Gonzalez-Recio, O; Pryce, J E; Haile-Mariam, M; Hayes, B J

    2014-01-01

    The economic benefit of expanding the Australian Profit Ranking (APR) index to include residual feed intake (RFI) was evaluated using a multitrait selection index. This required the estimation of genetic parameters for RFI and genetic correlations using single nucleotide polymorphism data (genomic) correlations with other traits. Heritabilities of RFI, dry matter intake (DMI), and all the traits in the APR (milk, fat, and protein yields; somatic cell count; fertility; survival; milking speed; and temperament), and genomic correlations between these traits were estimated using a Bayesian framework, using data from 843 growing Holstein heifers with phenotypes for DMI and RFI, and bulls with records for the other traits. Heritability estimates of DMI and RFI were 0.44 and 0.33, respectively, and the genomic correlation between them was 0.03 and nonsignificant. The genomic correlations between the feed-efficiency traits and milk yield traits were also close to zero, ranging between -0.11 and 0.10. Positive genomic correlations were found for DMI with stature (0.16) and with overall type (0.14), suggesting that taller cows eat more as heifers. One issue was that the genomic correlation estimates for RFI with calving interval (ClvI) and with body condition score were both unfavorable (-0.13 and 0.71 respectively), suggesting an antagonism between feed efficiency and fertility. However, because of the relatively small numbers of animals in this study, a large 95% probability interval existed for the genomic correlation between RFI and ClvI (-0.66, 0.36). Given these parameters, and a genetic correlation between heifer and lactating cow RFI of 0.67, inclusion of RFI in the APR index would reduce RFI by 1.76 kg/cow per year. Including RFI in the APR would result in the national Australian Holstein herd consuming 1.73 × 10^6 kg less feed, which is worth 0.55 million Australian dollars (A$) per year and is 3% greater than is currently possible to achieve. Other traits

  16. Analysis of the allohexaploid bread wheat genome (Triticum aestivum) using comparative whole genome shotgun sequencing

    Science.gov (United States)

    The large 17 Gb allopolyploid genome of bread wheat is a major challenge for genome analysis because it is composed of three closely related and independently maintained genomes, with genes dispersed as small “islands” separated by vast tracts of repetitive DNA. We used a novel comparative genomi...

  17. Mycobacterial species as case-study of comparative genome analysis

    DEFF Research Database (Denmark)

    Zakham, F.; Belayachi, L.; Ussery, David; Akrim, M.; Benjouad, A.; El Aouad, R.; Ennaji, M. M.

    2011-01-01

    evolutionary events of these species and improving drugs, vaccines, and diagnostics tools for controlling Mycobacterial diseases. In this present study we aim to outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K-10, M. bovis AF2122/97, M. bovis BCG str...... genomes, GC content, number of genes in different databases (GenBank, RefSeq, and Prodigal). The BLAST matrix of these genomes has been drawn to give a lot of information about the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genome have been...

  18. Incorporating concepts of inequality and inequity into health benefits analysis

    Directory of Open Access Journals (Sweden)

    Tuchmann Jessica L

    2006-03-01

    Background: Although environmental policy decisions are often based in part on both risk assessment information and environmental justice concerns, formalized approaches for addressing inequality or inequity when estimating the health benefits of pollution control have been lacking. Inequality indicators that fulfill basic axioms and agree with relevant definitions and concepts in health benefits analysis and environmental justice analysis can allow for quantitative examination of efficiency-equality tradeoffs in pollution control policies. Methods: To develop appropriate inequality indicators for health benefits analysis, we provide relevant definitions from the fields of risk assessment and environmental justice and consider the implications. We evaluate axioms proposed in past studies of inequality indicators and develop additional axioms relevant to this context. We survey the literature on previous applications of inequality indicators and evaluate five candidate indicators in reference to our proposed axioms. We present an illustrative pollution control example to determine whether our selected indicators provide interpretable information. Results and Conclusions: We conclude that an inequality indicator for health benefits analysis should not decrease when risk is transferred from a low-risk to high-risk person, that it should decrease when risk is transferred from a high-risk to low-risk person (Pigou-Dalton transfer principle), and that it should be able to have total inequality divided into its constituent parts (subgroup decomposability). We additionally propose that an ideal indicator should avoid value judgments about the relative importance of transfers at different percentiles of the risk distribution, incorporate health risk with evidence about differential susceptibility, include baseline distributions of risk, use appropriate geographic resolution and scope, and consider multiple competing policy alternatives. Given
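
    The two axioms named above can be checked numerically on a small example. The sketch below uses the Theil T index purely as an illustration of an indicator with both properties. The abstract is truncated before it names the indicators the authors selected, so this choice, the function names, and the risk values in the usage note are all assumptions.

```python
import math


def theil(risks):
    """Theil T index of a list of strictly positive individual risks."""
    n = len(risks)
    mean = sum(risks) / n
    return sum((x / mean) * math.log(x / mean) for x in risks) / n


def theil_decomposition(groups):
    """Split total Theil T into (within-group, between-group) components.

    `groups` is a list of lists of positive risks, one list per subgroup.
    Each subgroup is weighted by its share of total risk.
    """
    everyone = [x for g in groups for x in g]
    total = sum(everyone)
    mean = total / len(everyone)
    within = 0.0
    between = 0.0
    for g in groups:
        share = sum(g) / total
        group_mean = sum(g) / len(g)
        within += share * theil(g)
        between += share * math.log(group_mean / mean)
    return within, between
```

    For the hypothetical risks [1, 2, 3, 6], moving one unit from the highest-risk person to the lowest-risk person gives [2, 2, 3, 5] and lowers the index, while moving half a unit the other way raises it, as the Pigou-Dalton principle requires. The within and between components returned by `theil_decomposition` sum to the overall index, which is the subgroup decomposability property.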

  19. Incorporating group correlations in genome-wide association studies using smoothed group Lasso

    OpenAIRE

    Liu, Jin; Huang, Jian; Ma, Shuangge; Wang, Kai

    2012-01-01

    In genome-wide association studies, penalization is an important approach for identifying genetic markers associated with disease. Motivated by the fact that there exists natural grouping structure in single nucleotide polymorphisms and, more importantly, such groups are correlated, we propose a new penalization method for group variable selection which can properly accommodate the correlation between adjacent groups. This method is based on a combination of the group Lasso penalty and a quad...

  20. Sensitivity Analysis Case Study: Incorporating Organisational Factors in HRA

    International Nuclear Information System (INIS)

    The JCO event that occurred in September 1999 in Tokai-Mura, Japan was analysed to determine the contribution of latent organisational factors to the event and to determine a process by which these factors could be used to support HRA quantification. In a hybrid analysis conducted by the authors, 3 approaches were combined to support the sensitivity results reported herein. First, the error forcing context, human failure event(s), and unsafe acts associated with the event were determined by application of the ATHEANA HRA search method (NUREG 1624 Rev1) for retrospective analysis. As part of this screening and characterisation approach, error producing conditions (EPC) and violation producing conditions from HEART (Williams in Reason 1998) were examined for insights regarding human performance. Using the IAEA ASCOT (1993) guidelines for culture assessment, the event was analysed for underlying organisational factors that could have contributed to the event. Quantification and sensitivity analysis of key operator decisions and actions, i.e., unsafe acts leading to the criticality failure event, employed values contained in THERP (NUREG/CR 1278, 1983) and CREAM (1998). The ability to incorporate organisational factors is demonstrated and insights regarding retrospective and prospective analysis are included. Additionally, review of the role of organisational factors in events such as the JCO event arguably has the potential to guide analysts' thinking regarding novel and important sequences that should be considered when performing PSA. The human failure event leading to criticality at JCO was the result of at least 6 unsafe acts, most if not all of which were organisational in nature. Without the explicit consideration of organisational factors, most contemporary HRAs would have underestimated the risk associated with the event. The HRA methods used to support this analysis (ATHEANA, CREAM, HEART) all have allowances that enable the analyst to successfully

  1. Peptidoglycan: a post-genomic analysis

    Directory of Open Access Journals (Sweden)

    Cayrou Caroline

    2012-12-01

    Background: To derive post-genomic, neutral insight into the peptidoglycan (PG) distribution among organisms, we mined 1,644 genomes listed in the Carbohydrate-Active Enzymes database for the presence of a minimal 3-gene set that is necessary for PG metabolism. This gene set consists of one gene from the glycosyltransferase family GT28, one from family GT51 and at least one gene belonging to one of five glycoside hydrolase families (GH23, GH73, GH102, GH103 and GH104). Results: None of the 103 Viruses or 101 Archaea examined possessed the minimal 3-gene set, but this set was detected in 1/42 of the Eukarya members (Micromonas sp., coding for GT28, GT51 and GH103) and in 1,260/1,398 (90.1%) of Bacteria, with a 100% positive predictive value for the presence of PG. Pearson correlation test showed that GT51 family genes were significantly associated with PG with a value of 0.963 and a p value less than 10^-3. This result was confirmed by a phylogenetic comparative analysis showing that the GT51-encoding gene was significantly associated with PG with a Pagel’s score of 60 and 51 (percentage of error close to 0%). Phylogenetic analysis indicated that the GT51 gene history comprised eight loss and one gain events, and suggested a dynamic on-going process. Conclusions: Genome analysis is a neutral approach to explore prospectively the presence of PG in uncultured, sequenced organisms with high predictive values.
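
    The screening rule described in this abstract is simple enough to write down directly. The sketch below assumes each genome has already been reduced to a collection of CAZy family labels. The function names and that input format are hypothetical and are not taken from the study.

```python
# Glycoside hydrolase families listed in the abstract; at least one is required.
HYDROLASE_FAMILIES = frozenset({"GH23", "GH73", "GH102", "GH103", "GH104"})


def has_minimal_pg_gene_set(families):
    """True if the family labels cover GT28, GT51 and at least one listed GH family."""
    present = set(families)
    return (
        "GT28" in present
        and "GT51" in present
        and bool(present & HYDROLASE_FAMILIES)
    )


def percent_positive(genomes):
    """Percentage of genomes (an iterable of family collections) carrying the minimal set."""
    genomes = list(genomes)
    hits = sum(has_minimal_pg_gene_set(g) for g in genomes)
    return 100.0 * hits / len(genomes)
```

    For example, the family set reported for Micromonas sp. (GT28, GT51, GH103) passes the check, while a genome lacking GT51 does not. The bacterial figure quoted above follows from the same arithmetic: 1,260 of 1,398 genomes is 90.1%.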

  2. Initial sequencing and comparative analysis of the mouse genome

    Energy Technology Data Exchange (ETDEWEB)

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  3. Typical Genomic Framework on Disease Analysis

    Directory of Open Access Journals (Sweden)

    J. Stanly Thomas

    2015-01-01

    The major and challenging role of the doctor in human life is to predict as well as diagnose the disease that has infected the human body. This typical genomic framework on disease analysis algorithm is designed to store and derive each and every gene characteristic, such as shape, weight, location and normal growth culture. Whenever a disease report is fed into this data mining algorithm, it triggers the similarity test built upon the data mining classification rules. A gene is usually comprised of hundreds of individual nucleotides arranged in a particular order. There are almost an unlimited number of ways that the nucleotides can be ordered and sequenced to form distinct genes. The difference between diseased and healthy status delivered by the algorithm guides conclusions about the disease severity, stage and nature. This Typical Genomic Framework on Disease Analysis (TGFDA) algorithm is built to deliver instant results over a Very Large Database using a density and weight based clustering algorithm.

  4. Feasibility of incorporating genomic knowledge into electronic medical records for pharmacogenomic clinical decision support

    Directory of Open Access Journals (Sweden)

    Hoath James I

    2010-10-01

    In pursuing personalized medicine, pharmacogenomic (PGx) knowledge may help guide prescribing drugs based on a person’s genotype. Here we evaluate the feasibility of incorporating PGx knowledge, combined with clinical data, to support clinical decision-making by: 1) analyzing clinically relevant knowledge contained in PGx knowledge resources; 2) evaluating the feasibility of a rule-based framework to support formal representation of clinically relevant knowledge contained in PGx knowledge resources; and 3) evaluating the ability of an electronic medical record/electronic health record (EMR/EHR) to provide computable forms of clinical data needed for PGx clinical decision support. Findings suggest that the PharmGKB is a good source for PGx knowledge to supplement information contained in FDA approved drug labels. Furthermore, we found that with supporting knowledge (e.g. IF age

  5. Genome-wide Analysis of Gene Regulation

    DEFF Research Database (Denmark)

    Chen, Yun

    cells are capable of regulating their gene expression, so that each cell can only express a particular set of genes yielding limited numbers of proteins with specialized functions. Therefore a rigid control of differential gene expression is necessary for cellular diversity. On the other hand, aberrant...... gene regulation will disrupt the cell’s fundamental processes, which in turn can cause disease. Hence, understanding gene regulation is essential for deciphering the code of life. Along with the development of high throughput sequencing (HTS) technology and the subsequent large-scale data analysis......, genome-wide assays have increased our understanding of gene regulation significantly. This thesis describes the integration and analysis of HTS data across different important aspects of gene regulation. Gene expression can be regulated at different stages when the genetic information is passed from gene...

  6. Genetic Association Analysis of Complex Diseases Incorporating Intermediate Phenotype Information

    OpenAIRE

    Li, Yafang; Huang, Jian; Amos, Christopher I.

    2012-01-01

    Genetic researchers often collect disease related quantitative traits in addition to disease status because they are interested in understanding the pathophysiology of disease processes. In genome-wide association (GWA) studies, these quantitative phenotypes may be relevant to disease development and serve as intermediate phenotypes or they could be behavioral or other risk factors that predict disease risk. Statistical tests combining both disease status and quantitative risk factors should ...

  7. Genome Data Exploration Using Correspondence Analysis

    Science.gov (United States)

    Tekaia, Fredj

    2016-01-01

    Recent developments of sequencing technologies that allow the production of massive amounts of genomic and genotyping data have highlighted the need for synthetic data representation and pattern recognition methods that can mine and help discover biologically meaningful knowledge included in such large data sets. Correspondence analysis (CA) is an exploratory descriptive method designed to analyze two-way data tables, including some measure of association between rows and columns. It constructs linear combinations of variables, known as factors. CA has been used for decades to study high-dimensional data, and remarkable inferences from large data tables were obtained by reducing the dimensionality to a few orthogonal factors that correspond to the largest amount of variability in the data. Herein, I review CA and highlight its use by considering examples in handling high-dimensional data that can be constructed from genomic and genetic studies. Examples in amino acid compositions of large sets of species (viruses, phages, yeast, and fungi) as well as an example related to pairwise shared orthologs in a set of yeast and fungal species, as obtained from their proteome comparisons, are considered. For the first time, results show striking segregations between yeasts and fungi as well as between viruses and phages. Distributions obtained from shared orthologs show clusters of yeast and fungal species corresponding to their phylogenetic relationships. A direct comparison with the principal component analysis method is discussed using a recently published example of genotyping data related to newly discovered traces of an ancient hominid that was compared to modern human populations in the search for ancestral similarities. CA offers more detailed results highlighting links between modern humans and the ancient hominid and their characterizations. Compared to the popular principal component analysis method, CA allows easier and more effective interpretation of results
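
    The starting point of CA described above, a two-way table summarized by a measure of row-column association, can be illustrated with a minimal Python sketch. It computes the standardized residuals of a contingency table and their total inertia, which are the quantities that CA goes on to decompose into orthogonal factors; a full CA would take the singular value decomposition of the residual matrix, which is omitted here. The function name `ca_residuals` and the toy table are assumptions made for illustration and do not come from the reviewed paper.

```python
from math import sqrt


def ca_residuals(table):
    """Return (standardized residuals, total inertia) for a two-way count table.

    Assumes every row and every column has a non-zero total.
    """
    n = float(sum(sum(row) for row in table))
    # Correspondence matrix: counts converted to proportions.
    p = [[x / n for x in row] for row in table]
    row_mass = [sum(row) for row in p]
    col_mass = [sum(col) for col in zip(*p)]
    # Deviation from independence, scaled by the expected proportion.
    resid = [
        [
            (p[i][j] - row_mass[i] * col_mass[j]) / sqrt(row_mass[i] * col_mass[j])
            for j in range(len(col_mass))
        ]
        for i in range(len(row_mass))
    ]
    # Total inertia equals the chi-square statistic divided by n.
    inertia = sum(v * v for row in resid for v in row)
    return resid, inertia


# Hypothetical 2 x 2 table (e.g. two species by two amino acid classes).
residuals, total_inertia = ca_residuals([[10, 20], [20, 10]])
print(total_inertia)
```

    A table whose rows are proportional to each other has zero inertia, so there is nothing for the factors to explain; larger inertia means stronger row-column association.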

  8. Genome Data Exploration Using Correspondence Analysis.

    Science.gov (United States)

    Tekaia, Fredj

    2016-01-01

    Recent developments of sequencing technologies that allow the production of massive amounts of genomic and genotyping data have highlighted the need for synthetic data representation and pattern recognition methods that can mine and help discover biologically meaningful knowledge included in such large data sets. Correspondence analysis (CA) is an exploratory descriptive method designed to analyze two-way data tables, including some measure of association between rows and columns. It constructs linear combinations of variables, known as factors. CA has been used for decades to study high-dimensional data, and remarkable inferences from large data tables were obtained by reducing the dimensionality to a few orthogonal factors that correspond to the largest amount of variability in the data. Herein, I review CA and highlight its use by considering examples in handling high-dimensional data that can be constructed from genomic and genetic studies. Examples in amino acid compositions of large sets of species (viruses, phages, yeast, and fungi) as well as an example related to pairwise shared orthologs in a set of yeast and fungal species, as obtained from their proteome comparisons, are considered. For the first time, results show striking segregations between yeasts and fungi as well as between viruses and phages. Distributions obtained from shared orthologs show clusters of yeast and fungal species corresponding to their phylogenetic relationships. A direct comparison with the principal component analysis method is discussed using a recently published example of genotyping data related to newly discovered traces of an ancient hominid that was compared to modern human populations in the search for ancestral similarities. CA offers more detailed results highlighting links between modern humans and the ancient hominid and their characterizations. Compared to the popular principal component analysis method, CA allows easier and more effective interpretation of results

  9. Incorporating genomics into breast and prostate cancer screening: assessing the implications.

    Science.gov (United States)

    Chowdhury, Susmita; Dent, Tom; Pashayan, Nora; Hall, Alison; Lyratzopoulos, Georgios; Hallowell, Nina; Hall, Per; Pharoah, Paul; Burton, Hilary

    2013-06-01

    Individual risk prediction and stratification based on polygenic profiling may be useful in disease prevention. Risk-stratified population screening based on multiple factors including a polygenic risk profile has the potential to be more efficient than age-stratified screening. In this article, we summarize the implications of personalized screening for breast and prostate cancers. We report the opinions of multidisciplinary international experts who have explored the scientific, ethical, and logistical aspects of stratified screening. We have identified (i) the need to recognize the benefits and harms of personalized screening as compared with existing screening methods, (ii) that the use of genetic data highlights complex ethical issues including discrimination against high-risk individuals by insurers and employers and patient autonomy in relation to genetic testing of minors, (iii) the need for transparency and clear communication about risk scores, about harms and benefits, and about reasons for inclusion and exclusion from the risk-based screening process, and (iv) the need to develop new professional competences and to assess cost-effectiveness and acceptability of stratified screening programs before implementation. We conclude that health professionals and stakeholders need to consider the implications of incorporating genetic information in intervention strategies for health-care planning in the future. PMID:23412607

  10. Analysis of the Thermotoga maritima genome combining a variety of sequence similarity and genome context tools

    OpenAIRE

    Kyrpides, Nikos C; Ouzounis, Christos A; Iliopoulos, Ioannis; Vonstein, Veronika; Overbeek, Ross

    2000-01-01

    The proliferation of genome sequence data has led to the development of a number of tools and strategies that facilitate computational analysis. These methods include the identification of motif patterns, membership of the query sequences in family databases, metabolic pathway involvement and gene proximity. We re-examined the completely sequenced genome of Thermotoga maritima by employing the combined use of the above methods. By analyzing all 1877 proteins encoded in this genome, we identif...

  11. Computational Methods for the Analysis of Array Comparative Genomic Hybridization

    Directory of Open Access Journals (Sweden)

    Raj Chari

    2006-01-01

    Array comparative genomic hybridization (array CGH) is a technique for assaying the copy number status of cancer genomes. The widespread use of this technology has led to a rapid accumulation of high throughput data, which in turn has prompted the development of computational strategies for the analysis of array CGH data. Here we explain the principles behind array image processing, data visualization and genomic profile analysis, review currently available software packages, and raise considerations for future software development.

  12. Coex-Rank: An approach incorporating co-expression information for combined analysis of microarray data

    OpenAIRE

    Cai, Jinlu; Keen, Henry L.; Sigmund, Curt D.; Casavant, Thomas L.

    2012-01-01

    Microarrays have been widely used to study differential gene expression at the genomic level. They can also provide genome-wide co-expression information. Biologically related datasets from independent studies are publicly available, which requires robust combined approaches for integration and validation. Previously, meta-analysis has been adopted to solve this problem.

  13. NeisseriaBase: a specialised Neisseria genomic resource and analysis platform

    Science.gov (United States)

    Zheng, Wenning; Mutha, Naresh V.R.; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S.; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah

    2016-01-01

    Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my. PMID:27017950

  14. NeisseriaBase: a specialised Neisseria genomic resource and analysis platform.

    Science.gov (United States)

    Zheng, Wenning; Mutha, Naresh V R; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah; Choo, Siew Woh

    2016-01-01

    Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my. PMID:27017950

  15. Barcode server: a visualization-based genome analysis system.

    Directory of Open Access Journals (Sweden)

    Fenglou Mao

    We have previously developed a computational method for representing a genome as a barcode image, which makes various genomic features visually apparent. We have demonstrated that this visual capability has made some challenging genome analysis problems relatively easy to solve. We have applied this capability to a number of challenging problems, including (a) identification of horizontally transferred genes, (b) identification of genomic islands with special properties and (c) binning of metagenomic sequences, and achieved highly encouraging results. These application results inspired us to develop this barcode-based genome analysis server for public service, which supports the following capabilities: (a) calculation of the k-mer based barcode image for a provided DNA sequence; (b) detection of sequence fragments in a given genome with distinct barcodes from those of the majority of the genome; (c) clustering of provided DNA sequences into groups having similar barcodes; and (d) homology-based search using Blast against a genome database for any selected genomic regions deemed to have interesting barcodes. The barcode server provides a job management capability, allowing processing of a large number of analysis jobs for barcode-based comparative genome analyses. The barcode server is accessible at http://csbl1.bmb.uga.edu/Barcode.

  16. Barcode Server: A Visualization-Based Genome Analysis System

    Science.gov (United States)

    Mao, Fenglou; Olman, Victor; Wang, Yan; Xu, Ying

    2013-01-01

    We have previously developed a computational method for representing a genome as a barcode image, which makes various genomic features visually apparent. We have demonstrated that this visual capability has made some challenging genome analysis problems relatively easy to solve. We have applied this capability to a number of challenging problems, including (a) identification of horizontally transferred genes, (b) identification of genomic islands with special properties and (c) binning of metagenomic sequences, and achieved highly encouraging results. These application results inspired us to develop this barcode-based genome analysis server for public service, which supports the following capabilities: (a) calculation of the k-mer based barcode image for a provided DNA sequence; (b) detection of sequence fragments in a given genome with distinct barcodes from those of the majority of the genome, (c) clustering of provided DNA sequences into groups having similar barcodes; and (d) homology-based search using Blast against a genome database for any selected genomic regions deemed to have interesting barcodes. The barcode server provides a job management capability, allowing processing of a large number of analysis jobs for barcode-based comparative genome analyses. The barcode server is accessible at http://csbl1.bmb.uga.edu/Barcode. PMID:23457606
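
    Capability (a) above, the k-mer based barcode, can be sketched in a few lines of Python. The abstract does not give the server's window size, value of k, or normalization, so the choices below (non-overlapping windows, raw k-mer frequencies, no reverse-complement merging) and the function name `kmer_barcode` are assumptions for illustration only and should not be read as the Barcode server's actual algorithm.

```python
from itertools import product


def kmer_barcode(seq, k=2, window=8):
    """Return (kmers, rows): one k-mer frequency vector per non-overlapping window.

    Each row can be drawn as one line of a grey-level image, with one column
    per k-mer. k-mers containing characters other than A, C, G, T are skipped.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    rows = []
    for start in range(0, len(seq) - window + 1, window):
        fragment = seq[start:start + window].upper()
        counts = [0] * len(kmers)
        total = 0
        for i in range(len(fragment) - k + 1):
            column = index.get(fragment[i:i + k])
            if column is not None:
                counts[column] += 1
                total += 1
        rows.append([c / total if total else 0.0 for c in counts])
    return kmers, rows


# Hypothetical toy sequence: two windows with very different composition.
kmers, rows = kmer_barcode("AAAAAAAACGCGCGCG", k=2, window=8)
for row in rows:
    print([round(v, 2) for v in row])
```

    Windows whose frequency vectors differ strongly from the rest of the sequence are the kind of fragments that capability (b) is described as detecting.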

  17. SIGMA: A System for Integrative Genomic Microarray Analysis of Cancer Genomes

    Directory of Open Access Journals (Sweden)

    Davies Jonathan J

    2006-12-01

    Background: The prevalence of high resolution profiling of genomes has created a need for the integrative analysis of information generated from multiple methodologies and platforms. Although the majority of data in the public domain are gene expression profiles, and expression analysis software is available, the increase of array CGH studies has enabled integration of high throughput genomic and gene expression datasets. However, tools for direct mining and analysis of array CGH data are limited. Hence, there is a great need for analytical and display software tailored to cross platform integrative analysis of cancer genomes. Results: We have created a user-friendly Java application to facilitate sophisticated visualization and analysis such as cross-tumor and cross-platform comparisons. To demonstrate the utility of this software, we assembled array CGH data representing Affymetrix SNP chip, Stanford cDNA arrays and whole genome tiling path array platforms for cross comparison. This cancer genome database contains 267 profiles from commonly used cancer cell lines representing 14 different tissue types. Conclusion: In this study we have developed an application for the visualization and analysis of data from high resolution array CGH platforms that can be adapted for analysis of multiple types of high throughput genomic datasets. Furthermore, we invite researchers using array CGH technology to deposit both their raw and processed data, as this will be a continually expanding database of cancer genomes. This publicly available resource, the System for Integrative Genomic Microarray Analysis (SIGMA) of cancer genomes, can be accessed at http://sigma.bccrc.ca.

  18. Pathway Analysis Incorporating Protein-Protein Interaction Networks Identified Candidate Pathways for the Seven Common Diseases.

    Science.gov (United States)

    Lin, Peng-Lin; Yu, Ya-Wen; Chung, Ren-Hua

    2016-01-01

    Pathway analysis has become popular as a secondary analysis strategy for genome-wide association studies (GWAS). Most of the current pathway analysis methods aggregate signals from the main effects of single nucleotide polymorphisms (SNPs) in genes within a pathway without considering the effects of gene-gene interactions. However, gene-gene interactions can also have critical effects on complex diseases. Protein-protein interaction (PPI) networks have been used to define gene pairs for the gene-gene interaction tests. Incorporating the PPI information to define gene pairs for interaction tests within pathways can increase the power for pathway-based association tests. We propose a pathway association test, which aggregates the interaction signals in PPI networks within a pathway, for GWAS with case-control samples. Gene size is properly considered in the test so that genes do not contribute more to the test statistic simply due to their size. Simulation studies were performed to verify that the method is a valid test and can have more power than other pathway association tests in the presence of gene-gene interactions within a pathway under different scenarios. We applied the test to the Wellcome Trust Case Control Consortium GWAS datasets for seven common diseases. The most significant pathway is the chaperones modulate interferon signaling pathway for Crohn's disease (p-value = 0.0003). The pathway modulates interferon gamma, which induces the JAK/STAT pathway that is involved in Crohn's disease. Several other pathways that have functional implications for the seven diseases were also identified. The proposed test based on gene-gene interaction signals in PPI networks can be used as a complementary tool to the current existing pathway analysis methods focusing on main effects of genes. An efficient software implementing the method is freely available at http://puppi.sourceforge.net. PMID:27622767

  19. caBIG™ VISDA: Modeling, visualization, and discovery for cluster analysis of genomic data

    OpenAIRE

    Xuan Jianhua; Wang Zuyi; Miller David J; Li Huai; Zhu Yitan; Clarke Robert; Hoffman Eric P; Wang Yue

    2008-01-01

    Background: The main limitations of most existing clustering methods used in genomic data analysis include heuristic or random algorithm initialization, the potential of finding poor local optima, the lack of cluster number detection, an inability to incorporate prior/expert knowledge, black-box and non-adaptive designs, in addition to the curse of dimensionality and the discernment of uninformative, uninteresting cluster structure associated with confounding variables. Results: In an ...

  20. Privacy-preserving GWAS analysis on federated genomic datasets

    OpenAIRE

    Constable, Scott D; Tang, Yuzhe; Wang, Shuang; Jiang, Xiaoqian; Chapin, Steve

    2015-01-01

    Background: The biomedical community benefits from the increasing availability of genomic data to support meaningful scientific research, e.g., Genome-Wide Association Studies (GWAS). However, high quality GWAS usually requires a large number of samples, which can grow beyond the capability of a single institution. Federated genomic data analysis holds the promise of enabling cross-institution collaboration for effective GWAS, but it raises concerns about patient privacy and medical informatio...

  1. Massive comparative genomic analysis reveals convergent evolution of specialized bacteria

    OpenAIRE

    Raoult Didier; Pontarotti Pierre; Royer-Carenzi Manuela; Merhej Vicky

    2009-01-01

    Background: Genome size and gene content in bacteria are associated with their lifestyles. Obligate intracellular bacteria (i.e., mutualists and parasites) have small genomes that derived from larger free-living bacterial ancestors; however, the different steps of bacterial specialization from free-living to intracellular lifestyle have not been studied comprehensively. The growing number of available sequenced genomes makes it possible to perform a statistical comparative analysis of...

  2. A novel statistic for genome-wide interaction analysis.

    OpenAIRE

    Xuesen Wu; Hua Dong (Eds); Li Luo; Yun Zhu; Gang Peng; Reveille, John D.; Momiao Xiong

    2010-01-01

    Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide inte...

  3. BPGA- an ultra-fast pan-genome analysis pipeline.

    Science.gov (United States)

    Chaudhari, Narendrakumar M; Gupta, Vinod Kumar; Dutta, Chitra

    2016-01-01

    Recent advances in ultra-high-throughput sequencing technology and metagenomics have led to a paradigm shift in microbial genomics from few genome comparisons to large-scale pan-genome studies at different scales of phylogenetic resolution. Pan-genome studies provide a framework for estimating the genomic diversity of the dataset, determining the core (conserved), accessory (dispensable) and unique (strain-specific) gene pools of a species, tracing horizontal gene-flux across strains and providing insight into species evolution. The existing pan-genome software tools suffer from various limitations such as limited datasets, difficult installation/requirements and inadequate functional features. Here we present an ultra-fast computational pipeline, BPGA (Bacterial Pan Genome Analysis tool), with seven functional modules. In addition to the routine pan-genome analyses, BPGA introduces a number of novel features for downstream analyses such as core/pan/MLST (Multi Locus Sequence Typing) phylogeny, exclusive presence/absence of genes in specific strains, subset analysis, atypical G + C content analysis and KEGG & COG mapping of core, accessory and unique genes. Other notable features include minimum running prerequisites, freedom to select the gene clustering method, ultra-fast execution, user friendly command line interface and high-quality graphics outputs. The performance of BPGA has been evaluated using a dataset of complete genome sequences of 28 Streptococcus pyogenes strains. PMID:27071527
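
    The core/accessory/unique partition mentioned above can be illustrated with a short Python sketch. It assumes that gene clustering has already been done and that the result is available as a mapping from gene family to the set of strains carrying it; the function name `classify_pan_genome`, the input layout and the toy data are hypothetical and are not taken from BPGA itself.

```python
def classify_pan_genome(presence):
    """Split gene families into core, accessory and unique pools.

    presence: dict mapping gene family name -> set of strain names carrying it.
    A family present in every strain is core; one present in exactly one
    strain is unique; everything else is accessory. With a single strain,
    every family is reported as core.
    """
    strains = set()
    for carriers in presence.values():
        strains |= carriers
    pools = {"core": [], "accessory": [], "unique": []}
    for gene in sorted(presence):
        carriers = presence[gene]
        if len(carriers) == len(strains):
            pools["core"].append(gene)
        elif len(carriers) == 1:
            pools["unique"].append(gene)
        else:
            pools["accessory"].append(gene)
    return pools


# Hypothetical presence/absence data for three strains.
example = {
    "geneA": {"s1", "s2", "s3"},
    "geneB": {"s1", "s2"},
    "geneC": {"s3"},
}
print(classify_pan_genome(example))
```

    Real pipelines typically work from a presence/absence matrix over thousands of families and may relax the core definition (for example, presence in most rather than all strains); the strict definition is used here only to keep the sketch short.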

  4. GENOME ANALYSIS OF BURKHOLDERIA CEPACIA AC1100

    Science.gov (United States)

    Burkholderia cepacia is an important organism in bioremediation of environmental pollutants and it is also of increasing interest as a human pathogen. The genomic organization of B. cepacia is being studied in order to better understand its unusual adaptive capacity and genome pl...

  5. Pathway and network analysis of cancer genomes

    DEFF Research Database (Denmark)

    Creixell, Pau; Reimand, Jueri; Haider, Syed;

    2015-01-01

    Genomic information on tumors from 50 cancer types cataloged by the International Cancer Genome Consortium (ICGC) shows that only a few well-studied driver genes are frequently mutated, in contrast to many infrequently mutated genes that may also contribute to tumor biology. Hence there has been ...

  6. Genome analysis methods - PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods Genome analysis methods... Data detail Data name Genome analysis methods Description of data contents The current status and re...ion of genomic database are shown in this list. Data file File name: pgdbj_dna_marker_linkage_map_genome_analysis_methods...r-linkage-map/LATEST/pgdbj_dna_marker_linkage_map_genome_analysis_methods_en.zip File size: 5.8 KB Simple se...arch URL http://togodb.biosciencedbc.jp/togodb/view/pgdbj_dna_marker_linkage_map_genome_analysis_methods_en

  7. Comparative analysis of plant genome architecture

    International Nuclear Information System (INIS)

    It is clear that there are close, family wide similarities between different crop species in both the genes (often with only allelic differences) and the gene order along chromosomes. However, there are extensive differences in both the type and organization of repetitive DNA, even between related species, which may be of importance for genome changes and the exchange of genes in both long (evolutionary) and short (plant breeding) time-scales. There is additional non-genic information in a genome, related to the methylation and coiling of sequences, and to the three-dimensional organization of these sequences in the nucleus. Highly repetitive DNA makes up the majority of most plant genomes. Some sequences, such as microsatellites, are similar in every organism, while other repeat units are specific to one species or a small group of species. Different sequences have characteristic genomic distributions, and most can be identified by their chromosomal distribution. Knowledge of the genome architecture - the organization and the nature of repetitive sequences, and the three-dimensional organization in the interphase nucleus - is likely to be helpful for applied research and plant breeding. There is little knowledge of why repetitive sequences have particular characteristics. Is the three-dimensional architecture of the nucleus important for function? Do repetitive sequences put coding or regulatory sequences in particular nuclear positions? Why are different sequences located at particular sites in the genome? A comprehensive and quantitative model is being constructed of the variable and constant parts of the plant genome. Such integrated models of large scale genome organization may be useful in learning the function of different components of the genome, and in evolutionary studies. Since repetitive DNA changes are frequent, perhaps one can learn more about which manipulations are possible in plant genomes by examining the changes already made between related

  8. Analysis of brushless DC generator incorporating an axial field coil

    Energy Technology Data Exchange (ETDEWEB)

    Moradi, Hassan, E-mail: H_moradi@sbu.ac.i [Department of Electrical and Computer Engineering, Shahid Beheshti University, GC, Tehran (Iran, Islamic Republic of); Afjei, E. [Department of Electrical and Computer Engineering, Shahid Beheshti University, GC, Tehran (Iran, Islamic Republic of)

    2011-07-15

    Highlights: → Magnetic analysis and experiment of a three-phase field assisted BLDC generator. → Confirm the accuracy of the predicted flux-linkage by 2-D FE analysis. → Confirm the accuracy of the FE analysis results by coupling the FE and BE methods. → Control the output voltage to a desired level by controlling the amplitude of I_f. → Compatible with any application that requires variable speed operation. -- Abstract: This paper describes the magnetic analysis and experiment of a three-phase field assisted brushless DC (BLDC) generator. Unlike conventional BLDC generators, the permanent magnet is replaced with an assisted field winding. The stator and rotor are constructed with two magnetically dependent sets, in which each stator set includes nine salient poles with coil windings, and the rotor comprises six salient poles. Other pole combinations are also possible. This construction is similar to a homopolar inductor alternator. The DC current in the assisted field winding produces axial flux which makes the rotor magnetically polarized at its ends. The magnetic field flows axially through the rotor shaft and closes through the stator teeth and the machine housing. To evaluate the generator performance, two types of analysis, namely the numerical technique and the experimental study, have been utilized. In the numerical analysis, 2-D finite element (FE) analysis has been carried out using a MagNet CAD package (Infolytica Corporation Ltd.), to confirm the accuracy of the predicted flux-linkage characteristics, whereas in the experimental study, a prototype BLDC generator was constructed for verifying the actual performance. Furthermore, the evaluation method based on a hybrid numerical method coupling the finite element (FE) analysis and boundary element (BE) method has been carried out to confirm the accuracy of the 2-D FE analysis simulation results. It provides not only confirmations of the investigation in results

  9. Analysis of brushless DC generator incorporating an axial field coil

    International Nuclear Information System (INIS)

    Highlights: → Magnetic analysis and experiment of a three-phase field assisted BLDC generator. → Confirm the accuracy of the predicted flux-linkage by 2-D FE analysis. → Confirm the accuracy of the FE analysis results by coupling the FE and BE methods. → Control the output voltage to a desired level by controlling the amplitude of I_f. → Compatible with any application that requires variable speed operation. -- Abstract: This paper describes the magnetic analysis and experiment of a three-phase field assisted brushless DC (BLDC) generator. Unlike conventional BLDC generators, the permanent magnet is replaced with an assisted field winding. The stator and rotor are constructed with two magnetically dependent sets, in which each stator set includes nine salient poles with coil windings, and the rotor comprises six salient poles. Other pole combinations are also possible. This construction is similar to a homopolar inductor alternator. The DC current in the assisted field winding produces axial flux which makes the rotor magnetically polarized at its ends. The magnetic field flows axially through the rotor shaft and closes through the stator teeth and the machine housing. To evaluate the generator performance, two types of analysis, namely the numerical technique and the experimental study, have been utilized. In the numerical analysis, 2-D finite element (FE) analysis has been carried out using a MagNet CAD package (Infolytica Corporation Ltd.), to confirm the accuracy of the predicted flux-linkage characteristics, whereas in the experimental study, a prototype BLDC generator was constructed for verifying the actual performance. Furthermore, the evaluation method based on a hybrid numerical method coupling the finite element (FE) analysis and boundary element (BE) method has been carried out to confirm the accuracy of the 2-D FE analysis simulation results. It provides not only confirmations of the investigation in results but also exact illustration for

  10. Different genomic relationship matrices for single-step analysis using phenotypic, pedigree and genomic information

    Directory of Open Access Journals (Sweden)

    Aguilar Ignacio

    2011-01-01

    Full Text Available Abstract Background The incorporation of genomic coefficients into the numerator relationship matrix allows estimation of breeding values using all phenotypic, pedigree and genomic information simultaneously. In such a single-step procedure, genomic and pedigree-based relationships have to be compatible. As there are many options to create genomic relationships, there is a question of which is optimal and what the effects of deviations from optimality are. Methods Data of litter size (total number born per litter for 338,346 sows were analyzed. Illumina PorcineSNP60 BeadChip genotypes were available for 1,989. Analyses were carried out with the complete data set and with a subset of genotyped animals and three generations pedigree (5,090 animals. A single-trait animal model was used to estimate variance components and breeding values. Genomic relationship matrices were constructed using allele frequencies equal to 0.5 (G05, equal to the average minor allele frequency (GMF, or equal to observed frequencies (GOF. A genomic matrix considering random ascertainment of allele frequencies was also used (GOF*. A normalized matrix (GN was obtained to have average diagonal coefficients equal to 1. The genomic matrices were combined with the numerator relationship matrix creating H matrices. Results In G05 and GMF, both diagonal and off-diagonal elements were on average greater than the pedigree-based coefficients. In GOF and GOF*, the average diagonal elements were smaller than pedigree-based coefficients. The mean of off-diagonal coefficients was zero in GOF and GOF*. Choices of G with average diagonal coefficients different from 1 led to greater estimates of additive variance in the smaller data set. The correlation between EBV and genomic EBV (n = 1,989 were: 0.79 using G05, 0.79 using GMF, 0.78 using GOF, 0.79 using GOF*, and 0.78 using GN. Accuracies calculated by inversion increased with all genomic matrices. 
The accuracies of genomic-assisted EBV
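
    The genomic matrices compared in this record differ mainly in the allele frequencies used to centre and scale the genotypes. The following is a minimal sketch only, assuming the widely used construction G = ZZ' / (2 Σ p_j(1 − p_j)) with genotypes coded 0/1/2; the abstract does not state its formula, and the function names and toy genotypes are hypothetical.

```python
def genomic_relationship(genotypes, freqs=None):
    """Return G = Z Z' / (2 * sum_j p_j (1 - p_j)) as a list of lists.

    genotypes: rows are animals, columns are SNPs, coded as 0/1/2 allele copies.
    freqs: assumed allele frequency per SNP; None means use observed frequencies.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    if freqs is None:
        freqs = [sum(row[j] for row in genotypes) / (2.0 * n_animals)
                 for j in range(n_snps)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    # Centre each genotype by twice the assumed allele frequency.
    z = [[row[j] - 2.0 * freqs[j] for j in range(n_snps)] for row in genotypes]
    return [[sum(a * b for a, b in zip(zi, zk)) / scale for zk in z] for zi in z]


def normalize_diagonal(g):
    """Rescale G so that its average diagonal coefficient equals 1."""
    mean_diag = sum(g[i][i] for i in range(len(g))) / len(g)
    return [[v / mean_diag for v in row] for row in g]


# Hypothetical toy data: 3 animals x 4 SNPs.
genotypes = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
]
g_05 = genomic_relationship(genotypes, freqs=[0.5] * 4)  # frequencies fixed at 0.5
g_of = genomic_relationship(genotypes)                   # observed frequencies
g_n = normalize_diagonal(g_of)                           # average diagonal of 1
```

    Passing the frequencies as an argument keeps the choice that the record evaluates (0.5, average minor allele frequency, or observed) separate from the matrix arithmetic.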

  11. Incorporating Semantics into Data Driven Workflows for Content Based Analysis

    Science.gov (United States)

    Argüello, M.; Fernandez-Prieto, M. J.

    Finding meaningful associations between text elements and knowledge structures within clinical narratives in a highly verbal domain, such as psychiatry, is a challenging goal. The research presented here uses a small corpus of case histories and brings into play pre-existing knowledge, and therefore, complements other approaches that use large corpus (millions of words) and no pre-existing knowledge. The paper describes a variety of experiments for content-based analysis: Linguistic Analysis using NLP-oriented approaches, Sentiment Analysis, and Semantically Meaningful Analysis. Although it is not standard practice, the paper advocates providing automatic support to annotate the functionality as well as the data for each experiment by performing semantic annotation that uses OWL and OWL-S. Lessons learnt can be transmitted to legacy clinical databases facing the conversion of clinical narratives according to prominent Electronic Health Records standards.

  12. Analysis of recent segmental duplications in the bovine genome

    Science.gov (United States)

    Duplicated sequences are an important source of gene innovation and structural variation within mammalian genomes. We describe the first systematic and genome-wide analysis of segmental duplications in the modern domesticated cattle (Bos taurus). Using two distinct computational analyses, we estimat...

  13. GenomePeek—an online tool for prokaryotic genome and metagenome analysis

    Directory of Open Access Journals (Sweden)

    Katelyn McNair

    2015-06-01

    Full Text Available As more and more prokaryotic sequencing takes place, a method to quickly and accurately analyze this data is needed. Previous tools are mainly designed for metagenomic analysis and have limitations, such as long runtimes and significant false positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single genome and metagenome sequencing files, quickly and with low error rates. GenomePeek uses a sequence assembly approach where reads to a set of conserved genes are extracted, assembled and then aligned against the highly specific reference database. GenomePeek was found to be faster than traditional approaches while still keeping error rates low, as well as offering unique data visualization options.

  14. Complete genome sequencing and comparative genomic analysis of functionally diverse Lysinibacillus sphaericus III(3)7.

    Science.gov (United States)

    Rey, Andrés; Silva-Quintero, Laura; Dussán, Jenny

    2016-09-01

    Lysinibacillus sphaericus III(3)7 is a native Colombian strain, the first one isolated from soil samples. This strain has shown high levels of pathogenic activity against Culex quinquefasciatus larvae in laboratory assays compared to other members of the same species. Using Pacific Biosciences sequencing technology we sequenced, annotated (de novo) and described the genome of strain III(3)7, achieving a complete genome sequence status. We then performed a comparative analysis between the newly sequenced genome and the ones previously reported for Colombian isolates L. sphaericus OT4b.31, CBAM5 and OT4b.25, with the inclusion of L. sphaericus C3-41, which has been used as a reference genome for most previous genome sequencing projects. We concluded that L. sphaericus III(3)7 is highly similar to strain OT4b.25 and shares high levels of synteny with isolates CBAM5 and C3-41. PMID:27419068

  15. Analysis of Simple Sequence Repeats in Genomes of Rhizobia

    Institute of Scientific and Technical Information of China (English)

    GAO Ya-mei; HAN Yi-qiang; TANG Hui; SUN Dong-mei; WANG Yan-jie; WANG Wei-dong

    2008-01-01

    Simple sequence repeats (SSRs) or microsatellites, as genetic markers, are ubiquitous in genomes of various organisms. The analysis of SSRs in rhizobia genomes provides useful information for a variety of applications in population genetics of rhizobia. We analyzed the occurrences, relative abundance, and relative density of SSRs, the most common in Bradyrhizobium japonicum, Mesorhizobium loti, and Sinorhizobium meliloti genomes sequenced in the microorganisms tandem repeats database, and SSRs in the three species genomes were compared with each other. The result showed that there were 1,410, 859, and 638 SSRs in B. japonicum, M. loti, and S. meliloti genomes, respectively. In the genomes of B. japonicum, M. loti, and S. meliloti, tetranucleotide, pentanucleotide, and hexanucleotide repeats were more abundant and indicated higher mutation rates in these species. The least abundant were mononucleotide repeats. The SSR types and distribution were similar among these species.
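
    Counting SSRs by motif length, as this record describes, can be sketched with regular expressions. This is an illustrative sketch only: the minimum repeat thresholds, the per-kb definitions of abundance and density, and the function names are all assumptions, not the criteria of the tandem repeats database used in the study.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (1 = mono ... 6 = hexa).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR found."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of the given length followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "ATAT").
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


def summarize(sequence, hits):
    """Counts per motif length, SSRs per kb, and SSR base pairs per kb."""
    kb = len(sequence) / 1000.0
    counts = Counter(len(motif) for _, motif, _ in hits)
    ssr_bp = sum(len(motif) * n for _, motif, n in hits)
    return {
        "counts_by_motif_length": dict(counts),
        "ssrs_per_kb": len(hits) / kb,
        "ssr_bp_per_kb": ssr_bp / kb,
    }


# Hypothetical toy sequence with one mono-, one di- and one tetranucleotide SSR.
seq = "GGC" + "A" * 12 + "CTG" + "AT" * 7 + "CCG" + "AGTC" * 5 + "TT"
ssrs = find_ssrs(seq)
summary = summarize(seq, ssrs)
```

    Because the scan is greedy and non-overlapping within each motif length, it is an approximation for compound or imperfect repeats, which dedicated tools handle explicitly.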

  16. A methodology to incorporate organizational factors into human reliability analysis

    International Nuclear Information System (INIS)

    A new holistic methodology for Human Reliability Analysis (HRA) is proposed to model the effects of organizational factors on human reliability. Firstly, a conceptual framework is built, which is used to analyze the causal relationships between the organizational factors and human reliability. Then, the inference model for Human Reliability Analysis is built by combining the conceptual framework with Bayesian networks, which is used to execute the causal inference and diagnostic inference of human reliability. Finally, a case example is presented to demonstrate the specific application of the proposed methodology. The results show that the proposed methodology of combining the conceptual model with Bayesian networks can not only easily model the causal relationship between organizational factors and human reliability, but can also, in a given context, quantitatively measure human operational reliability and identify the most likely root causes of human error, or their prioritization. (authors)

  17. World-Systems Analysis, Globalization, and Incorporated Comparison

    OpenAIRE

    Phillip McMichael

    2015-01-01

    When Immanuel Wallerstein (1974) subverted the mid-1970s social science scene with his concept of the world-system, development, the master concept of social theory, suffered a fatal blow. Wallerstein's critique of development emphasized its misapplication as a national strategy in a hierarchical world where only some states can succeed. Wallerstein's path-breaking epistemological challenge to the modernization paradigm reformulated the unit of analysis of development from the nation-state to t...

  18. Human · mouse genome analysis and radiation biology. Proceedings

    International Nuclear Information System (INIS)

    This issue is the collection of the papers presented at the 25th NIRS symposium on Human, Mouse Genome Analysis and Radiation Biology. Fourteen of the presented papers are indexed individually. (J.P.N.)

  19. Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

    Directory of Open Access Journals (Sweden)

    Keeling Patrick J

    2007-09-01

    Full Text Available Abstract Background Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. 
Most notable are the expansion of gene copy numbers and their arrangements

  20. Incorporating Concomitant Medications into Genome-Wide Analyses for the Study of Complex Disease and Drug Response

    Science.gov (United States)

    Graham, Hillary T.; Rotroff, Daniel M.; Marvel, Skylar W.; Buse, John B.; Havener, Tammy M.; Wilson, Alyson G.; Wagner, Michael J.; Motsinger-Reif, Alison A.; Friedewald, W.T.

    2016-01-01

    Given the high costs of conducting a drug-response trial, researchers are now aiming to use retrospective analyses to conduct genome-wide association studies (GWAS) to identify underlying genetic contributions to drug-response variation. To prevent confounding results from a GWAS to investigate drug response, it is necessary to account for concomitant medications, defined as any medication taken concurrently with the primary medication being investigated. We use data from the Action to Control Cardiovascular Disease (ACCORD) trial in order to implement a novel scoring procedure for incorporating concomitant medication information into a linear regression model in preparation for GWAS. In order to accomplish this, two primary medications were selected: thiazolidinediones and metformin, because of the widespread use of these medications and the large sample sizes available within the ACCORD trial. A third medication, fenofibrate, along with a known confounding medication, statin, was chosen as a proof-of-principle for the scoring procedure. Previous studies have identified SNP rs7412 as being associated with statin response. Here we hypothesize that including the score for statin as a covariate in the GWAS model will correct for confounding of statin and yield a change in association at rs7412. The response of the confounded signal was successfully diminished from p = 3.19 × 10^−7 to p = 1.76 × 10^−5, by accounting for statin using the scoring procedure presented here. This approach provides the ability for researchers to account for concomitant medications in complex trial designs where monotherapy treatment regimens are not available.
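
    The effect of adding a concomitant-medication score as a covariate can be illustrated with a toy linear regression. This sketch does not reproduce the scoring procedure or the ACCORD data: the genotype, score and response values below are synthetic and hypothetical, and only regression coefficients (not p-values) are computed.

```python
def ols(columns, y):
    """Least-squares coefficients for y ~ intercept + columns (normal equations)."""
    n = len(y)
    x = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(x[0])
    # Augmented normal-equation system [X'X | X'y].
    a = [[sum(x[r][i] * x[r][j] for r in range(n)) for j in range(k)]
         + [sum(x[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                factor = a[r][c] / a[c][c]
                a[r] = [v - factor * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Synthetic example: the response depends only on the medication score,
# but the genotype is correlated with that score.
genotype = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
med_score = [0, 1, 1, 2, 2, 3, 0, 1, 3, 2]
response = [2.0 + 1.5 * s for s in med_score]

naive = ols([genotype], response)                # response ~ genotype
adjusted = ols([genotype, med_score], response)  # response ~ genotype + score
```

    In this toy setting the genotype coefficient is clearly non-zero in the naive model and collapses to zero once the score is included, which mirrors the direction of the change the record reports at rs7412.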

  1. World-Systems Analysis, Globalization, and Incorporated Comparison

    Directory of Open Access Journals (Sweden)

    Phillip McMichael

    2015-08-01

    Full Text Available When Immanuel Wallerstein (1974) subverted the mid-1970s social science scene with his concept of the ‘world-system,’ development, the ‘master’ concept of social theory, suffered a fatal blow. Wallerstein’s critique of development emphasized its misapplication as a national strategy in a hierarchical world where only some states can ‘succeed.’ Wallerstein’s path-breaking epistemological challenge to the modernization paradigm reformulated the unit of analysis of development from the nation-state to the ‘world-system.’ To be sure, the past three decades have seen reformulations, coined to address the failures of the development enterprise: from basic needs, through participation in the world market, globalization, to local sustainability. But development, the organizing myth of our age, has never recovered.

  2. Mycobacterial species as case-study of comparative genome analysis.

    Science.gov (United States)

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-01-01

    The genus Mycobacterium comprises more than 120 species, including important human pathogens that cause major public health problems and illnesses. Further, with more than 100 genome sequences from this genus, comparative genome analysis can provide new insights for better understanding the evolutionary events of these species and improving drugs, vaccines, and diagnostic tools for controlling Mycobacterial diseases. In the present study we aim to outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K-10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435, M. ulcerans Agy99, and M. vanbaalenii PYR-1. For this purpose a comparison has been done based on their genome lengths, GC content, and number of genes in different databases (GenBank, RefSeq, and Prodigal). The BLAST matrix of these genomes has been constructed to show the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genome have been defined for twelve Mycobacterial species. We have also introduced the genome atlas of the reference strain M. tuberculosis H37Rv, which can give a good overview of this genome. To examine the phylogenetic relationships among these bacteria, a phylogenetic tree has been constructed from the 16S rRNA gene for tuberculosis and non-tuberculosis Mycobacteria to understand the evolutionary events of these species. PMID:21396338

  3. An analysis framework for radioactive waste management incorporating expert judgment

    International Nuclear Information System (INIS)

    Because of the uncertainties in the contents of high level radioactive waste tanks and in their chemical and physical state, high level radioactive waste tank risk assessment has been a difficult challenge for decision-makers in the U.S. Department of Energy complex. In 1991, it became apparent that waste in the high level waste tanks at the Hanford Reservation in Washington State was retaining radiolytically generated hydrogen and releasing the gas during discrete gas release events. Measurements indicated that during some of these gas release events, the gas mixture in the tank headspace exceeded the lower flammability limit of hydrogen-air mixtures. Operations involving waste transfers and certain safety controls were suspended until the risks associated with gas retention and release in the tanks could be evaluated. Since some of the 177 high level waste tanks at Hanford were known to be leaking into the vadose zone, there was considerable urgency in resolving risk uncertainties so that the transfer of waste from the leaking tanks could be carried out. Initial models that were proposed for calculating risk required a massive data collection campaign before they could be applied to obtain quantitative risk parameters. Sandia National Laboratories developed a concept for an analysis framework that would involve expert elicitation of uncertainty distributions for parameters that influenced risk but that could not be readily measured. A feasibility study, completed in mid-1996, indicated that about 200 uncertainty distributions would have to be elicited from the experts. The development of the analysis framework began soon thereafter. Two series of expert elicitation workshops were carried out. The first, completed in May 1997, led to the completion of a risk model for single shell tank operations and safety controls. The tanks that had been identified as leakers were all among the 149 single shell tanks at Hanford. 
The second series of expert elicitation workshop

  4. Comparative analysis of rosaceous genomes and the reconstruction of a putative ancestral genome for the family

    Directory of Open Access Journals (Sweden)

    Velasco Riccardo

    2011-01-01

    Full Text Available Abstract Background Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence. Results We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Markers in common for Malus and Prunus, and Malus and Fragaria, respectively were 784 and 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae. Conclusions A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae.

  5. Pan-genomic analysis provides insights into the genomic variation and evolution of Salmonella Paratyphi A.

    Directory of Open Access Journals (Sweden)

    Weili Liang

    Full Text Available Salmonella Paratyphi A (S. Paratyphi A) is a highly adapted, human-specific pathogen that causes paratyphoid fever. Cases of paratyphoid fever have recently been increasing, and the disease is becoming a major public health concern, especially in Eastern and Southern Asia. To investigate the genomic variation and evolution of S. Paratyphi A, a pan-genomic analysis was performed on five newly sequenced S. Paratyphi A strains and two other reference strains. A whole genome comparison revealed that the seven genomes are collinear and that their organization is highly conserved. The high rate of substitutions in part of the core genome indicates that there are frequent homologous recombination events. Based on the changes in the pan-genome size and cluster number (both in the core functional genes and core pseudogenes), it can be inferred that the sharply increasing number of pseudogene clusters may have a strong correlation with the inactivation of functional genes, and indicates that the S. Paratyphi A genome is being degraded.

  6. Genome sequencing and analysis of the model grass Brachypodium distachyon.

    Science.gov (United States)

    2010-02-11

    Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops. PMID:20148030

  7. Bovine Genome Database: supporting community annotation and analysis of the Bos taurus genome

    Directory of Open Access Journals (Sweden)

    Childs Kevin L

    2010-11-01

    Full Text Available Abstract Background A goal of the Bovine Genome Database (BGD; http://BovineGenome.org) has been to support the Bovine Genome Sequencing and Analysis Consortium (BGSAC) in the annotation and analysis of the bovine genome. We were faced with several challenges, including the need to maintain consistent quality despite diversity in annotation expertise in the research community, the need to maintain consistent data formats, and the need to minimize the potential duplication of annotation effort. With new sequencing technologies allowing many more eukaryotic genomes to be sequenced, the demand for collaborative annotation is likely to increase. Here we present our approach, challenges and solutions facilitating a large distributed annotation project. Results and Discussion BGD has provided annotation tools that supported 147 members of the BGSAC in contributing 3,871 gene models over a fifteen-week period, and these annotations have been integrated into the bovine Official Gene Set. Our approach has been to provide an annotation system, which includes a BLAST site, multiple genome browsers, an annotation portal, and the Apollo Annotation Editor configured to connect directly to our Chado database. In addition to implementing and integrating components of the annotation system, we have performed computational analyses to create gene evidence tracks and a consensus gene set, which can be viewed on individual gene pages at BGD. Conclusions We have provided annotation tools that alleviate challenges associated with distributed annotation. Our system provides a consistent set of data to all annotators and eliminates the need for annotators to format data. Involving the bovine research community in genome annotation has allowed us to leverage expertise in various areas of bovine biology to provide biological insight into the genome sequence.

  8. Hyperstructures, genome analysis and I-cells

    DEFF Research Database (Denmark)

    Amar, P.; Ballet, P.; Barlovatz-Meimon, G.; Benecke, A.; Bernot, G.; Bouligand, Y.; Bourguine, P.; Delaplace, F.; Delosme, J.M.; Demarty, M.; Fishov, I.; Fourmentin-Guilbert, J.; Fralick, J.; Giavitto, J.L.; Gleyse, B.; Godin, C.; Incitti, R.; Kepes, F.; Lange, C.; Le Sceller, L.; Loutellier, C.; Michel, O.; Molina, F.; Monnier, C.; Natowicz, R.; Norris, V.; Orange, N.; Pollard, H.; Raine, D.; Ripoll, C.; Rouviere-Yaniv, J.; Saier, M.; Soler, P.; Tambourin, P.; Thellier, M.; Tracqui, P.; Ussery, David; Vincent, J.C.; Vannier, J.P.; Wiggins, P.; Zemirline, A.

    2002-01-01

    New concepts may prove necessary to profit from the avalanche of sequence data on the genome, transcriptome, proteome and interactome and to relate this information to cell physiology. Here, we focus on the concept of large activity-based structures, or hyperstructures, in which a variety of type...

  9. Hyperstructures, genome analysis and I-cells

    DEFF Research Database (Denmark)

    Amar, P.; Ballet, P.; Barlovatz-Meimon, G.;

    2002-01-01

    New concepts may prove necessary to profit from the avalanche of sequence data on the genome, transcriptome, proteome and interactome and to relate this information to cell physiology. Here, we focus on the concept of large activity-based structures, or hyperstructures, in which a variety of types...

  10. Comparative analysis of the mitochondrial genomes in gastropods

    International Nuclear Information System (INIS)

    In this work we present a comparative analysis of the mitochondrial genomes in gastropods. Nucleotide and amino acid composition was calculated and a comparative visual analysis of the start and termination codons was performed. The organization of the genome was compared by calculating the number of intergenic sequences, the location of the genes and the number of reorganized genes (breakpoints) in comparison with the sequence that is presumed to be ancestral for the group. In order to calculate variations in the rates of molecular evolution within the group, the relative rate test was performed. In spite of the differences in the size of the genomes, the number of amino acids is conserved. The nucleotide and amino acid composition is similar between Vetigastropoda, Caenogastropoda and Neritimorpha in comparison to Heterobranchia and Patellogastropoda. The mitochondrial genomes of the group are very compact with few intergenic sequences; the only exception is the genome of Patellogastropoda with 26,828 bp. Start codons of the Heterobranchia and Patellogastropoda are very variable and there is also an increase in genome rearrangements for these two groups. Generally, the hypothesis of constant rates of molecular evolution between the groups is rejected, except when the genomes of Caenogastropoda and Vetigastropoda are compared.

  11. Genome analysis of the platypus reveals unique signatures of evolution

    OpenAIRE

    Warren, Wesley C.; Hillier, LaDeana W.; Marshall Graves, Jennifer A.; Birney, Ewan; Ponting, Chris P; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P.; Miethke, Pat; Waters, Paul D.

    2008-01-01

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-o...

  12. Differential DNA Methylation Analysis without a Reference Genome

    OpenAIRE

    Johanna Klughammer; Paul Datlinger; Dieter Printz; Nathan C. Sheffield; Matthias Farlik; Johanna Hadler; Gerhard Fritsch; Christoph Bock

    2015-01-01

    Summary Genome-wide DNA methylation mapping uncovers epigenetic changes associated with animal development, environmental adaptation, and species evolution. To address the lack of high-throughput methods for DNA methylation analysis in non-model organisms, we developed an integrated approach for studying DNA methylation differences independent of a reference genome. Experimentally, our method relies on an optimized 96-well protocol for reduced representation bisulfite sequencing (RRBS), which...

  13. Genome complexity reduction for SNP genotyping analysis

    OpenAIRE

    Jordan, Barbara; Charest, Alain; Dowd, John F.; Blumenstiel, Justin P.; Yeh, Ru-Fang; Osman, Asiah; Housman, David E.; Landers, John E.

    2002-01-01

    Efficient single nucleotide polymorphism (SNP) genotyping methods are necessary to accomplish many current gene discovery goals. A crucial element in large-scale SNP genotyping is the number of individual biochemical reactions that must be performed. An efficient method that can be used to simultaneously amplify a set of genetic loci across a genome with high reliability can provide a valuable tool for large-scale SNP genotyping studies. In this paper we describe and characterize a method tha...

  14. Complete genome sequence of Enterococcus faecium strain TX16 and comparative genomic analysis of Enterococcus faecium genomes

    Directory of Open Access Journals (Sweden)

    Qin Xiang

    2012-07-01

    Full Text Available Abstract Background Enterococci are among the leading causes of hospital-acquired infections in the United States and Europe, with Enterococcus faecalis and Enterococcus faecium being the two most common species isolated from enterococcal infections. In the last decade, the proportion of enterococcal infections caused by E. faecium has steadily increased compared to other Enterococcus species. Although the underlying mechanism for the gradual replacement of E. faecalis by E. faecium in the hospital environment is not yet understood, many studies using genotyping and phylogenetic analysis have shown the emergence of a globally dispersed polyclonal subcluster of E. faecium strains in clinical environments. Systematic study of the molecular epidemiology and pathogenesis of E. faecium has been hindered by the lack of closed, complete E. faecium genomes that can be used as references. Results In this study, we report the complete genome sequence of the E. faecium strain TX16, also known as DO, which belongs to multilocus sequence type (ST) 18, and was the first E. faecium strain ever sequenced. Whole genome comparison of the TX16 genome with 21 E. faecium draft genomes confirmed that most clinical, outbreak, and hospital-associated (HA) strains (including STs 16, 17, 18, and 78), in addition to strains of non-hospital origin, group in the same clade (referred to as the HA clade) and are evolutionally considerably more closely related to each other by phylogenetic and gene content similarity analyses than to isolates in the community-associated (CA) clade, with approximately a 3–4% average nucleotide sequence difference between the two clades at the core genome level. Our study also revealed that many genomic loci in the TX16 genome are unique to the HA clade. 380 ORFs in TX16 are HA-clade specific and antibiotic resistance genes are enriched in HA-clade strains. 
Mobile elements such as IS16 and transposons were also found almost exclusively in HA strains

  15. A novel statistic for genome-wide interaction analysis.

    Directory of Open Access Journals (Sweden)

    Xuesen Wu

    2010-09-01

    Full Text Available Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interaction analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001... Genome-wide interaction analysis is a valuable tool for finding remaining missing heritability unexplained by the current GWAS, and the developed novel statistic is able to search significant interaction between SNPs across the genome. Real data analysis showed that the results of genome-wide interaction analysis can be replicated in two independent studies.

  16. The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    Directory of Open Access Journals (Sweden)

    Nicholas R Thomson

    2006-12-01

    Full Text Available The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype O:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. 
Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common

  17. Pan-genome sequence analysis using Panseq: an online tool for the rapid analysis of core and accessory genomic regions

    Directory of Open Access Journals (Sweden)

    Villegas Andre

    2010-09-01

    Full Text Available Abstract Background The pan-genome of a bacterial species consists of a core and an accessory gene pool. The accessory genome is thought to be an important source of genetic variability in bacterial populations and is gained through lateral gene transfer, allowing subpopulations of bacteria to better adapt to specific niches. Low-cost and high-throughput sequencing platforms have created an exponential increase in genome sequence data and an opportunity to study the pan-genomes of many bacterial species. In this study, we describe a new online pan-genome sequence analysis program, Panseq. Results Panseq was used to identify Escherichia coli O157:H7 and E. coli K-12 genomic islands. Within a population of 60 E. coli O157:H7 strains, the existence of 65 accessory genomic regions identified by Panseq analysis was confirmed by PCR. The accessory genome and binary presence/absence data, and core genome and single nucleotide polymorphisms (SNPs) of six L. monocytogenes strains were extracted with Panseq and hierarchically clustered and visualized. The nucleotide core and binary accessory data were also used to construct maximum parsimony (MP) trees, which were compared to the MP tree generated by multi-locus sequence typing (MLST). The topology of the accessory and core trees was identical but differed from the tree produced using seven MLST loci. The Loci Selector module found the most variable and discriminatory combinations of four loci within a 100 loci set among 10 strains in 1 s, compared to the 449 s required to exhaustively search for all possible combinations; it also found the most discriminatory 20 loci from a 96 loci E. coli O157:H7 SNP dataset. Conclusion Panseq determines the core and accessory regions among a collection of genomic sequences based on user-defined parameters.
It readily extracts regions unique to a genome or group of genomes, identifies SNPs within shared core genomic regions, constructs files for use in phylogeny programs
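
    The core/accessory split described in the Panseq record above can be illustrated with a minimal Python sketch. It assumes a binary presence/absence table is already available and treats a region as "core" only when it occurs in every genome; Panseq itself works from user-defined parameters, so this strict rule, the function name and the toy data are all illustrative assumptions rather than the tool's actual behaviour.

```python
from typing import Dict, List, Tuple


def split_core_accessory(
    presence: Dict[str, Dict[str, bool]]
) -> Tuple[List[str], List[str]]:
    """Split regions into core and accessory lists.

    presence[region][genome] is True when the region was found in that genome.
    Assumption: "core" means present in all genomes; "accessory" means present
    in at least one genome but not all. Regions found nowhere are ignored.
    """
    core: List[str] = []
    accessory: List[str] = []
    for region, by_genome in sorted(presence.items()):
        if all(by_genome.values()):
            core.append(region)
        elif any(by_genome.values()):
            accessory.append(region)
    return core, accessory


# Hypothetical toy table: three regions scored across three genomes.
presence = {
    "region_A": {"g1": True, "g2": True, "g3": True},
    "region_B": {"g1": True, "g2": False, "g3": True},
    "region_C": {"g1": False, "g2": False, "g3": True},
}
core, accessory = split_core_accessory(presence)
print("core:", core)
print("accessory:", accessory)
```

    A relaxed threshold (for example, present in at least some fraction of genomes) could replace the `all(...)` test where a stricter or looser definition of the core is wanted.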

  18. Differential DNA Methylation Analysis without a Reference Genome

    Directory of Open Access Journals (Sweden)

    Johanna Klughammer

    2015-12-01

    Full Text Available Genome-wide DNA methylation mapping uncovers epigenetic changes associated with animal development, environmental adaptation, and species evolution. To address the lack of high-throughput methods for DNA methylation analysis in non-model organisms, we developed an integrated approach for studying DNA methylation differences independent of a reference genome. Experimentally, our method relies on an optimized 96-well protocol for reduced representation bisulfite sequencing (RRBS), which we have validated in nine species (human, mouse, rat, cow, dog, chicken, carp, sea bass, and zebrafish). Bioinformatically, we developed the RefFreeDMA software to deduce ad hoc genomes directly from RRBS reads and to pinpoint differentially methylated regions between samples or groups of individuals (http://RefFreeDMA.computational-epigenetics.org). The identified regions are interpreted using motif enrichment analysis and/or cross-mapping to annotated genomes. We validated our method by reference-free analysis of cell-type-specific DNA methylation in the blood of human, cow, and carp. In summary, we present a cost-effective method for epigenome analysis in ecology and evolution, which enables epigenome-wide association studies in natural populations and species without a reference genome.

  19. Meta-analysis for pathway enrichment analysis when combining multiple genomic studies

    OpenAIRE

    Shen, Kui; Tseng, George C.

    2010-01-01

    Motivation: Many pathway analysis (or gene set enrichment analysis) methods have been developed to identify enriched pathways under different biological states within a genomic study. As more and more microarray datasets accumulate, meta-analysis methods have also been developed to integrate information among multiple studies. Currently, most meta-analysis methods for combining genomic studies focus on biomarker detection and meta-analysis for pathway analysis has not been systematically purs...

  20. Genomic Analysis of Companion Rabbit Staphylococcus aureus

    Science.gov (United States)

    Holmes, Mark A.; Harrison, Ewan M.; Fisher, Elizabeth A.; Graham, Elizabeth M.; Parkhill, Julian; Foster, Geoffrey; Paterson, Gavin K.

    2016-01-01

    In addition to being an important human pathogen, Staphylococcus aureus is able to cause a variety of infections in numerous other host species. While the S. aureus strains causing infection in several of these hosts have been well characterised, this is not the case for companion rabbits (Oryctolagus cuniculus), where little data are available on S. aureus strains from this host. To address this deficiency we have performed antimicrobial susceptibility testing and genome sequencing on a collection of S. aureus isolates from companion rabbits. The findings show a diverse S. aureus population is able to cause infection in this host, and while antimicrobial resistance was uncommon, the isolates possess a range of known and putative virulence factors consistent with a diverse clinical presentation in companion rabbits including severe abscesses. We additionally show that companion rabbit isolates carry polymorphisms within dltB as described as underlying host-adaption of S. aureus to farmed rabbits. The availability of S. aureus genome sequences from companion rabbits provides an important aid to understanding the pathogenesis of disease in this host and in the clinical management and surveillance of these infections. PMID:26963381

  1. Genome analysis and comparative genomics of a Giardia intestinalis assemblage E isolate

    Directory of Open Access Journals (Sweden)

    Andersson Jan O

    2010-10-01

    Full Text Available Abstract Background Giardia intestinalis is a protozoan parasite that causes diarrhea in a wide range of mammalian species. To further understand the genetic diversity between the Giardia intestinalis species, we have performed genome sequencing and analysis of a wild-type Giardia intestinalis sample from the assemblage E group, isolated from a pig. Results We identified 5012 protein coding genes, the majority of which are conserved compared to the previously sequenced genomes of the WB and GS strains in terms of microsynteny and sequence identity. Despite this, there is an unexpectedly large number of chromosomal rearrangements and several smaller structural changes that are present in all chromosomes. Novel members of the VSP, NEK Kinase and HCMP gene families were identified, which may reveal possible mechanisms for host specificity and new avenues for antigenic variation. We used comparative genomics of the three diverse Giardia intestinalis isolates P15, GS and WB to define a core proteome for this species complex and to identify lineage-specific genes. Extensive analyses of polymorphisms in the core proteome of Giardia revealed differential rates of divergence among cellular processes. Conclusions Our results indicate that despite a well conserved core of genes there is significant genome variation between Giardia isolates, both in terms of gene content, gene polymorphisms, structural chromosomal variations and surface molecule repertoires. This study improves the annotation of the Giardia genomes and enables the identification of functionally important variation.

  2. Sequencing and Analysis of a Genomic Fragment Provide an Insight into the Dunaliella viridis Genomic Sequence

    Institute of Scientific and Technical Information of China (English)

    Xiao-Ming SUN; Yuan-Ping TANG; Xiang-Zong MENG; Wen-Wen ZHANG; Shan LI; Zhi-Rui DENG; Zheng-Kai XU; Ren-Tao SONG

    2006-01-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green algae. Its exceptional resistance to salt and various other stresses has made it an ideal model for stress tolerance studies. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large amounts of simple sequence repeats (microsatellites) were found, with a strong bias towards the (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (total 25,749 bp) was in agreement with these findings. These sequence features made it difficult to sequence Dunaliella genomic sequences. Further investigation should be made to reveal the biological significance of these unique sequence features.
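
    The two sequence statistics reported in the record above, GC content and the prevalence of (AC)n microsatellites, can be computed with a short Python sketch. The minimum repeat count, the function names and the toy sequence are assumptions chosen for illustration; the record does not state the thresholds the authors used.

```python
import re
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def find_dinucleotide_ssrs(seq: str, min_repeats: int = 4) -> List[Tuple[int, str, int]]:
    """Find dinucleotide simple sequence repeats.

    Returns (start, unit, repeat_count) tuples for every run of at least
    min_repeats copies of a two-base unit. min_repeats=4 is an assumed
    threshold, not one taken from the record.
    """
    # A two-base unit followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_repeats - 1))
    hits: List[Tuple[int, str, int]] = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if unit[0] == unit[1]:
            continue  # homopolymer run such as AAAAAAAA, not a dinucleotide SSR
        hits.append((match.start(), unit, len(match.group(0)) // 2))
    return hits


# Hypothetical toy sequence containing one (AC)5 repeat.
toy = "GGACACACACACTTGCGC"
print("GC%%: %.1f" % gc_content(toy))
print("SSRs:", find_dinucleotide_ssrs(toy))
```

    The share of (AC)n repeats among all detected repeats would then follow by counting the hits whose unit is AC or its equivalents (CA, GT, TG) and dividing by the total.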

  3. Sequencing and analysis of a genomic fragment provide an insight into the Dunaliella viridis genomic sequence.

    Science.gov (United States)

    Sun, Xiao-Ming; Tang, Yuan-Ping; Meng, Xiang-Zong; Zhang, Wen-Wen; Li, Shan; Deng, Zhi-Rui; Xu, Zheng-Kai; Song, Ren-Tao

    2006-11-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green algae. Its exceptional resistance to salt and various other stresses has made it an ideal model for stress tolerance studies. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large amounts of simple sequence repeats (microsatellites) were found, with a strong bias towards the (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (total 25,749 bp) was in agreement with these findings. These sequence features made it difficult to sequence Dunaliella genomic sequences. Further investigation should be made to reveal the biological significance of these unique sequence features. PMID:17091199

  4. StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform.

    Science.gov (United States)

    Zheng, Wenning; Tan, Tze King; Paterson, Ian C; Mutha, Naresh V R; Siow, Cheuk Chuen; Tan, Shi Yang; Old, Lesley A; Jakubovics, Nicholas S; Choo, Siew Woh

    2016-01-01

    The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE) and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC) tool and Pathogenomic Profiling Tool (PathoProT), which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my. PMID:27138013

  5. StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform

    Science.gov (United States)

    Zheng, Wenning; Paterson, Ian C.; Mutha, Naresh V. R.; Siow, Cheuk Chuen; Tan, Shi Yang; Old, Lesley A.; Jakubovics, Nicholas S.; Choo, Siew Woh

    2016-01-01

    The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE) and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC) tool and Pathogenomic Profiling Tool (PathoProT), which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my. PMID:27138013

  6. StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform.

    Directory of Open Access Journals (Sweden)

    Wenning Zheng

    Full Text Available The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE) and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC) tool and Pathogenomic Profiling Tool (PathoProT), which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my.

  7. Genome-wide identification of the regulatory targets of a transcription factor using biochemical characterization and computational genomic analysis

    Directory of Open Access Journals (Sweden)

    Jolly Emmitt R

    2005-11-01

    Full Text Available Abstract Background A major challenge in computational genomics is the development of methodologies that allow accurate genome-wide prediction of the regulatory targets of a transcription factor. We present a method for target identification that combines experimental characterization of binding requirements with computational genomic analysis. Results Our method identified potential target genes of the transcription factor Ndt80, a key transcriptional regulator involved in yeast sporulation, using the combined information of binding affinity, positional distribution, and conservation of the binding sites across multiple species. We have also developed a mathematical approach to compute the false positive rate and the total number of targets in the genome based on the multiple selection criteria. Conclusion We have shown that combining biochemical characterization and computational genomic analysis leads to accurate identification of the genome-wide targets of a transcription factor. The method can be extended to other transcription factors and can complement other genomic approaches to transcriptional regulation.

  8. Coevolution of aah: A dps-Like Gene with the Host Bacterium Revealed by Comparative Genomic Analysis

    Directory of Open Access Journals (Sweden)

    Liyan Ping

    2012-01-01

    Full Text Available A protein named AAH was isolated from the bacterium Microbacterium arborescens SE14, a gut commensal of lepidopteran larvae. It showed not only a high sequence similarity to Dps-like proteins (DNA-binding proteins from starved cells) but also reversible hydrolase activity. A comparative genomic analysis was performed to gain more insight into its evolution. The GC profile of the aah gene indicated that it evolved from a low-GC ancestor. Its stop codon usage also differed from the general pattern of Actinobacterial genomes. The phylogeny of dps-like proteins showed a strong correlation with the phylogeny of the host bacteria. A conserved genomic synteny was identified in some taxonomically related Actinobacteria, suggesting that the ancestor genes had been incorporated into the genome before the divergence of Micrococcineae from other families. The aah gene had evolved a new function but still retained the typical dodecameric structure.

  9. Comparative genomics of Mycoplasma: analysis of conserved essential genes and diversity of the pan-genome.

    Directory of Open Access Journals (Sweden)

    Wei Liu

    Full Text Available Mycoplasma, the smallest self-replicating organism with a minimal metabolism and little genomic redundancy, is expected to be a close approximation to the minimal set of genes needed to sustain bacterial life. This study employs comparative evolutionary analysis of twenty Mycoplasma genomes to gain an improved understanding of essential genes. By analyzing the core genome of mycoplasmas, we identified the conserved essential gene set for mycoplasma survival. Further analysis showed that the core genome set has many characteristics in common with experimentally identified essential genes. Several key genes, which are related to DNA replication and repair and can be disrupted in transposon mutagenesis studies, may be critical for bacterial survival, especially over long periods of natural selection. Phylogenomic reconstructions based on 3,355 homologous groups allowed robust estimation of phylogenetic relatedness among mycoplasma strains. To obtain deeper insight into the relative roles of molecular evolution in pathogen adaptation to their hosts, we also analyzed the positive selection pressures on particular sites and lineages. There appears to be an approximate correlation between the divergence of species and the level of positive selection detected in corresponding lineages.

  10. Evolution Analysis of Simple Sequence Repeats in Plant Genome.

    Directory of Open Access Journals (Sweden)

    Zhen Qin

    Full Text Available Simple sequence repeats (SSRs) are widespread units on genome sequences, and play many important roles in plants. In order to reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analysis of twelve sequenced plant genome sequences. First, in the twelve studied plant genomes, the main SSRs were those which contain repeats of 1-3 nucleotides combination. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With the increase of SSRs repeat number the percentage of A/T in C. reinhardtii had no significant change, while the percentage of A/T in terrestrial plants species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of plant kingdom and the repeat number increased in terrestrial plants species. This trend was more obvious in dicotyledon than monocotyledon. The percentage of CG/GC showed the opposite pattern to the AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T were in a rising trend along with the evolution of plant kingdom; meanwhile with the increase of SSRs repeat number in plants species, different species chose different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed their specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only had the general pattern in the evolution of plant kingdom, but also were associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provided new insights for the analysis of the plant genome evolution.

  11. Sequencing and analysis of the giant panda genome

    Institute of Scientific and Technical Information of China (English)

    YANG HuanMing

    2010-01-01

    The giant panda (Ailuropoda melanoleuca) is loved all over the world and is considered a symbol of China, as illustrated by its being one of the mascots for the Beijing 2008 Olympic Games. It is also one of the world's most endangered animals and a flagship species for conservation. Using next-generation sequencing technology (Illumina Genome Analyzer) and our in-house assembly software, we have generated the first map of the giant panda genome sequence. This map will provide an unparalleled amount of information to aid in understanding the genetic and biological nature of this unique species and will contribute significantly to disease control and conservation efforts for this endangered species. In March 2008, the giant panda genome sequencing and analysis project was started at the Beijing Genomics Institute (BGI) in Shenzhen with collaborators from the Kunming Institute of Zoology and the Chengdu Research Base of Giant Panda Breeding. On 21 Jan. 2010, this collaboration resulted in the publication, as a cover story in the journal Nature, of the sequencing and analysis of the giant panda genome.

  12. Comparative analysis of methods for genome-wide nucleosome cartography.

    Science.gov (United States)

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use. PMID:25296770

  13. Primer to analysis of genomic data using R

    CERN Document Server

    Gondro, Cedric

    2015-01-01

    Through this book, researchers and students will learn to use R for analysis of large-scale genomic data and how to create routines to automate analytical steps. The philosophy behind the book is to start with real world raw datasets and perform all the analytical steps needed to reach final results. Though theory plays an important role, this is a practical book for advanced undergraduate and graduate classes in bioinformatics, genomics and statistical genetics or for use in lab sessions. This book is also designed to be used by students in computer science and statistics who want to learn the practical aspects of genomic analysis without delving into algorithmic details. The datasets used throughout the book may be downloaded from the publisher’s website.  Chapters show how to handle and manage high-throughput genomic data, create automated workflows and speed up analyses in R. A wide range of R packages useful for working with genomic data are illustrated with practical examples. In recent years R has b...

  14. Genome-wide Studies of Mycolic Acid Bacteria: Computational Identification and Analysis of a Minimal Genome

    KAUST Repository

    Kamanu, Frederick Kinyua

    2012-12-01

    The mycolic acid bacteria are a distinct suprageneric group of asporogenous Gram-positive, high GC-content bacteria, distinguished by the presence of mycolic acids in their cell envelope. They exhibit great diversity in their cell and morphology; although primarily non-pathogens, this group contains three major pathogens: Mycobacterium leprae, Mycobacterium tuberculosis complex, and Corynebacterium diphtheriae. Although the mycolic acid bacteria are a clearly defined group of bacteria, the taxonomic relationships between its constituent genera and species are less well defined. Two approaches were tested for their suitability in describing the taxonomy of the group. First, a Multilocus Sequence Typing (MLST) experiment was assessed and found to be superior to monophyletic (16S small ribosomal subunit) in delineating a total of 52 mycolic acid bacterial species. Phylogenetic inference was performed using the neighbor-joining method. To further refine phylogenetic analysis and to take advantage of the widespread availability of bacterial genome data, a computational framework that simulates DNA-DNA hybridisation was developed and validated using multiscale bootstrap resampling. The tool classifies microbial genomes based on whole genome DNA, and was deployed as a web-application using PHP and Javascript. It is accessible online at http://cbrc.kaust.edu.sa/dna_hybridization/. A third study applied computational and statistical methods to the identification and analysis of a putative minimal mycolic acid bacterial genome so as to better understand (1) the genomic requirements to encode a mycolic acid bacterial cell and (2) the role and type of genes and genetic elements that lead to the massive increase in genome size in environmental mycolic acid bacteria. Using a reciprocal comparison approach, a total of 690 orthologous gene clusters forming a putative minimal genome were identified across 24 mycolic acid bacterial species. In order to identify new potential drug
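
    The "reciprocal comparison approach" mentioned in the record above is not specified further in this excerpt. One common reading is reciprocal best hits between pairs of genomes, sketched below in Python under that assumption. The best-hit tables, gene names and function name are hypothetical; in practice the tables would come from a sequence-similarity search.

```python
from typing import Dict, List, Tuple


def reciprocal_best_hits(
    a_to_b: Dict[str, str], b_to_a: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return gene pairs that are each other's best hit.

    a_to_b maps each gene in genome A to its best-scoring match in genome B;
    b_to_a is the reverse table. A pair (a, b) is kept only when
    a_to_b[a] == b and b_to_a[b] == a.
    """
    return sorted((a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a)


# Hypothetical best-hit tables for two small genomes.
a_to_b = {"a1": "b1", "a2": "b2", "a3": "b2"}
b_to_a = {"b1": "a1", "b2": "a2", "b3": "a3"}
pairs = reciprocal_best_hits(a_to_b, b_to_a)
print(pairs)
```

    Extending this to many species, for example by keeping only genes whose reciprocal pairs link every genome, would give one possible approximation of a shared gene set; whether the thesis built its 690 clusters this way is not stated here.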

  15. Flow cytometric analysis of RNA synthesis by detection of bromouridine incorporation

    DEFF Research Database (Denmark)

    Larsen, J K; Jensen, Peter Østrup; Larsen, J

    2001-01-01

    RNA synthesis has traditionally been investigated by a laborious and time-consuming radiographic method involving incorporation of tritiated uridine. Now a faster non-radioactive alternative has emerged, based on immunocytochemical detection. This method utilizes the brominated RNA precursor...... bromouridine, which is taken into a cell, phosphorylated, and incorporated into nascent RNA. The BrU-substituted RNA is detected by permeabilizing the cells and staining with certain anti-BrdU antibodies. This dynamic approach yields information complementing that provided by cellular RNA content analysis at a...

  16. Genome-Wide Association Analysis in Primary Sclerosing Cholangitis

    NARCIS (Netherlands)

    Karlsen, Tom H.; Franke, Andre; Melum, Espen; Kaser, Arthur; Hov, Johannes Roksund; Balschun, Tobias; Lie, Benedicte A.; Bergquist, Annika; Schramm, Christoph; Weismueller, Tobias J.; Gotthardt, Daniel; Rust, Christian; Philipp, Eva E. R.; Fritz, Teresa; Henckaerts, Liesbet; Weersma, Rinse K.; Stokkers, Pieter; Ponsioen, Cyriel Y.; Wijmenga, Cisca; Sterneck, Martina; Nothnagel, Michael; Hampe, Jochen; Teufel, Andreas; Runz, Heiko; Rosenstiel, Philip; Stiehl, Adolf; Vermeire, Severine; Beuers, Ulrich; Manns, Michael P.; Schrumpf, Erik; Boberg, Kirsten Muri; Schreiber, Stefan

    2010-01-01

    BACKGROUND & AIMS: We aimed to characterize the genetic susceptibility to primary sclerosing cholangitis (PSC) by means of a genome-wide association analysis of single nucleotide polymorphism (SNP) markers. METHODS: A total of 443,816 SNPs on the Affymetrix SNP Array 5.0 (Affymetrix, Santa Clara, CA

  17. Comparative genome analysis and genome-guided physiological analysis of Roseobacter litoralis

    Directory of Open Access Journals (Sweden)

    Simon Meinhard

    2011-06-01

    Full Text Available Abstract Background Roseobacter litoralis OCh149, the type species of the genus, and Roseobacter denitrificans OCh114 were the first described organisms of the Roseobacter clade, an ecologically important group of marine bacteria. Both species were isolated from seaweed and are able to perform aerobic anoxygenic photosynthesis. Results The genome of R. litoralis OCh149 contains one circular chromosome of 4,505,211 bp and three plasmids of 93,578 bp (pRLO149_94), 83,129 bp (pRLO149_83) and 63,532 bp (pRLO149_63). Of the 4537 genes predicted for R. litoralis, 1122 (24.7%) are not present in the genome of R. denitrificans. Many of the unique genes of R. litoralis are located in genomic islands and on plasmids. On pRLO149_83 several potential heavy metal resistance genes are encoded which are not present in the genome of R. denitrificans. The comparison of the heavy metal tolerance of the two organisms showed an increased zinc tolerance of R. litoralis. In contrast to R. denitrificans, the photosynthesis genes of R. litoralis are plasmid encoded. The activity of the photosynthetic apparatus was confirmed by respiration rate measurements, indicating a growth-phase dependent response to light. Comparative genomics with other members of the Roseobacter clade revealed several genomic regions that were only conserved in the two Roseobacter species. One of those regions encodes a variety of genes that might play a role in host association of the organisms. The catabolism of different carbon and nitrogen sources was predicted from the genome and combined with experimental data. In several cases, e.g. the degradation of some algal osmolytes and sugars, the genome-derived predictions of the metabolic pathways in R. litoralis differed from the phenotype. Conclusions The genomic differences between the two Roseobacter species are mainly due to lateral gene transfer and genomic rearrangements. Plasmid pRLO149_83 contains predominantly recently acquired genetic

  18. Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method

    Directory of Open Access Journals (Sweden)

    Huang Desheng

    2009-07-01

    Full Text Available Abstract Background A reliable and precise classification is essential for successful diagnosis and treatment of cancer. Gene expression microarrays have provided a high-throughput platform for discovering genomic biomarkers for cancer diagnosis and prognosis. Rational use of the available bioinformation can not only effectively remove or suppress noise in gene chips, but also avoid one-sided results from separate experiments. However, only some studies have recognized the importance of prior information in cancer classification. Methods Using the support vector machine as the discriminant approach, we propose a modified method that incorporates prior knowledge into cancer classification based on gene expression data to improve accuracy. A well-known public dataset, the malignant pleural mesothelioma and lung adenocarcinoma gene expression database, was used in this study. Prior knowledge is viewed here as a means of directing the classifier using known lung adenocarcinoma related genes. The procedures were performed with the software R 2.80. Results The modified method performed better after incorporating prior knowledge. Accuracy of the modified method improved from 98.86% to 100% in the training set and from 98.51% to 99.06% in the test set. The standard deviations of the modified method decreased from 0.26% to 0 in the training set and from 3.04% to 2.10% in the test set. Conclusion A method that incorporates prior knowledge into discriminant analysis can effectively improve classification capacity and reduce the impact of noise. This idea may hold promise not only in practice but also in methodology.

  19. Detecting Genomic Signatures of Natural Selection with Principal Component Analysis: Application to the 1000 Genomes Data.

    Science.gov (United States)

    Duforet-Frebourg, Nicolas; Luu, Keurcien; Laval, Guillaume; Bazin, Eric; Blum, Michael G B

    2016-04-01

    To characterize natural selection, various analytical methods for detecting candidate genomic regions have been developed. We propose to perform genome-wide scans of natural selection using principal component analysis (PCA). We show that the common FST index of genetic differentiation between populations can be viewed as the proportion of variance explained by the principal components. Considering the correlations between genetic variants and each principal component provides a conceptual framework to detect genetic variants involved in local adaptation without any prior definition of populations. To validate the PCA-based approach, we consider the 1000 Genomes data (phase 1), comprising 850 individuals from Africa, Asia, and Europe. The number of genetic variants is of the order of 36 million, obtained with a low-coverage sequencing depth (3×). The correlations between genetic variation and each principal component provide well-known targets for positive selection (EDAR, SLC24A5, SLC45A2, DARC), and also new candidate genes (APPBPP2, TP1A1, RTTN, KCNMA, MYO5C) and noncoding RNAs. In addition to identifying genes involved in biological adaptation, we identify two biological pathways involved in polygenic adaptation that are related to the innate immune system (beta defensins) and to lipid metabolism (fatty acid omega oxidation). An additional analysis of European data shows that a genome scan based on PCA retrieves classical examples of local adaptation even when there are no well-defined populations. PCA-based statistics, implemented in the PCAdapt R package and the PCAdapt fast open-source software, retrieve well-known signals of human adaptation, which is encouraging for future whole-genome sequencing projects, especially when defining populations is difficult. PMID:26715629
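
    The correlation-based scan described in this abstract can be sketched in a few lines. The following is a minimal illustration, assuming a small in-memory matrix of allele counts (individuals by variants); the function name `pc_correlations` and the toy data are hypothetical, and this is not the PCAdapt implementation.

```python
import numpy as np


def pc_correlations(genotypes, n_components=2):
    """Correlate every variant with the leading principal components.

    genotypes: array-like of shape (individuals, variants) holding allele
    counts. Returns an array of shape (variants, n_components) of Pearson
    correlations between each variant and each PC score vector.
    """
    G = np.asarray(genotypes, dtype=float)
    centered = G - G.mean(axis=0)

    # PCA via SVD of the centered matrix; U * S gives the PC scores.
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]

    # Pearson correlation = covariance / (sd of variant * sd of PC score).
    cov = centered.T @ scores / G.shape[0]
    sd_variant = centered.std(axis=0)
    sd_pc = scores.std(axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        corr = cov / np.outer(sd_variant, sd_pc)

    # Monomorphic variants have zero variance; report them as 0.
    return np.nan_to_num(corr)


# Hypothetical toy data: variant 0 separates the first three individuals
# from the last three, so it should correlate strongly with PC1.
toy = [[0, 0, 1], [0, 1, 0], [0, 0, 1], [2, 1, 0], [2, 0, 1], [2, 1, 0]]
corr = pc_correlations(toy, n_components=2)
```

    In a scan of this kind, variants with the largest absolute correlation to a given component would be the candidates to inspect; no population labels are supplied at any point.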

  20. Comparative analysis of whole genome structure of Streptococcus suis using whole genome PCR scanning

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China in 2005. The outbreak is atypical for the apparent large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium has changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScanning), comparative sequencing of several known virulence factor genes and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes are identical in the four strains and MLST results showed that the four isolates studied and reference strain all belonged to the ST1 complex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.

  2. BioMet Toolbox: genome-wide analysis of metabolism

    DEFF Research Database (Denmark)

    Cvijovic, M.; Olivares Hernandez, Roberto; Agren, R.;

    2010-01-01

    The rapid progress of molecular biology tools for directed genetic modifications, accurate quantitative experimental approaches, high-throughput measurements, together with development of genome sequencing has made the foundation for a new area of metabolic engineering that is driven by metabolic... -based resource for stoichiometric analysis and for integration of transcriptome and interactome data, thereby exploiting the capabilities of genome-scale metabolic models. The BioMet Toolbox provides an effective user-friendly way to perform linear programming simulations towards maximized or minimized growth...

  3. Genomic-Wide Analysis with Microarrays in Human Oncology

    Directory of Open Access Journals (Sweden)

    Kenichi Inaoka

    2015-10-01

    Full Text Available DNA microarray technologies have advanced rapidly and had a profound impact on examining gene expression on a genomic scale in research. This review discusses the history and development of microarray and DNA chip devices, and specific microarrays are described along with their methods and applications. In particular, microarrays have detected many novel cancer-related genes by comparing cancer tissues and non-cancerous tissues in oncological research. Recently, new methods have been in development, such as the double-combination array and triple-combination array, which allow more effective analysis of gene expression and epigenetic changes. Analysis of gene expression alterations in precancerous regions compared with normal regions and array analysis in drug-resistant cancer tissues have also been performed successfully. Although next-generation sequencing is a similar method of genome analysis, several important differences distinguish the two techniques and their applications. Development of novel microarray technologies is expected to contribute to further cancer research.

  4. Genome Assembly and Computational Analysis Pipelines for Bacterial Pathogens

    KAUST Repository

    Rangkuti, Farania Gama Ardhina

    2011-06-01

    Pathogens lie behind the deadliest pandemics in history. To date, the AIDS pandemic has resulted in more than 25 million fatal cases, while tuberculosis and malaria annually claim more than 2 million lives. Comparative genomic analyses are needed to gain insights into the molecular mechanisms of pathogens, but the abundance of biological data dictates that such studies cannot be performed without the assistance of computational approaches. This explains the significant need for computational pipelines for genome assembly and analyses. The aim of this research is to develop such pipelines. This work utilizes various bioinformatics approaches to analyze the high-throughput genomic sequence data that has been obtained from several strains of bacterial pathogens. A pipeline has been compiled for quality control for sequencing and assembly, and several protocols have been developed to detect contaminations. Visualizations of genomic data have been generated in various formats, in addition to alignment, homology detection and sequence variant detection. We have also implemented a metaheuristic algorithm that significantly improves bacterial genome assemblies compared to other known methods. Experiments on Mycobacterium tuberculosis H37Rv data showed that our method resulted in an improvement in N50 value of up to 9697% while consistently maintaining high accuracy, covering around 98% of the published reference genome. Other improvement efforts were also implemented, consisting of iterative local assemblies and iterative correction of contiguated bases. Our result expedites the genomic analysis of virulent genes up to single base pair resolution. It is also applicable to virtually every pathogenic microorganism, propelling further research in the control of and protection from pathogen-associated diseases.
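
    Because the headline result of this record is stated as an N50 improvement, a short sketch of how N50 is usually computed may help. This is an illustration under the standard definition of N50, not the thesis's own code; the function names and the contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


def percent_improvement(before, after):
    """Relative change of `after` over `before`, in percent."""
    return 100.0 * (after - before) / before


# Hypothetical contig lengths, for illustration only.
example_n50 = n50([80, 70, 50, 40, 30, 20])
```

    Read this way, an improvement of 9697% means the N50 grew to roughly 98 times its starting value.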

  5. Sequencing and Analysis of Neanderthal Genomic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith,Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo,Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2006-06-13

    Recovery and analysis of multiple Neanderthal autosomal sequences using a metagenomic approach reveals that modern humans and Neanderthals split ~400,000 years ago, without significant evidence of subsequent admixture.

  6. GWAMA: software for genome-wide association meta-analysis

    OpenAIRE

    Mägi Reedik; Morris Andrew P

    2010-01-01

    Abstract Background Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way to improve power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical software analysis packages in...
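
    To make concrete why pooling studies increases power, here is a minimal sketch of one common way to combine per-study effect estimates for a single variant: fixed-effect inverse-variance weighting. The truncated abstract does not say which estimators GWAMA implements, so the choice of method, the function name and the numbers below are all assumptions.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Combine per-study effect sizes for one variant.

    Each study is weighted by the inverse of its variance. Returns the
    pooled effect, its standard error and the resulting z-score.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Two hypothetical studies reporting the same effect with the same precision.
beta, se, z = fixed_effect_meta([0.2, 0.2], [0.1, 0.1])
```

    With two equally precise studies the pooled standard error shrinks by a factor of the square root of two, so the z-score rises even though the effect estimate is unchanged.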

  7. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor; Hibbett, David

    2011-03-14

    Fungi of the phylum Basidiomycota make up some 37 percent of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a 'core proteome' based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.

  8. Genome analysis of the platypus reveals unique signatures of evolution

    Science.gov (United States)

    Warren, Wesley C.; Hillier, LaDeana W.; Marshall Graves, Jennifer A.; Birney, Ewan; Ponting, Chris P.; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T.; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P.; Miethke, Pat; Waters, Paul D.; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S.; López-Otín, Carlos; Ordóñez, Gonzalo R.; Eichler, Evan E.; Chen, Lin; Cheng, Ze; Deakin, Janine E.; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T.; Wakefield, Matthew J.; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A.; Smit, Arian F. A.; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A.; Walker, Jerilyn A.; Konkel, Miriam K.; Harris, Robert S.; Whittington, Camilla M.; Wong, Emily S. W.; Gemmell, Neil J.; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M.; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P.; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J.; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M.; Sharp, Julie A.; Nicholas, Kevin R.; Ray, David A.; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H.; Taylor, James; Jones, Russell C.; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N.; Pohl, Craig S.; Smith, Scott M.; Hou, Shunfeng; Renfree, Marilyn B.; Mardis, Elaine R.; Wilson, Richard K.

    2009-01-01

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation. PMID:18464734

  9. Genome analysis of the platypus reveals unique signatures of evolution.

    Science.gov (United States)

    Warren, Wesley C; Hillier, LaDeana W; Marshall Graves, Jennifer A; Birney, Ewan; Ponting, Chris P; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P; Miethke, Pat; Waters, Paul D; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S; López-Otín, Carlos; Ordóñez, Gonzalo R; Eichler, Evan E; Chen, Lin; Cheng, Ze; Deakin, Janine E; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T; Wakefield, Matthew J; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A; Smit, Arian F A; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A; Walker, Jerilyn A; Konkel, Miriam K; Harris, Robert S; Whittington, Camilla M; Wong, Emily S W; Gemmell, Neil J; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M; Sharp, Julie A; Nicholas, Kevin R; Ray, David A; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H; Taylor, James; Jones, Russell C; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N; Pohl, Craig S; Smith, Scott M; Hou, Shunfeng; Nefedov, Mikhail; de Jong, Pieter J; Renfree, Marilyn B; Mardis, Elaine R; Wilson, Richard K

    2008-05-01

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation. PMID:18464734

  10. Comparative Analysis of Genome Diversity in Bullmastiff Dogs.

    Science.gov (United States)

    Mortlock, Sally-Anne; Khatkar, Mehar S; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from year 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH), as a proportion of the genome, averaged 16.44%; and effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579
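
    Two of the molecular summaries quoted in this abstract are simple per-animal proportions. The sketch below shows how they might be computed for one dog, assuming genotypes coded as allele counts (0, 1, 2, with None for a missing call) and ROH segments given as non-overlapping (start, end) base-pair intervals; the function names and inputs are hypothetical and this is not the pipeline used in the study.

```python
def multilocus_heterozygosity(genotypes):
    """Fraction of successfully typed SNPs that are heterozygous.

    genotypes: iterable of allele counts (0, 1 or 2), None if missing.
    """
    typed = [g for g in genotypes if g is not None]
    if not typed:
        raise ValueError("no typed loci")
    return sum(1 for g in typed if g == 1) / len(typed)


def roh_proportion(roh_segments, genome_length):
    """Fraction of the genome covered by runs of homozygosity.

    roh_segments: non-overlapping (start, end) intervals in base pairs.
    """
    covered = sum(end - start for start, end in roh_segments)
    return covered / genome_length


# Hypothetical single-animal inputs, for illustration only.
mlh = multilocus_heterozygosity([0, 1, 2, 1, None])
froh = roh_proportion([(0, 10), (20, 25)], 100)
```

    Breed-level figures such as those reported here would then be averages of these per-animal values across the genotyped dogs.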

  11. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome.

    OpenAIRE

    Byrappa Venkatesh; Kirkness, Ewen F.; Yong-Hwee Loh; Halpern, Aaron L; Lee, Alison P.; Justin Johnson; Nidhi Dandona; Viswanathan, Lakshmi D; Alice Tay; J Craig Venter; Strausberg, Robert L; Sydney Brenner

    2007-01-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of t...

  12. Hierarchical structure analysis describing abnormal base composition of genomes

    Science.gov (United States)

    Ouyang, Zhengqing; Liu, Jian-Kun; She, Zhen-Su

    2005-10-01

    Abnormal base compositional patterns of genomic DNA sequences are studied in the framework of a hierarchical structure (HS) model originally proposed for the study of fully developed turbulence [She and Lévêque, Phys. Rev. Lett. 72, 336 (1994)]. The HS similarity law is verified over scales between 10^3 bp and 10^5 bp, and the HS parameter β is proposed to describe the degree of heterogeneity in the base composition patterns. More than one hundred bacteria, archaea, virus, yeast, and human genome sequences have been analyzed and the results show that the HS analysis efficiently captures abnormal base composition patterns, and the parameter β is a characteristic measure of the genome. Detailed examination of the values of β reveals an intriguing link to the evolutionary events of genetic material transfer. Finally, a sequence complexity (S) measure is proposed to characterize gradual increase of organizational complexity of the genome during the evolution. The present study raises several interesting issues in the evolutionary history of genomes.

  13. Integrative prescreening in analysis of multiple cancer genomic studies

    Directory of Open Access Journals (Sweden)

    Song Rui

    2012-07-01

    Full Text Available Abstract Background In high throughput cancer genomic studies, results from the analysis of single datasets often suffer from a lack of reproducibility because of small sample sizes. Integrative analysis can effectively pool and analyze multiple datasets and provides a cost effective way to improve reproducibility. In integrative analysis, simultaneously analyzing all genes profiled may incur high computational cost. A computationally affordable remedy is prescreening, which fits marginal models, can be conducted in a parallel manner, and has low computational cost. Results An integrative prescreening approach is developed for the analysis of multiple cancer genomic datasets. Simulation shows that the proposed integrative prescreening has better performance than alternatives, particularly including prescreening with individual datasets, an intensity approach and meta-analysis. We also analyze multiple microarray gene profiling studies on liver and pancreatic cancers using the proposed approach. Conclusions The proposed integrative prescreening provides an effective way to reduce the dimensionality in cancer genomic studies. It can be coupled with existing analysis methods to identify cancer markers.

  14. Sequence analysis reveals mosaic genome of Aichi virus

    Directory of Open Access Journals (Sweden)

    Han Xiaohong

    2011-08-01

    Full Text Available Abstract Aichi virus is a positive-sense, single-stranded RNA virus that has been shown to be associated with diarrhea in children. In the present study, phylogenetic and recombination analysis based on the Aichi virus complete genomes available in GenBank reveals a mosaic genome sequence [GenBank: FJ890523], of which the nt 261-852 region (the nt position was based on the aligned sequence file) shows close relationship with AB010145/Japan with 97.9% sequence identity, while the other genomic regions show close relationship with AY747174/German with 90.1% sequence identity. Our results will provide valuable hints for future research on Aichi virus diversity. Aichi virus is a member of the Kobuvirus genus of the Picornaviridae family [1, 2] and is a positive-sense, single-stranded RNA virus. Its presence in fecal specimens of children suffering from diarrhea has been demonstrated in several Asian countries [3-6], in Brazil and Germany [7], in France [8] and in Tunisia [9]. Some reports showed a high level of seroprevalence in adults [7, 10], suggesting widespread exposure to Aichi virus during childhood. The genome of Aichi virus contains 8,280 nucleotides and a poly(A) tail. The single large open reading frame (nt 713-8014 according to the strain AB010145) encodes a polyprotein of 2,432 amino acids that is cleaved into the typical picornavirus structural proteins VP0, VP3, VP1, and nonstructural proteins 2A, 2B, 2C, 3A, 3B, 3C and 3D [2, 11]. Based on the phylogenetic analysis of 519-bp sequences at the 3C-3D (3CD) junction, Aichi viruses can be divided into two genotypes A and B with approximately 90% sequence homology [12]. Although only six complete genomes of Aichi virus were deposited in GenBank at present, mosaic genomes can be found in strains from different countries.
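
    The region-by-region identity figures quoted above rest on comparing aligned sequences over a window of alignment columns. A minimal sketch of that calculation follows, assuming two rows of an existing multiple alignment and 1-based inclusive coordinates; skipping gapped columns is an assumption of this sketch, and the function name and sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, start, end, gap="-"):
    """Percent identity of two aligned sequences over columns start..end.

    Coordinates are 1-based and inclusive, counted on the alignment.
    Columns where either sequence has a gap are skipped (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a[start - 1:end], aligned_b[start - 1:end]):
        if a == gap or b == gap:
            continue
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns in the requested region")
    return 100.0 * matches / compared


# Hypothetical toy alignment with a single mismatch at column 5.
identity = percent_identity("ACGTACGT", "ACGTTCGT", 1, 8)
```

    A mosaic genome would show up as one window that is markedly closer to one reference strain while the remaining windows are closer to another.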

  15. Genome-wide prediction models that incorporate de novo GWAS are a powerful new tool for tropical rice improvement.

    Science.gov (United States)

    Spindel, J E; Begum, H; Akdemir, D; Collard, B; Redoña, E; Jannink, J-L; McCouch, S

    2016-04-01

    To address the multiple challenges to food security posed by global climate change, population growth and rising incomes, plant breeders are developing new crop varieties that can enhance both agricultural productivity and environmental sustainability. Current breeding practices, however, are unable to keep pace with demand. Genomic selection (GS) is a new technique that helps accelerate the rate of genetic gain in breeding by using whole-genome data to predict the breeding value of offspring. Here, we describe a new GS model that combines RR-BLUP with markers fit as fixed effects selected from the results of a genome-wide-association study (GWAS) on the RR-BLUP training data. We term this model GS + de novo GWAS. In a breeding population of tropical rice, GS + de novo GWAS outperformed six other models for a variety of traits and in multiple environments. On the basis of these results, we propose an extended, two-part breeding design that can be used to efficiently integrate novel variation into elite breeding populations, thus expanding genetic diversity and enhancing the potential for sustainable productivity gains. PMID:26860200
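
    The model structure described in this abstract, a ridge-type random effect for most markers plus a few GWAS-selected markers fit as fixed effects, can be sketched with Henderson's mixed model equations. This is an illustration only: the function name, the shrinkage parameter `lam` (the ratio of residual to marker variance, taken as given here), the toy data, and the choice to remove the fixed markers from the random set are all assumptions rather than details from the paper.

```python
import numpy as np


def gs_with_fixed_markers(y, markers, fixed_idx, lam):
    """Ridge-regression BLUP with selected markers fit as fixed effects.

    y: phenotypes, shape (n,). markers: genotype matrix, shape (n, m).
    fixed_idx: columns treated as unpenalized fixed effects.
    lam: shrinkage applied to the remaining (random) marker effects.
    Returns (fixed effects including intercept, random marker effects).
    """
    y = np.asarray(y, dtype=float)
    M = np.asarray(markers, dtype=float)
    n = M.shape[0]

    X = np.column_stack([np.ones(n), M[:, fixed_idx]])  # intercept + fixed markers
    Z = np.delete(M, fixed_idx, axis=1)                  # remaining markers
    p, q = X.shape[1], Z.shape[1]

    # Henderson's mixed model equations; only the random block is penalized.
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * np.eye(q)]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    solution = np.linalg.solve(lhs, rhs)
    return solution[:p], solution[p:]


# Hypothetical toy data: the phenotype depends only on marker 0.
M = np.array([[0, 1, 2], [1, 0, 1], [2, 2, 0], [0, 1, 1], [1, 2, 2]])
y = 1.0 + 2.0 * M[:, 0]
fixed, random_effects = gs_with_fixed_markers(y, M, [0], lam=5.0)
```

    Predicted breeding values for new lines would then be the fixed part plus the random part applied to their genotypes; in practice the fixed markers would come from a GWAS run on the training data only, as the abstract states.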

  16. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Full Text Available Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, transcriptional consequences from these genomic alterations in cancer genome remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate genomic alterations in cancer genome have diverse transcriptomic effects, and integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genome.

  17. General metabolism of Laribacter hongkongensis: a genome-wide analysis

    Directory of Open Access Journals (Sweden)

    Curreem Shirly O

    2011-04-01

    Full Text Available Abstract Background Laribacter hongkongensis is associated with community-acquired gastroenteritis and traveler's diarrhea. In this study, we performed an in-depth annotation of the genes and pathways of the general metabolism of L. hongkongensis and correlated them with its phenotypic characteristics. Results The L. hongkongensis genome possesses the pentose phosphate and gluconeogenesis pathways and tricarboxylic acid and glyoxylate cycles, but incomplete Embden-Meyerhof-Parnas and Entner-Doudoroff pathways, in agreement with its asaccharolytic phenotype. It contains enzymes for biosynthesis and β-oxidation of saturated fatty acids, biosynthesis of all 20 universal amino acids and selenocysteine, the latter not observed in Neisseria gonorrhoeae, Neisseria meningitidis and Chromobacterium violaceum. The genome contains a variety of dehydrogenases, enabling it to utilize different substrates as electron donors. It encodes three terminal cytochrome oxidases for respiration using oxygen as the electron acceptor under aerobic and microaerophilic conditions and four reductases for respiration with alternative electron acceptors under anaerobic conditions. The presence of complete tetrathionate reductase operon may confer survival advantage in mammalian host in association with diarrhea. The genome contains CDSs for incorporating sulfur and nitrogen by sulfate assimilation, ammonia assimilation and nitrate reduction. The existence of both glutamate dehydrogenase and glutamine synthetase/glutamate synthase pathways suggests an importance of ammonia metabolism in the living environments that it may encounter. Conclusions The L. hongkongensis genome possesses a variety of genes and pathways for carbohydrate, amino acid and lipid metabolism, respiratory chain and sulfur and nitrogen metabolism. These allow the bacterium to utilize various substrates for energy production and survive in different environmental niches.

  18. Large-Scale Comparative Genomics Meta-Analysis of Campylobacter jejuni Isolates Reveals Low Level of Genome Plasticity

    OpenAIRE

    Taboada, Eduardo N.; Acedillo, Rey R; Carrillo, Catherine D.; Findlay, Wendy A.; Medeiros, Diane T.; Mykytczuk, Oksana L; Roberts, Michael J.; Valencia, C. Alexander; Farber, Jeffrey M.; Nash, John H E

    2004-01-01

    We have used comparative genomic hybridization (CGH) on a full-genome Campylobacter jejuni microarray to examine genome-wide gene conservation patterns among 51 strains isolated from food and clinical sources. These data have been integrated with data from three previous C. jejuni CGH studies to perform a meta-analysis that included 97 strains from the four separate data sets. Although many genes were found to be divergent across multiple strains (n = 350), many genes (n = 249) were uniquely ...

  19. Evaluation of a Phylogenetic Marker Based on Genomic Segment B of Infectious Bursal Disease Virus: Facilitating a Feasible Incorporation of this Segment to the Molecular Epidemiology Studies for this Viral Agent.

    Directory of Open Access Journals (Sweden)

    Abdulahi Alfonso-Morales

    Full Text Available Infectious bursal disease (IBD) is a highly contagious and acute viral disease, which has caused high mortality rates in birds and considerable economic losses in different parts of the world for more than two decades, and it still represents a considerable threat to poultry. The current study was designed to rigorously measure the reliability of a phylogenetic marker located in segment B. This marker can facilitate molecular epidemiology studies, incorporating this segment of the viral genome, to better explain the links between emergence, spreading and maintenance of the very virulent IBD virus (vvIBDV) strains worldwide. Sequences of the segment B gene from IBDV strains isolated from diverse geographic locations were obtained from the GenBank Database; Cuban sequences were obtained in the current work. A phylogenetic marker named B-marker was assessed by different phylogenetic principles such as saturation of substitution, phylogenetic noise and high consistency. This last parameter is based on the ability of B-marker to reconstruct the same topology as the complete segment B of the viral genome. From the results obtained from B-marker, the demographic history for both main lineages of IBDV regarding segment B was inferred by Bayesian skyline plot analysis. Phylogenetic analysis for both segments of the IBDV genome was also performed, revealing the presence of a natural reassortant strain with segment A from vvIBDV strains and segment B from non-vvIBDV strains within the Cuban IBDV population. This study contributes to a better understanding of the emergence of vvIBDV strains, describing molecular epidemiology of IBDV using the state-of-the-art methodology concerning phylogenetic reconstruction. This study also revealed the presence of a novel natural reassorted strain as a possible manifestation of change in the genetic structure and stability of the vvIBDV strains.
Therefore, it highlights the need to obtain information about both genome segments of IBDV for

  20. Meta-analysis of genome-wide association from genomic prediction models.

    Science.gov (United States)

    Bernal Rubio, Y L; Gualdrón Duarte, J L; Bates, R O; Ernst, C W; Nonneman, D; Rohrer, G A; King, A; Shackelford, S D; Wheeler, T L; Cantet, R J C; Steibel, J P

    2016-02-01

    Genome-wide association (GWA) studies based on GBLUP models are a common practice in animal breeding. However, effect sizes of GWA tests are small, requiring larger sample sizes to enhance power of detection of rare variants. Because of difficulties in increasing sample size in animal populations, one alternative is to implement a meta-analysis (MA), combining information and results from independent GWA studies. Although this methodology has been used widely in human genetics, implementation in animal breeding has been limited. Thus, we present methods to implement a MA of GWA, describing the proper approach to compute weights derived from multiple genomic evaluations based on animal-centric GBLUP models. Application to real datasets shows that MA increases power of detection of associations in comparison with population-level GWA, allowing for population structure and heterogeneity of variance components across populations to be accounted for. Another advantage of MA is that it does not require access to genotype data that is required for a joint analysis. Scripts related to the implementation of this approach, which consider the strength of association as well as the sign, are distributed and thus account for heterogeneity in association phase between QTL and SNPs. Thus, MA of GWA is an attractive alternative to summarizing results from multiple genomic studies, avoiding restrictions with genotype data sharing, definition of fixed effects and different scales of measurement of evaluated traits. PMID:26607299
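
    The abstract above describes combining signed association results from independent GWA studies using weights. A minimal sketch of one common way to do this, a weighted Stouffer combination of signed z-scores, is given below. It is not the authors' distributed scripts: the function names and the choice of weights are assumptions made for illustration, and the GBLUP-derived weights described in the paper are not reproduced here.

```python
import math


def combine_signed_z(z_scores, weights):
    """Weighted Stouffer combination of signed z-scores from independent studies.

    The sign of each z-score carries the direction of the SNP effect, so
    effects in opposite directions cancel instead of reinforcing each other.
    The weights are hypothetical inputs (for example, square roots of sample
    sizes); how they are derived from GBLUP models is not shown here.
    """
    if not z_scores or len(z_scores) != len(weights):
        raise ValueError("need one weight per z-score and at least one study")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value for a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

    For example, two equally weighted studies that each report z = 2.0 for the same SNP combine to roughly z = 2.83, whereas z = 2.0 and z = -2.0 combine to zero.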

  1. Genomic analysis and selected molecular pathways in rare cancers

    International Nuclear Information System (INIS)

    It is widely accepted that many cancers arise as a result of an acquired genomic instability and the subsequent evolution of tumor cells with variable patterns of selected and background aberrations. The presence and behaviors of distinct neoplastic cell populations within a patient's tumor may underlie multiple clinical phenotypes in cancers. A goal of many current cancer genome studies is the identification of recurring selected driver events that can be advanced for the development of personalized therapies. Unfortunately, in the majority of rare tumors, this type of analysis can be particularly challenging. Large series of specimens for analysis are simply not available, allowing recurring patterns to remain hidden. In this paper, we highlight the use of DNA content-based flow sorting to identify and isolate DNA-diploid and DNA-aneuploid populations from tumor biopsies as a strategy to comprehensively study the genomic composition and behaviors of individual cancers in a series of rare solid tumors: intrahepatic cholangiocarcinoma, anal carcinoma, adrenal leiomyosarcoma, and pancreatic neuroendocrine tumors. We propose that the identification of highly selected genomic events in distinct tumor populations within each tumor can identify candidate driver events that can facilitate the development of novel, personalized treatment strategies for patients with cancer. (paper)

  2. Analysis of radiation-induced genome alterations in Vigna unguiculata

    Directory of Open Access Journals (Sweden)

    van der Vyver C

    2011-09-01

    Full Text Available Christell van der Vyver1, B Juan Vorster2, Karl J Kunert3, Christopher A Cullis4. 1Institute for Plant Biotechnology, Department of Genetics, University of Stellenbosch, Stellenbosch, South Africa; 2Department of Plant Production and Soil Science, and 3Department of Plant Science, Forestry and Agricultural Biotechnology Institute, University of Pretoria, Pretoria, South Africa; 4Case Western Reserve University, Department of Biology, Cleveland, OH, USA. Abstract: Seeds from an inbred Vigna unguiculata (cowpea) cultivar were gamma-irradiated with a dose of 180 Gy in order to identify and characterize possible mutations. Three techniques, i.e., random amplified polymorphic DNA, microsatellites, and representational difference analysis, were used to characterize possible DNA variation among the mutants and nonirradiated control plants both immediately after irradiation and in subsequent generations. A large portion of putative radiation-induced genome changes had significant similarities to chloroplast sequences. The frequency of mutation at three of these isolated polymorphic regions with chloroplast similarity was further determined by polymerase chain reaction screening using a large number of individual parental, M1, and M2 plants. Analysis of these sequences indicated that the rate at which various regions of the genome are mutated in irradiation experiments differs significantly and also that mutations have variable “repair” rates. Furthermore, regions of the nuclear DNA derived from the chloroplast genome are highly susceptible to modification by radiation treatment. Overall, data have provided detailed information on the effects of gamma irradiation on the cowpea genome and about the ability of the plant to repair these genome changes in subsequent plant generations. Keywords: mutation breeding, gamma radiation, genetic mutations, cowpea, representational difference analysis

  3. Physiological genomics analysis for Alzheimer's disease

    Directory of Open Access Journals (Sweden)

    Viroj Wiwanitkit

    2013-01-01

    Full Text Available Alzheimer's disease is a common kind of dementia. This disorder occurs in all countries around the world. It affects millions of people and has become an important concern in modern neurology. There has been much research on the pathogenesis of Alzheimer's disease. Although the disease has been studied for a long time, it remains unclear whether or not it is a genetic disorder. Physiological genomics is a new approach that is useful for mapping function to genes within the human genome and can be applied to the problem of the underlying pathobiology of complex diseases. Physiogenomics can support a systemic approach to pathophysiology, and genomics might provide useful information to better understand the pathogenesis of Alzheimer's disease. Recent advances in genomic techniques make it possible to trace the underlying genomics of disease. In this work, a physiological genomics analysis for Alzheimer's disease was performed. The standard published technique was used for assessment. According to this work, there are 20 identified physiogenomic relationships on several chromosomes. Considering the results, the HADH2 gene on chromosome X, APBA1 gene on chromosome 9, AGER gene on chromosome 6, GSK3B gene on chromosome 3, CDKHR1 gene on chromosome 17, APPBP1 gene on chromosome 16, APBA2 gene on chromosome 15, GAL gene on chromosome 11, and APLP2 gene on chromosome 11 have the highest physiogenomics score (9.26), while the CASP3 gene on chromosome 4 and the SNCA gene on chromosome 4 have the lowest physiogenomics score (7.44). The results from this study confirm that Alzheimer's disease has a polygenomic origin.

  4. Genome Sequence and Comparative Genomics Analysis of a Vibrio cholerae O1 Strain Isolated from a Cholera Patient in Malaysia

    Science.gov (United States)

    Osama, Abdulrazak; Gan, Han Ming; Teh, Cindy Shuan Ju; Yap, Kien-Pong

    2012-01-01

    The genome sequence analysis of a clinical Vibrio cholerae VC35 strain from an outbreak case in Malaysia indicates multiple genes involved in host adaptation and a novel Na+-driven multidrug efflux pump-coding gene in the genome of Vibrio cholerae with the highest similarity to VMA_001754 of Vibrio mimicus VMA223. PMID:23209200

  5. Genome Sequence and Comparative Genomics Analysis of a Vibrio cholerae O1 Strain Isolated from a Cholera Patient in Malaysia

    OpenAIRE

    Osama, Abdulrazak; Gan, Han Ming; Teh, Cindy Shuan Ju; Yap, Kien-Pong; Thong, Kwai-Lin

    2012-01-01

    The genome sequence analysis of a clinical Vibrio cholerae VC35 strain from an outbreak case in Malaysia indicates multiple genes involved in host adaptation and a novel Na+-driven multidrug efflux pump-coding gene in the genome of Vibrio cholerae with the highest similarity to VMA_001754 of Vibrio mimicus VMA223.

  6. Comparative Genome Analysis of Basidiomycete Fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Morin, Emmanuelle; Nagy, Laszlo; Manning, Gerard; Baker, Scott; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Hibbett, David; Martin, Francis; Grigoriev, Igor

    2012-03-19

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, symbionts, and plant and animal pathogens. To better understand the diversity of phenotypes in basidiomycetes, we performed a comparative analysis of 35 basidiomycete fungi spanning the diversity of the phylum. Phylogenetic patterns of lignocellulose degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay. Patterns of secondary metabolic enzymes give additional insight into the broad array of phenotypes found in the basidiomycetes. We suggest that the profile of an organism in lignocellulose-targeting genes can be used to predict its nutritional mode, and predict Dacryopinax sp. as a brown rot; Botryobasidium botryosum and Jaapia argillacea as white rots.

  7. BIGSdb: Scalable analysis of bacterial genome variation at the population level

    Directory of Open Access Journals (Sweden)

    Maiden Martin CJ

    2010-12-01

    Full Text Available Abstract Background The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. Results The Bacterial Isolate Genome Sequence Database (BIGSDB) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST) data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. LIMS functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSDB are illustrated with the genera Neisseria and Streptococcus. The BIGSDB source code and documentation are available at http://pubmlst.org/software/database/bigsdb/. Conclusions Genomic data can be used to characterise bacterial isolates in many different ways but it can also be efficiently exploited for evolutionary or functional studies. BIGSDB

  8. Pre-Steady-State Kinetic Analysis of Single-Nucleotide Incorporation by DNA Polymerases.

    Science.gov (United States)

    Su, Yan; Peter Guengerich, F

    2016-01-01

    Pre-steady-state kinetic analysis is a powerful and widely used method to obtain multiple kinetic parameters. This protocol provides a step-by-step procedure for pre-steady-state kinetic analysis of single-nucleotide incorporation by a DNA polymerase. It describes the experimental details of DNA substrate annealing, reaction mixture preparation, handling of the RQF-3 rapid quench-flow instrument, denaturing polyacrylamide DNA gel preparation, electrophoresis, quantitation, and data analysis. The core and unique part of this protocol is the rationale for preparation of the reaction mixture (the ratio of the polymerase to the DNA substrate) and methods for conducting pre-steady-state assays on an RQF-3 rapid quench-flow instrument, as well as data interpretation after analysis. In addition, the methods for the DNA substrate annealing and DNA polyacrylamide gel preparation, electrophoresis, quantitation and analysis are suitable for use in other studies. © 2016 by John Wiley & Sons, Inc. PMID:27248785
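
    The abstract does not state the model used in the data analysis step. As a hedged illustration only, pre-steady-state product formation is commonly described by a burst equation, an exponential phase followed by a linear steady-state phase. The sketch below evaluates that form; the function and parameter names are assumptions and are not taken from the protocol.

```python
import math


def burst_product(t, amplitude, k_burst, k_ss):
    """Product formed at time t under a burst model.

    amplitude: burst amplitude (concentration of active enzyme-DNA complex)
    k_burst:   first-order rate constant of the exponential phase (1/time)
    k_ss:      slope of the linear steady-state phase (concentration/time)
    All three names are hypothetical labels chosen for this sketch.
    """
    return amplitude * (1.0 - math.exp(-k_burst * t)) + k_ss * t
```

    In practice the three parameters would be estimated by nonlinear regression against the quantitated gel data; the fitting step is omitted here.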

  9. Tomato Functional Genomics Database: a comprehensive resource and analysis package for tomato functional genomics.

    Science.gov (United States)

    Fei, Zhangjun; Joung, Je-Gun; Tang, Xuemei; Zheng, Yi; Huang, Mingyun; Lee, Je Min; McQuinn, Ryan; Tieman, Denise M; Alba, Rob; Klee, Harry J; Giovannoni, James J

    2011-01-01

    Tomato Functional Genomics Database (TFGD) provides a comprehensive resource to store, query, mine, analyze, visualize and integrate large-scale tomato functional genomics data sets. The database is functionally expanded from the previously described Tomato Expression Database by including metabolite profiles as well as large-scale tomato small RNA (sRNA) data sets. Computational pipelines have been developed to process microarray, metabolite and sRNA data sets archived in the database, respectively, and TFGD provides downloads of all the analyzed results. TFGD is also designed to enable users to easily retrieve biologically important information through a set of efficient query interfaces and analysis tools, including improved array probe annotations as well as tools to identify co-expressed genes, significantly affected biological processes and biochemical pathways from gene expression data sets and miRNA targets, and to integrate transcript and metabolite profiles, and sRNA and mRNA sequences. The suite of tools and interfaces in TFGD allow intelligent data mining of recently released and continually expanding large-scale tomato functional genomics data sets. TFGD is available at http://ted.bti.cornell.edu. PMID:20965973

  10. A Genomic Analysis of Rat Proteases and Protease Inhibitors

    OpenAIRE

    Puente, Xose S.; López-Otín, Carlos

    2004-01-01

    Proteases perform important roles in multiple biological and pathological processes. The availability of the rat genome sequence has facilitated the analysis of the complete protease repertoire or degradome of this model organism. The rat degradome consists of at least 626 proteases and homologs, which are distributed into 24 aspartic, 160 cysteine, 192 metallo, 221 serine, and 29 threonine proteases. This distribution is similar to that of the mouse degradome but is more complex than that of...

  11. GATB: a software toolbox for genome assembly and analysis

    OpenAIRE

    Drezen, Erwan; Rizk, Guillaume; Chikhi, Rayan; Deltel, Charles; Lemaitre, Claire; Peterlongo, Pierre; Lavenier, Dominique

    2014-01-01

    The analysis of NGS data remains a time- and space-consuming task. Many efforts have been made to provide efficient data structures for indexing the terabytes of data generated by the fast sequencing machines (Suffix Array, Burrows-Wheeler transform, Bloom Filter, etc.). Mapper tools, genome assemblers, SNP callers, etc., make intensive use of these data structures to keep their memory footprint as low as possible. The overall efficiency of NGS software is brought...
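
    Of the data structures named in the abstract, the Bloom filter is the simplest to show in a few lines. The sketch below is a generic k-mer Bloom filter written for illustration. It does not use or imitate the GATB API, and the class name, bit-array size and hash construction are all assumptions.

```python
import hashlib


class KmerBloomFilter:
    """Fixed-size Bloom filter for DNA k-mers.

    Membership tests may return false positives but never false negatives,
    which is what lets the structure trade accuracy for memory.
    """

    def __init__(self, num_bits, num_hashes):
        if num_bits <= 0 or num_hashes <= 0:
            raise ValueError("num_bits and num_hashes must be positive")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer):
        # Derive num_hashes bit positions by salting one hash function
        # with a different seed each time (an illustrative choice).
        data = kmer.encode("ascii")
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(kmer)
        )


def iter_kmers(sequence, k):
    """Yield every overlapping substring of length k."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]
```

    A read set would be indexed by calling add() on each k-mer from iter_kmers(); the memory cost is fixed by num_bits regardless of how many k-mers are inserted, at the price of a false-positive rate that grows as the filter fills.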

  12. SIDEKICK: Genomic data driven analysis and decision-making framework

    Directory of Open Access Journals (Sweden)

    Yoon Kihoon

    2010-12-01

    Full Text Available Abstract Background Scientists striving to unlock mysteries within complex biological systems face myriad barriers in effectively integrating available information to enhance their understanding. While experimental techniques and available data sources are rapidly evolving, useful information is dispersed across a variety of sources, and sources of the same information often do not use the same format or nomenclature. To harness these expanding resources, scientists need tools that bridge nomenclature differences and allow them to integrate, organize, and evaluate the quality of information without extensive computation. Results Sidekick, a genomic data driven analysis and decision making framework, is a web-based tool that provides a user-friendly intuitive solution to the problem of information inaccessibility. Sidekick enables scientists without training in computation and data management to pursue answers to research questions like "What are the mechanisms for disease X" or "Does the set of genes associated with disease X also influence other diseases." Sidekick enables the process of combining heterogeneous data, finding and maintaining the most up-to-date data, evaluating data sources, quantifying confidence in results based on evidence, and managing the multi-step research tasks needed to answer these questions. We demonstrate Sidekick's effectiveness by showing how to accomplish a complex published analysis in a fraction of the original time with no computational effort using Sidekick. Conclusions Sidekick is an easy-to-use web-based tool that organizes and facilitates complex genomic research, allowing scientists to explore genomic relationships and formulate hypotheses without computational effort. Possible analysis steps include gene list discovery, gene-pair list discovery, various enrichments for both types of lists, and convenient list manipulation. Further, Sidekick's ability to characterize pairs of genes offers new ways to

  13. JBrowse: a dynamic web platform for genome visualization and analysis

    OpenAIRE

    Buels, Robert; Yao, Eric; Diesh, Colin M.; Hayes, Richard D; Munoz-Torres, Monica; Helt, Gregg; Goodstein, David M.; Christine G. Elsik; Lewis, Suzanna E.; Stein, Lincoln; Holmes, Ian H.

    2016-01-01

    Background JBrowse is a fast and full-featured genome browser built with JavaScript and HTML5. It is easily embedded into websites or apps but can also be served as a standalone web page. Results Overall improvements to speed and scalability are accompanied by specific enhancements that support complex interactive queries on large track sets. Analysis functions can readily be added using the plugin framework; most visual aspects of tracks can also be customized, along with clicks, mouseovers,...

  14. Genome Sequencing and Analysis of BCG Vaccine Strains

    OpenAIRE

    Zhang, Wen; Zhang, Yuanyuan; Zheng, Huajun; Pan, Yuanlong; Liu, Haican; Du, Pengcheng; Wan, Li; LIU Jun; Zhu, Baoli; Zhao, Guoping; Chen, Chen; Wan, Kanglin

    2013-01-01

    Background Although the Bacillus Calmette-Guérin (BCG) vaccine against tuberculosis (TB) has been available for more than 75 years, one third of the world's population is still infected with Mycobacterium tuberculosis and approximately 2 million people die of TB every year. To reduce this immense TB burden, a clearer understanding of the functional genes underlying the action of BCG and the development of new vaccines are urgently needed. Methods and Findings Comparative genomic analysis of 1...

  15. Analysis of Chimpanzee History Based on Genome Sequence Alignments

    OpenAIRE

    Caswell, Jennifer L.; Richter, Daniel J.; Neubauer, Julie; Schirmer, Christine; Gnerre, Sante; Mallick, Swapan; Reich, David Emil

    2008-01-01

    Population geneticists often study small numbers of carefully chosen loci, but it has become possible to obtain orders of magnitude more data from overlaps of genome sequences. Here, we generate tens of millions of base pairs of multiple sequence alignments from combinations of three western chimpanzees, three central chimpanzees, an eastern chimpanzee, a bonobo, a human, an orangutan, and a macaque. Analysis provides a more precise understanding of demographic history than was previously...

  16. Sequence analysis and editing for bisulphite genomic sequencing projects

    OpenAIRE

    Carr, IM; Valleley, EMA; Cordery, SF; Markham, AF; Bonthron, DT

    2007-01-01

    Bisulphite genomic sequencing is a widely used technique for detailed analysis of the methylation status of a region of DNA. It relies upon the selective deamination of unmethylated cytosine to uracil after treatment with sodium bisulphite, usually followed by PCR amplification of the chosen target region. Since this two-step procedure replaces all unmethylated cytosine bases with thymine, PCR products derived from unmethylated templates contain only three types of nucleotide, in unequal prop...
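
    The conversion rule described above (unmethylated cytosine reads as thymine after bisulphite treatment and PCR, methylated cytosine stays cytosine) can be sketched in a few lines. This is an illustration of the chemistry as stated in the abstract, not the software the record describes. The function names are assumptions, and only the top strand is modelled.

```python
def bisulphite_convert(sequence, methylated_positions=()):
    """Return the top-strand sequence read after bisulphite treatment and PCR.

    Unmethylated C becomes T; C at any 0-based index listed in
    methylated_positions is protected and stays C.
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def methylation_calls(reference, converted_read):
    """Compare an aligned read with the untreated reference, position by position.

    Returns {index: True} where a reference C is still read as C (methylated)
    and {index: False} where it is read as T (unmethylated). Any other
    observed base at a reference C is left uncalled.
    """
    calls = {}
    pairs = zip(reference.upper(), converted_read.upper())
    for index, (ref_base, read_base) in enumerate(pairs):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[index] = True
        elif read_base == "T":
            calls[index] = False
    return calls
```

    For instance, converting ACGTCC with only position 1 methylated yields ACGTTT, and comparing that read back against the reference recovers the methylated call at position 1 and unmethylated calls at positions 4 and 5.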

  17. The sequence and analysis of a Chinese pig genome

    Directory of Open Access Journals (Sweden)

    Fang Xiaodong

    2012-11-01

    Full Text Available Abstract Background The pig is an economically important food source, amounting to approximately 40% of all meat consumed worldwide. Pigs also serve as an important model organism because of their similarity to humans at the anatomical, physiological and genetic level, making them very useful for studying a variety of human diseases. A pig strain of particular interest is the miniature pig, specifically the Wuzhishan pig (WZSP), as it has been extensively inbred. Its high level of homozygosity offers increased ease for selective breeding for specific traits and a more straightforward understanding of the genetic changes that underlie its biological characteristics. WZSP also serves as a promising means for applications in surgery, tissue engineering, and xenotransplantation. Here, we report the sequencing and analysis of an inbred WZSP genome. Results Our results reveal some unique genomic features, including a relatively high level of homozygosity in the diploid genome, an unusual distribution of heterozygosity, an over-representation of tRNA-derived transposable elements, a small amount of porcine endogenous retrovirus, and a lack of type C retroviruses. In addition, we carried out systematic research on gene evolution, together with a detailed investigation of the counterparts of human drug target genes. Conclusion Our results provide the opportunity to more clearly define the genomic character of pig, which could enhance our ability to create more useful pig models.

  18. Benchmarking undedicated cloud computing providers for analysis of genomic datasets.

    Science.gov (United States)

    Yazar, Seyhan; Gooden, George E C; Mackey, David A; Hewitt, Alex W

    2014-01-01

    A major bottleneck in biological discovery is now emerging at the computational level. Cloud computing offers a dynamic means whereby small and medium-sized laboratories can rapidly adjust their computational capacity. We benchmarked two established cloud computing services, Amazon Web Services Elastic MapReduce (EMR) on Amazon EC2 instances and Google Compute Engine (GCE), using publicly available genomic datasets (E.coli CC102 strain and a Han Chinese male genome) and a standard bioinformatic pipeline on a Hadoop-based platform. Wall-clock time for complete assembly differed by 52.9% (95% CI: 27.5-78.2) for E.coli and 53.5% (95% CI: 34.4-72.6) for human genome, with GCE being more efficient than EMR. The cost of running this experiment on EMR and GCE differed significantly, with the costs on EMR being 257.3% (95% CI: 211.5-303.1) and 173.9% (95% CI: 134.6-213.1) more expensive for E.coli and human assemblies respectively. Thus, GCE was found to outperform EMR both in terms of cost and wall-clock time. Our findings confirm that cloud computing is an efficient and potentially cost-effective alternative for analysis of large genomic datasets. In addition to releasing our cost-effectiveness comparison, we present available ready-to-use scripts for establishing Hadoop instances with Ganglia monitoring on EC2 or GCE. PMID:25247298

  19. Privacy-preserving GWAS analysis on federated genomic datasets

    Science.gov (United States)

    2015-01-01

    Background The biomedical community benefits from the increasing availability of genomic data to support meaningful scientific research, e.g., Genome-Wide Association Studies (GWAS). However, high quality GWAS usually requires a large amount of samples, which can grow beyond the capability of a single institution. Federated genomic data analysis holds the promise of enabling cross-institution collaboration for effective GWAS, but it raises concerns about patient privacy and medical information confidentiality (as data are being exchanged across institutional boundaries), which becomes an inhibiting factor for the practical use. Methods We present a privacy-preserving GWAS framework on federated genomic datasets. Our method is to layer the GWAS computations on top of secure multi-party computation (MPC) systems. This approach allows two parties in a distributed system to mutually perform secure GWAS computations, but without exposing their private data outside. Results We demonstrate our technique by implementing a framework for minor allele frequency counting and χ2 statistics calculation, one of typical computations used in GWAS. For efficient prototyping, we use a state-of-the-art MPC framework, i.e., Portable Circuit Format (PCF) [1]. Our experimental results show promise in realizing both efficient and secure cross-institution GWAS computations. PMID:26733045
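
    To make the two computations named above concrete, the sketch below pools allele counts from two parties with additive secret sharing and then computes the minor allele frequency and a 2x2 Pearson χ2 statistic from the pooled counts. This is a deliberately simplified stand-in: it is not the Portable Circuit Format framework used by the authors, the modulus and all function names are assumptions, and in a two-party sum each party can still infer the other's input from the revealed total.

```python
import secrets

# Assumption: any modulus larger than the largest possible pooled count works.
MODULUS = 2**61 - 1


def share(value, modulus=MODULUS):
    """Split a non-negative integer into two additive shares modulo `modulus`."""
    random_share = secrets.randbelow(modulus)
    return random_share, (value - random_share) % modulus


def secure_sum(count_a, count_b):
    """Pool two private counts so that only their total is reconstructed."""
    a_keep, a_send = share(count_a)   # party A keeps one share, sends the other
    b_keep, b_send = share(count_b)   # party B does the same
    partial_a = (a_keep + b_send) % MODULUS   # computed locally by party A
    partial_b = (b_keep + a_send) % MODULUS   # computed locally by party B
    return (partial_a + partial_b) % MODULUS


def minor_allele_frequency(allele1_count, allele2_count):
    """Frequency of the rarer of two alleles."""
    total = allele1_count + allele2_count
    if total == 0:
        raise ValueError("no alleles counted")
    return min(allele1_count, allele2_count) / total


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows could be cases and controls, columns the two alleles.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator
```

    Each cell of the case/control by allele table would be pooled with secure_sum() before the statistic is computed, so neither institution has to exchange its per-sample genotype data.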

  20. Synonymous Codon Usage Analysis of Thirty Two Mycobacteriophage Genomes

    Directory of Open Access Journals (Sweden)

    Sameer Hassan

    2009-01-01

    Full Text Available Synonymous codon usage of protein coding genes of thirty two completely sequenced mycobacteriophage genomes was studied using multivariate statistical analysis. One of the major factors influencing codon usage is identified to be compositional bias. Codons ending with either C or G are preferred in highly expressed genes, among which C ending codons are highly preferred over G ending codons. A strong negative correlation between effective number of codons (Nc) and GC3s content was also observed, showing that the codon usage was affected by gene nucleotide composition. Translational selection is also identified to play a role in shaping the codon usage operative at the level of translational accuracy. High level of heterogeneity is seen among and between the genomes. Length of genes is also identified to influence the codon usage in 11 out of 32 phage genomes. Mycobacteriophage Cooper is identified to be the highly biased genome with better translation efficiency comparing well with the host specific tRNA genes.
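
    GC3s, one of the two quantities correlated above, is conventionally the fraction of G or C at the third position of synonymously variable codons. A minimal sketch follows, assuming the standard genetic code and the usual convention of excluding Met, Trp and stop codons; the function name is an assumption and the calculation of Nc is not shown.

```python
# Codons with no synonymous alternative (Met, Trp) and the three stop codons
# are excluded, following the usual GC3s convention under the standard code.
EXCLUDED_CODONS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(coding_sequence):
    """Fraction of counted codons whose third base is G or C."""
    seq = coding_sequence.upper().replace("U", "T")
    usable_length = len(seq) - len(seq) % 3   # ignore a trailing partial codon
    gc_count = 0
    codon_count = 0
    for start in range(0, usable_length, 3):
        codon = seq[start:start + 3]
        if codon in EXCLUDED_CODONS:
            continue
        if any(base not in "ACGT" for base in codon):
            continue   # skip codons containing ambiguous bases
        codon_count += 1
        if codon[2] in "GC":
            gc_count += 1
    if codon_count == 0:
        raise ValueError("no synonymous codons to count")
    return gc_count / codon_count
```

    For the toy sequence ATGGCCGCAAAGTAA, the start and stop codons are skipped and two of the remaining three codons (GCC, AAG) end in G or C, giving a GC3s of 2/3.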

  1. Pan-Genome Analysis of Brazilian Lineage A Amoebal Mimiviruses

    Directory of Open Access Journals (Sweden)

    Felipe L. Assis

    2015-06-01

    Full Text Available Since the recent discovery of Samba virus, the first representative of the family Mimiviridae from Brazil, prospecting for mimiviruses has been conducted in different environmental conditions in Brazil. Recently, we isolated using Acanthamoeba sp. three new mimiviruses, all of lineage A of amoebal mimiviruses: Kroon virus from urban lake water; Amazonia virus from the Brazilian Amazon river; and Oyster virus from farmed oysters. The aims of this work were to sequence and analyze the genome of these new Brazilian mimiviruses (mimi-BR) and update the analysis of the Samba virus genome. The genomes of Samba virus, Amazonia virus and Oyster virus were 97%–99% similar, whereas Kroon virus had a low similarity (90%–91%) with other mimi-BR. A total of 3877 proteins encoded by mimi-BR were grouped into 974 orthologous clusters. In addition, we identified three new ORFans in the Kroon virus genome. Additional work is needed to expand our knowledge of the diversity of mimiviruses from Brazil, including if and why among amoebal mimiviruses those of lineage A predominate in the Brazilian environment.

  2. Benchmarking undedicated cloud computing providers for analysis of genomic datasets.

    Directory of Open Access Journals (Sweden)

    Seyhan Yazar

    Full Text Available A major bottleneck in biological discovery is now emerging at the computational level. Cloud computing offers a dynamic means whereby small and medium-sized laboratories can rapidly adjust their computational capacity. We benchmarked two established cloud computing services, Amazon Web Services Elastic MapReduce (EMR) on Amazon EC2 instances and Google Compute Engine (GCE), using publicly available genomic datasets (E.coli CC102 strain and a Han Chinese male genome) and a standard bioinformatic pipeline on a Hadoop-based platform. Wall-clock time for complete assembly differed by 52.9% (95% CI: 27.5-78.2) for E.coli and 53.5% (95% CI: 34.4-72.6) for human genome, with GCE being more efficient than EMR. The cost of running this experiment on EMR and GCE differed significantly, with the costs on EMR being 257.3% (95% CI: 211.5-303.1) and 173.9% (95% CI: 134.6-213.1) more expensive for E.coli and human assemblies respectively. Thus, GCE was found to outperform EMR both in terms of cost and wall-clock time. Our findings confirm that cloud computing is an efficient and potentially cost-effective alternative for analysis of large genomic datasets. In addition to releasing our cost-effectiveness comparison, we present available ready-to-use scripts for establishing Hadoop instances with Ganglia monitoring on EC2 or GCE.

  3. Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics

    Directory of Open Access Journals (Sweden)

    Brunham Robert C

    2004-07-01

    Full Text Available Abstract Background We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
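    For reference, the standard two-parameter Weibull distribution from reliability theory can be written as below. This is a sketch of the textbook form only; the symbols x (the measured quantity) and η (the scale parameter) are assumptions introduced here, and the abstract does not state which parameterization the authors fitted. Only the shape parameter β is named in the abstract.

```latex
% Two-parameter Weibull distribution (textbook form; x and \eta are
% illustrative symbols, \beta is the shape parameter named in the abstract).
\begin{align}
  F(x) &= 1 - \exp\!\left[-\left(\frac{x}{\eta}\right)^{\beta}\right],
         \qquad x \ge 0,\ \eta > 0,\ \beta > 0, \\
  R(x) &= 1 - F(x) = \exp\!\left[-\left(\frac{x}{\eta}\right)^{\beta}\right], \\
  h(x) &= \frac{f(x)}{R(x)}
        = \frac{\beta}{\eta}\left(\frac{x}{\eta}\right)^{\beta - 1}.
\end{align}
```

    In this form the hazard h(x) decreases with x when β < 1, is constant when β = 1 (the exponential case), and increases when β > 1, which is why β is the parameter usually compared between systems.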

  4. Genome sequencing and analysis of BCG vaccine strains.

    Directory of Open Access Journals (Sweden)

    Wen Zhang

    Full Text Available BACKGROUND: Although the Bacillus Calmette-Guérin (BCG) vaccine against tuberculosis (TB) has been available for more than 75 years, one third of the world's population is still infected with Mycobacterium tuberculosis and approximately 2 million people die of TB every year. To reduce this immense TB burden, a clearer understanding of the functional genes underlying the action of BCG and the development of new vaccines are urgently needed. METHODS AND FINDINGS: Comparative genomic analysis of 19 M. tuberculosis complex strains showed that BCG strains underwent repeated human manipulation, had higher region of deletion rates than those of natural M. tuberculosis strains, and lost several essential components such as T-cell epitopes. A total of 188 BCG strain T-cell epitopes were lost to various degrees. The non-virulent BCG Tokyo strain, which has the largest number of T-cell epitopes (359), lost 124. Here we propose that BCG strain protection variability results from different epitopes. This study is the first to present BCG as a model organism for genetics research. BCG strains have a very well-documented history and now detailed genome information. Genome comparison revealed the selection process of BCG strains under human manipulation (1908-1966). CONCLUSIONS: Our results revealed the cause of BCG vaccine strain protection variability at the genome level and supported the hypothesis that the restoration of lost BCG Tokyo epitopes is a useful future vaccine development strategy. Furthermore, these detailed BCG vaccine genome investigation results will be useful in microbial genetics, microbial engineering and other research fields.

  5. MultiMetEval: comparative and multi-objective analysis of genome-scale metabolic models.

    Directory of Open Access Journals (Sweden)

    Piotr Zakrzewski

    Full Text Available Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the context of multiple cellular objectives. Here, we present the user-friendly software framework Multi-Metabolic Evaluator (MultiMetEval), built upon SurreyFBA, which allows the user to compose collections of metabolic models that together can be subjected to flux balance analysis. Additionally, MultiMetEval implements functionalities for multi-objective analysis by calculating the Pareto front between two cellular objectives. Using a previously generated dataset of 38 actinobacterial genome-scale metabolic models, we show how these approaches can lead to exciting novel insights. Firstly, after incorporating several pathways for the biosynthesis of natural products into each of these models, comparative flux balance analysis predicted that species like Streptomyces that harbour the highest diversity of secondary metabolite biosynthetic gene clusters in their genomes do not necessarily have the metabolic network topology most suitable for compound overproduction. Secondly, multi-objective analysis of biomass production and natural product biosynthesis in these actinobacteria shows that the well-studied occurrence of discrete metabolic switches during the change of cellular objectives is inherent to their metabolic network architecture. Comparative and multi-objective modelling can lead to insights that could not be obtained by normal flux balance analyses. MultiMetEval provides a powerful platform that makes these analyses straightforward for biologists. Sources and binaries of MultiMetEval are freely available from https://github.com/PiotrZakrzewski/MetEval/downloads.
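    To illustrate what a Pareto front between two cellular objectives means, here is a minimal sketch in Python. It is not MultiMetEval's implementation and does not call SurreyFBA; the function name `pareto_front` and the sample flux values are hypothetical, and both objectives are assumed to be maximized.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (objective 1, objective 2), e.g. (biomass, product)


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points when both objectives are maximized."""
    # Sort by the first objective (descending); ties are broken by the
    # second objective (descending) so the best tied point comes first.
    ordered = sorted(set(points), key=lambda p: (-p[0], -p[1]))
    front: List[Point] = []
    best_second = float("-inf")
    for p in ordered:
        # Every point already visited has an equal or better first objective,
        # so p is non-dominated only if it strictly improves the second one.
        if p[1] > best_second:
            front.append(p)
            best_second = p[1]
    return front


# Hypothetical (biomass flux, product flux) pairs, invented for illustration.
candidates = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (2.0, 2.0)]
front = pareto_front(candidates)
print(front)
```

    Points such as (1.5, 3.0) are dropped because another candidate, (2.0, 4.0), is at least as good in both objectives. In a real flux balance setting each candidate would come from solving the model under a different weighting or constraint on the two objectives.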

  6. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Directory of Open Access Journals (Sweden)

    Wang Jintu

    2011-04-01

    Full Text Available Abstract Background Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on the zebrafish genome, which demonstrated the high similarity between zebrafish and common carp genomes, indicating the feasibility of comparative mapping between zebrafish and common carp once we have a physical map of common carp. Conclusion BAC end sequences are great resources for the first genome wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3
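    The coverage figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the abstract; the variable names are illustrative, and the implied genome size is a back-calculation from the rounded 2.5% figure, not a value reported by the authors.

```python
# Figures quoted in the abstract.
clean_bes = 65_720            # clean BAC end sequences
mean_read_bp = 647            # average read length after processing (rounded)
reported_total_bp = 42_522_168
reported_fraction = 0.025     # "2.5% of the common carp genome"

# Product of read count and (rounded) mean length; it differs slightly from
# the reported total because the mean is rounded to a whole base pair.
approx_total_bp = clean_bes * mean_read_bp

# Genome size implied by the reported total and the rounded 2.5% figure.
implied_genome_bp = reported_total_bp / reported_fraction

print(approx_total_bp)                    # 42520840
print(round(implied_genome_bp / 1e9, 2))  # 1.7 (Gb)
```

    The product agrees with the reported total to within about 1,300 bp, and the implied genome size is roughly 1.7 Gb.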

  7. Genomic cluster and network analysis for predictive screening for hepatotoxicity.

    Science.gov (United States)

    Fukushima, Tamio; Kikkawa, Rie; Hamada, Yoshimasa; Horii, Ikuo

    2006-12-01

    The present study was undertaken to estimate the usefulness of genomic approaches to predict hepatotoxicity. Male rats were treated with acetaminophen (APAP), carbon tetrachloride (CCL), amiodarone (AD) or tetracycline (TC) at toxic doses. Their livers were extracted 6 or 24 hr after the dosings and were used for subsequent examinations. At 6 hr there were no histological changes noted in any of the groups except for the CCL group, but at 24 hr, such changes were noted in all but the AD group. Regarding genomic analysis, we performed hierarchical cluster analysis using S-plus software. The individual microarray data were clearly classified into 5 treatment-related clusters at 24 hr as well as at 6 hr, even though no morphological changes were noted at 6 hr. In the gene expression analysis using GeneSpring, transcription factor and oxidative stress- and lipid metabolism-related genes were markedly affected in all treatment groups at both time points when compared with the corresponding control values. Finally, we investigated gene networks in the above-affected genes by using Ingenuity Pathway Analysis software. Down-regulation of lipid metabolism-related genes regulated by SREBP1 was observed in all treatment groups at both time points, and up-regulation of oxidative stress-related genes regulated by Nrf2 was observed in the APAP and CCL treatment groups. From the above findings, for the application of genomic approaches to predict hepatotoxicity, we considered that cluster analysis for classification and early prediction of hepatotoxicity and network analysis for investigation of toxicological biomarkers would be useful. PMID:17202758

  8. Genome-wide Analysis of Ovate Family Proteins in Arabidopsis

    Institute of Scientific and Technical Information of China (English)

    Huang Jian-ping; Li Hong-ling; Chang Ying

    2012-01-01

    Arabidopsis thaliana ovate family proteins (AtOFPs) are a newly found plant-specific protein family interacting with TALE (3-aa loop extension) homeodomain proteins in Arabidopsis. Here, based on bioinformatic analysis, we found that the Arabidopsis genome actually encodes 17 OVATE domain-containing proteins. One of them, AtOFP19, has not been previously identified. Based on their amino acid sequence similarity, AtOFPs can be divided into two groups. Most of the AtOFPs were located in the nucleus, four of them were present in the chloroplast and the remaining two members appeared in the cytoplasm. A genome-wide microarray-based gene expression analysis involving 47 stages of vegetative and reproductive development revealed that AtOFPs have diverse expression patterns. Investigation of protein interactions showed that nine AtOFPs interacted only with TALE homeodomain proteins, which are fundamental regulators of plant meristem function and leaf development. Our work could provide important leads toward functional genomics studies of ovate family proteins, which may be involved in a previously unrecognized control mechanism in plant development

  9. Analysis of the ABCA4 genomic locus in Stargardt disease

    DEFF Research Database (Denmark)

    Zernant, Jana; Xie, Yajing Angela; Ayuso, Carmen;

    2014-01-01

    was designed to find the missing disease-causing ABCA4 variation by a combination of next-generation sequencing (NGS), array-Comparative Genome Hybridization (aCGH) screening, familial segregation and in silico analyses. The entire 140 kb ABCA4 genomic locus was sequenced in 114 STGD patients with one...... once. Multimodal analysis suggested 12 new likely pathogenic intronic ABCA4 variants, some of which were specific to (isolated) ethnic groups. No copy number variation (large deletions and insertions) was detected in any patient suggesting that it is a very rare event in the ABCA4 locus. Many variants...... were excluded since they were not conserved in non-human primates, were frequent in African populations and, therefore, represented ancestral, and not disease-associated, variants. The sequence variability in the ABCA4 locus is extensive and the non-coding sequences do not harbor frequent mutations in...

  10. Nanopatterned structures for biomolecular analysis toward genomic and proteomic applications

    Science.gov (United States)

    Chou, Chia-Fu; Gu, Jian; Wei, Qihuo; Liu, Yingjie; Gupta, Ravi; Nishio, Takeyoshi; Zenhausern, Frederic

    2005-01-01

    We report our fabrication of nanoscale devices using electron beam and nanoimprint lithography (NIL). We focus our study in the emerging fields of NIL, nanophotonics and nanobiotechnology and give a few examples as to how these nanodevices may be applied toward genomic and proteomic applications for molecular analysis. The examples include reverse NIL-fabricated nanofluidic channels for DNA stretching, nanoscale molecular traps constructed from dielectric constrictions for DNA or protein focusing by dielectrophoresis, multi-layer nanoburger and nanoburger multiplets for optimized surface-plasma enhanced Raman scattering for protein detection, and biomolecular motor-based nanosystems. The development of advanced nanopatterning techniques promises reliable and high-throughput manufacturing of nanodevices which could impact significantly on the areas of genomics, proteomics, drug discovery and molecular clinical diagnostics.

  11. Comparative analysis of cytogenetic manifestations of human genome instability

    International Nuclear Information System (INIS)

    A comparative analysis of cytogenetic manifestations of human genome instability was carried out. The parameters studied are the micronuclei rate (MNR), the level of single and double chromosome fragments and the level of premature chromatid division (PCD). PCD and chromosome fragments were chosen as anomalies that possibly result in MN formation. We analysed the MNR in buccal epithelium (BE) and peripheral blood lymphocytes (PBL), and the level of single and double chromosome fragments as well as the level of PCD in PBL only. Average MNR in BE was higher than in PBL. The parameters studied are independent and have to be considered together for a more comprehensive evaluation of the level and peculiarities of manifestation of human genome instability

  12. Cancer Genome Atlas Pan-cancer Analysis Project

    Directory of Open Access Journals (Sweden)

    Kun ZHANG

    2015-04-01

    Full Text Available Cancer takes different forms depending on the site of origin, the cell type, and the kinds of genetic mutation involved, all of which also affect the response to therapy. Although many genetic changes have been shown to alter phenotype directly, the complex molecular mechanisms of many cancer lineages are still not fully elucidated. The Cancer Genome Atlas (TCGA) Research Network therefore analyzed a large number of human tumors to find molecular changes at the DNA, RNA, protein and epigenetic levels. The resulting wealth of data provides an opportunity to describe what is shared and what is distinctive across cancer lineages as a whole, and to generate new ideas. The Pan-cancer genome program first compares 12 cancer types. Analysis of molecular changes and their functions in different tumors will show how an effective treatment can be applied to tumors with a similar phenotype.

  13. [Cancer Genome Atlas Pan-cancer Analysis Project].

    Science.gov (United States)

    Zhang, Kun; Wang, Hong

    2015-04-01

    Cancer takes different forms depending on the site of origin, the cell type, and the kinds of genetic mutation involved, all of which also affect the response to therapy. Although many genetic changes have been shown to alter phenotype directly, the complex molecular mechanisms of many cancer lineages are still not fully elucidated. The Cancer Genome Atlas (TCGA) Research Network therefore analyzed a large number of human tumors to find molecular changes at the DNA, RNA, protein and epigenetic levels. The resulting wealth of data provides an opportunity to describe what is shared and what is distinctive across cancer lineages as a whole, and to generate new ideas. The Pan-cancer genome program first compares 12 cancer types. Analysis of molecular changes and their functions in different tumors will show how an effective treatment can be applied to tumors with a similar phenotype. PMID:25936886

  14. Genome analysis of enterovirus 71 strains differing in mouse pathogenicity.

    Science.gov (United States)

    Li, Peng; Yue, Yingying; Song, Nannan; Li, Bingqing; Meng, Hong; Yang, Guiwen; Li, Zhihui; An, Liguo; Qin, Lizeng

    2016-04-01

    Enterovirus 71 (EV71) is a major causative agent of hand, foot, and mouth disease (HFMD) and is occasionally associated with severe neurological diseases. The investigation of virulence determinants of EV71 is rudimentary. Therefore, it is important to understand the relationship between EV71 virulence and genomic information. In this study, a series of full-length genomic sequence analyses were performed on six EV71 strains isolated from HFMD patients with either severe or mild clinical symptoms. A one-day-old BALB/c mouse model was used to study the infection characteristics. Results showed all six strains were of the subgenogroup C4a. Viral full-length genomic sequence analysis revealed a total of 40 nucleotide differences between strains of high and low virulence. Of these, three nucleotide mutations were found in the untranslated region. One mutation, nt115, at the internal ribosome entry site (IRES) caused a change in RNA secondary structure. The other 37 mutations were all located in the open reading frame, resulting in 8 amino acid mutations. Importantly, we discovered that an amino acid mutation (Asn1617 → Asp1617) in the 3C proteinase (3C(pro)) between highly and low pathogenic strains could lead to conformational change at the active center, suggesting that this site may be a virulence determinant of EV71. PMID:26781949

  15. Bioinformatics analysis of rabbit haemorrhagic disease virus genome

    Directory of Open Access Journals (Sweden)

    Liu Ji-xing

    2011-11-01

    Full Text Available Abstract Background Rabbit haemorrhagic disease virus (RHDV), the pathogen of rabbit haemorrhagic disease, causes a highly infectious and often fatal disease that affects only wild and domestic rabbits. Recent research revealed that, as a member of the Caliciviridae, it has some peculiarities in its genome, its reproduction and so on. Results In this report, we first analyzed its genome and two open reading frames (ORFs) from the aspect of codon usage bias. Our results indicated that mutation pressure rather than natural selection is the most important determinant in RHDV with high codon bias, and that the codon usage bias is nearly opposite between ORF1 and ORF2, which may be one of the factors regulating the expression of VP60 (encoded by ORF1) and VP10 (encoded by ORF2). Furthermore, negative selective constraints on the RHDV whole genome implied that VP10 played an important role in the RHDV lifecycle. Conclusions We conjectured that VP10 might be beneficial for the replication, release or both of the virus by inducing apoptosis of infected cells initiated by RHDV. According to the results of the principal component analysis of RSCU for ORF2, we for the first time separated 30 RHDV into two genotypes, and the ENC values indicated that ORF1 and ORF2 were independent during the evolution of RHDV.
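    The RSCU (relative synonymous codon usage) statistic mentioned above is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch under that standard definition, using the standard genetic code; the function name `rscu` and the toy sequence are illustrative and are not taken from the paper.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq: str) -> dict:
    """Relative synonymous codon usage for an in-frame coding sequence."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons into synonymous families (stop codons excluded).
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue                          # amino acid absent from seq
        expected = total / len(codons)        # equal-usage expectation
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy example: four alanine codons (GCT, GCT, GCC, GCA).
result = rscu("GCTGCTGCCGCA")
print(result["GCT"], result["GCC"], result["GCA"], result["GCG"])
```

    An RSCU of 1.0 means a codon is used as often as expected under equal usage; values above or below 1.0 indicate over- or under-representation. Vectors of RSCU values per sequence are the kind of input a principal component analysis such as the one described would take.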

  16. Quantifying element incorporation in multispecies biofilms using nanoscale secondary ion mass spectrometry image analysis.

    Science.gov (United States)

    Renslow, Ryan S; Lindemann, Stephen R; Cole, Jessica K; Zhu, Zihua; Anderton, Christopher R

    2016-06-01

    Elucidating nutrient exchange in microbial communities is an important step in understanding the relationships between microbial systems and global biogeochemical cycles, but these communities are complex and the interspecies interactions that occur within them are not well understood. Phototrophic consortia are useful and relevant experimental systems to investigate such interactions as they are not only prevalent in the environment, but some are cultivable in vitro and amenable to controlled scientific experimentation. Nanoscale secondary ion mass spectrometry (NanoSIMS) is a powerful, high spatial resolution tool capable of visualizing the metabolic activities of single cells within a biofilm, but quantitative analysis of the resulting data has typically been a manual process, resulting in a task that is both laborious and susceptible to human error. Here, the authors describe the creation and application of a semiautomated image-processing pipeline that can analyze NanoSIMS-generated data, applied to phototrophic biofilms as an example. The tool employs an image analysis process, which includes both elemental and morphological segmentation, producing a final segmented image that allows for discrimination between autotrophic and heterotrophic biomass, the detection of individual cyanobacterial filaments and heterotrophic cells, the quantification of isotopic incorporation of individual heterotrophic cells, and calculation of relevant population statistics. The authors demonstrate the functionality of the tool by using it to analyze the uptake of (15)N provided as either nitrate or ammonium through the unicyanobacterial consortium UCC-O and imaged via NanoSIMS. The authors found that the degree of (15)N incorporation by individual cells was highly variable when labeled with (15)NH4 (+), but much more even when biofilms were labeled with (15)NO3 (-). 
In the (15)NH4 (+)-amended biofilms, the heterotrophic distribution of (15)N incorporation was highly skewed, with

  17. Quantifying element incorporation in multispecies biofilms using nanoscale secondary ion mass spectrometry image analysis

    Energy Technology Data Exchange (ETDEWEB)

    Renslow, Ryan S.; Lindemann, Stephen R.; Cole, Jessica K.; Zhu, Zihua; Anderton, Christopher R.

    2016-02-12

    Elucidating nutrient exchange in microbial communities is an important step in understanding the relationships between microbial systems and global biogeochemical cycles, but these communities are complex and the interspecies interactions that occur within them are not well understood. Phototrophic consortia are useful and relevant experimental systems to investigate such interactions as they are not only prevalent in the environment, but some are cultivable in vivo and amenable to controlled scientific experimentation. High spatial resolution secondary ion mass spectrometry (NanoSIMS) is a powerful tool capable of visualizing the metabolic activities of single cells within a biofilm, but quantitative analysis of the resulting data has typically been a manual process, resulting in a task that is both laborious and susceptible to human error. Here, we describe the creation and application of a semi-automated image-processing pipeline that can analyze NanoSIMS-generated data of phototrophic biofilms. The tool employs an image analysis process, which includes both elemental and morphological segmentation, producing a final segmented image that allows for discrimination between autotrophic and heterotrophic biomass, the detection of individual cyanobacterial filaments and heterotrophic cells, the quantification of isotopic incorporation of individual heterotrophic cells, and calculation of relevant population statistics. We demonstrate the functionality of the tool by using it to analyze the uptake of 15N provided as either nitrate or ammonium through the unicyanobacterial consortium UCC-O and imaged via NanoSIMS. We found that the degree of 15N incorporation by individual cells was highly variable when labeled with 15NH4+, but much more even when biofilms were labeled with 15NO3-. In the 15NH4+-amended biofilms, the heterotrophic distribution of 15N incorporation was highly skewed, with a large population showing moderate 15N incorporation and a small number of

  18. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    DEFF Research Database (Denmark)

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger;

    2016-01-01

    describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences were complemented with selected phenotypic characteristics of the strains. The pan genome analysis showed that F......, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found which corresponded to the previously described prophage 6H, and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which...

  19. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources.

    Directory of Open Access Journals (Sweden)

    Cassidy L Klima

    Full Text Available Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence associated genes were identified within 14 distinct prophage, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2-8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene are present in S1 and S6 strains only indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design

  20. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources.

    Science.gov (United States)

    Klima, Cassidy L; Cook, Shaun R; Zaheer, Rahat; Laing, Chad; Gannon, Vick P; Xu, Yong; Rasmussen, Jay; Potter, Andrew; Hendrick, Steve; Alexander, Trevor W; McAllister, Tim A

    2016-01-01

    Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence associated genes were identified within 14 distinct prophage, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2-8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene are present in S1 and S6 strains only indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. 
The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design intended to reduce the

  1. Comparative analysis of whole-genome sequences of Streptococcus suis

    Institute of Scientific and Technical Information of China (English)

    LI Pengli; WEI Wu; LI Yixue; MA Yuanyuan; DING Guohui; LI Xiaoping; WANG Xiaojing; ZHANG Liwen; SUN Jingchun; WANG Yong; TU Kang; WANG Ningning; HAO Pei; WANG Chuan; CAO Zhiwei; SHI Tieliu

    2006-01-01

    The recent outbreak of Streptococcus suis in some districts of Sichuan Province in China has caused over 30 deaths and over 200 infections in human beings. In order to study the pathogenicity mechanism and to prevent the bacteria from spreading and infecting human beings and swine, we have annotated and analyzed the genomes of two strains, Streptococcus suis P1/7 and 89-1591. The whole genome of P1/7 is 2.007 Mb in length and has 1969 ORFs. In contrast, the partial genome sequence of 89-1591 is 1.98 Mb in length and exists in 177 contigs with 1918 ORFs. Analysis shows that the average lengths of CDSs in the two genomes are very close, and that 1306 ORFs are homologous between the two strains. Most of the toxicity factors of the two strains are homologous, but there are still some significant differences between the two strains. For example, among the 11 genes (cps2A-cps2K) encoding the capsule in P1/7, 4 (cps2A, 2B, 2I, 2J) are not detected in strain 89-1591. At the same time, the genes encoding EF and Haemolysin in P1/7 are also not found in strain 89-1591. Besides, the genes related to DNA replication, repair and recombination differ significantly from each other, and there also exist certain differences among the surface proteins. These characteristics indicate that the two strains have evolved their own specific functions to adapt to different environments and that the pathogenesis of the two strains is different. We have accumulated comprehensive genomics information for future systematic studies of S. suis. Our results are helpful for disease prevention, vaccine development, as well as drug design for S. suis.

  2. Evolutionary insights from suffix array-based genome sequence analysis

    Indian Academy of Sciences (India)

    Anindya Poddar; Nagasuma Chandra; Madhavi Ganapathiraju; K Sekar; Judith Klein-Seetharaman; Raj Reddy; N Balakrishnan

    2007-08-01

    Gene and protein sequence analyses, central components of studies in modern biology are easily amenable to string matching and pattern recognition algorithms. The growing need of analysing whole genome sequences more efficiently and thoroughly, has led to the emergence of new computational methods. Suffix trees and suffix arrays are data structures, well known in many other areas and are highly suited for sequence analysis too. Here we report an improvement to the design of construction of suffix arrays. Enhancement in versatility and scalability, enabled by this approach, is demonstrated through the use of real-life examples. The scalability of the algorithm to whole genomes renders it suitable to address many biologically interesting problems. One example is the evolutionary insight gained by analysing unigrams, bi-grams and higher n-grams, indicating that the genetic code has a direct influence on the overall composition of the genome. Further, different proteomes have been analysed for the coverage of the possible peptide space, which indicate that as much as a quarter of the total space at the tetra-peptide level is left un-sampled in prokaryotic organisms, although almost all tri-peptides can be seen in one protein or another in a proteome. Besides, distinct patterns begin to emerge for the counts of particular tetra and higher peptides, indicative of a ‘meaning’ for tetra and higher n-grams. The toolkit has also been used to demonstrate the usefulness of identifying repeats in whole proteomes efficiently. As an example, 16 members of one COG, coded by the genome of Mycobacterium tuberculosis H37Rv have been found to contain a repeating sequence of 300 amino acids.
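    As a small illustration of why suffix arrays suit repeat finding, the sketch below builds a suffix array by naive sorting and reads the longest repeated substring off adjacent entries. It is not the improved construction the abstract reports; the function names are hypothetical, and the naive build is only practical for short strings.

```python
def build_suffix_array(text: str) -> list:
    """Start positions of all suffixes of text, in lexicographic order.

    Naive construction by sorting the suffixes directly; simple but far
    slower than the specialised algorithms needed for whole genomes.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def longest_repeat(text: str) -> str:
    """Longest substring that occurs at least twice (overlaps allowed)."""
    sa = build_suffix_array(text)
    best = ""
    # Suffixes sharing a long prefix sit next to each other in the array,
    # so only adjacent pairs need to be compared.
    for a, b in zip(sa, sa[1:]):
        k = 0
        while a + k < len(text) and b + k < len(text) and text[a + k] == text[b + k]:
            k += 1
        if k > len(best):
            best = text[a:a + k]
    return best


print(build_suffix_array("banana"))   # [5, 3, 1, 0, 4, 2]
print(longest_repeat("banana"))       # ana
print(longest_repeat("ACGTACGTTT"))   # ACGT
```

    The same adjacency property underlies n-gram counting: all occurrences of a given n-gram form one contiguous block of the suffix array, so its count is the length of that block.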

  3. Comparative Genomic Analysis of Meningitis- and Bacteremia-Causing Pneumococci Identifies a Common Core Genome.

    Science.gov (United States)

    Kulohoma, Benard W; Cornick, Jennifer E; Chaguza, Chrispin; Yalcin, Feyruz; Harris, Simon R; Gray, Katherine J; Kiran, Anmol M; Molyneux, Elizabeth; French, Neil; Parkhill, Julian; Faragher, Brian E; Everett, Dean B; Bentley, Stephen D; Heyderman, Robert S

    2015-10-01

    Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites. PMID:26259813

  4. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    OpenAIRE

    Tettelin, Hervé; Masignani, Vega; Cieslewicz, Michael J.; Eisen, Jonathan A.; Peterson, Scott; Wessels, Michael R.; Paulsen, Ian T.; Nelson, Karen E.; Margarit, Immaculada; Read, Timothy D.; Madoff, Lawrence C.; Wolf, Alex M.; Beanan, Maureen J; Brinkac, Lauren M.; Sean C Daugherty

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the other completely sequenced genomes identified genes specific to the streptococci and to S. agalactiae. These in silico analyses, combined with comparative genome hybridization experiments between the ...

  5. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    OpenAIRE

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri

    2005-01-01

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis sub spp israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-...

  6. Paired-end genomic signature tags: a method for the functional analysis of genomes and epigenomes.

    Science.gov (United States)

    Dunn, John J; McCorkle, Sean R; Everett, Logan; Anderson, Carl W

    2007-01-01

    Because paired-end genomic signature tags are sequence-based, they have the potential to become an alternate tool to tiled microarray hybridization as a method for genome-wide localization of transcription factors and other sequence-specific DNA binding proteins. As outlined here the method also can be used for global analysis of DNA methylation. One advantage of this approach is the ability to easily switch between different genome types without having to fabricate a new microarray for each and every DNA type. However, the method does have some disadvantages. Among the most rate-limiting steps of our PE-GST protocol are the need to concatemerize the diTAGs, size fractionate them and then clone them prior to sequencing. This is usually followed by additional steps to amplify and size select for long (≥ 500) concatemer inserts prior to sequencing. These time-consuming steps are important for standard DNA sequencing as they increase efficiency approximately 20-30-fold since each amplified concatemer can now provide information on multiple tags; the limitation on data acquisition is read length during sequencing. However, the development of new sequencing methods such as Life Sciences' 454 new nanotechnology-based sequencing instrument (41) could increase tag sequencing efficiency by several orders of magnitude (≥ 100,000 diTAG reads/run), which is sufficient to provide in-depth global analysis of all ChIP PE-GSTs in a single run. This is because the lengths of our paired-end diTAGs (approximately 60 bp) fall well within the region of high accuracy for read lengths on this instrument. In principle, sequence analysis of diTAGs could begin as soon as they are generated, thereby completely bypassing the need for the concatemerization, sizing, downstream cloning steps and sequencing template purification. In addition, our protocol places any one of several unique four-base long nucleotide sequences, such as GATC, between each and every diTAG pair, which could

  7. Recombination analysis based on the complete genome of bocavirus

    Directory of Open Access Journals (Sweden)

    Chen Shengxia

    2011-04-01

    Bocaviruses include bovine parvovirus, minute virus of canine, porcine bocavirus, gorilla bocavirus, and human bocaviruses 1-4 (HBoVs). Although recent reports showed that recombination happens in bocavirus, no systematic study has investigated it. The present study performed phylogenetic and recombination analysis of bocavirus over the complete genomes available in GenBank. Results confirmed that recombination exists among bocaviruses, including likely inter-genotype recombination between HBoV1 and HBoV4, and intra-genotype recombination among HBoV2 variants. Moreover, this is the first report revealing recombination between minute viruses of canine.

  8. Systematic analysis of alternative first exons in plant genomes

    Directory of Open Access Journals (Sweden)

    Zeng Changqing

    2007-10-01

    Background: Alternative splicing (AS) contributes significantly to protein diversity, by selectively using different combinations of exons of the same gene under certain circumstances. One particular type of AS is the use of alternative first exons (AFEs), which can have consequences far beyond the fine-tuning of protein functions. For example, AFEs may change the N-termini of proteins and thereby direct them to different cellular compartments. When alternative first exons are distant, they are usually associated with alternative promoters, thereby conferring an extra level of gene expression regulation. However, only a few studies have examined the patterns of AFEs, and these analyses were mainly focused on mammalian genomes. Recent studies have shown that AFEs exist in the rice genome, and are regulated in a tissue-specific manner. Our current understanding of AFEs in plants is still limited, including important issues such as their regulation, contribution to protein diversity, and evolutionary conservation. Results: We systematically identified 1,378 and 645 AFE-containing clusters in rice and Arabidopsis, respectively. From our data sets, we identified two types of AFEs according to their genomic organisation. In genes with type I AFEs, the first exons are mutually exclusive, while most of the downstream exons are shared among alternative transcripts. Conversely, in genes with type II AFEs, the first exon of one gene structure is an internal exon of an alternative gene structure. The functionality analysis indicated that about half and ~19% of the AFEs in Arabidopsis and rice could alter N-terminal protein sequences, and ~5% of the functional alteration in type II AFEs involved protein domain addition/deletion in both genomes. Expression analysis indicated that 20-66% of rice AFE clusters were tissue- and/or development-specifically transcribed, which is consistent with previous observations; however, a much smaller percentage of Arabidopsis

  9. Construction of an integrated database to support genomic sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  10. Identification of Candidate Adherent-Invasive E. coli Signature Transcripts by Genomic/Transcriptomic Analysis.

    Directory of Open Access Journals (Sweden)

    Yuanhao Zhang

    Adherent-invasive Escherichia coli (AIEC) strains are detected more frequently within mucosal lesions of patients with Crohn's disease (CD). The AIEC phenotype consists of adherence and invasion of intestinal epithelial cells and survival within macrophages of these bacteria in vitro. Our aim was to identify candidate transcripts that distinguish AIEC from non-invasive E. coli (NIEC) strains and might be useful for rapid and accurate identification of AIEC by culture-independent technology. We performed comparative RNA-Sequence (RNASeq) analysis using AIEC strain LF82 and NIEC strain HS during exponential and stationary growth. Differential expression analysis of coding sequences (CDS) homologous to both strains demonstrated 224 and 241 genes with increased and decreased expression, respectively, in LF82 relative to HS. Transition metal transport and siderophore metabolism related pathway genes were up-regulated, while glycogen metabolic and oxidation-reduction related pathway genes were down-regulated, in LF82. Chemotaxis related transcripts were up-regulated in LF82 during the exponential phase, but flagellum-dependent motility pathway genes were down-regulated in LF82 during the stationary phase. CDS that mapped only to the LF82 genome accounted for 747 genes. We applied an in silico subtractive genomics approach to identify CDS specific to AIEC by incorporating the genomes of 10 other previously phenotyped NIEC. From this analysis, 166 CDS mapped to the LF82 genome and lacked homology to any of the 11 human NIEC strains. We compared these CDS across 13 AIEC, but none were homologous in each. Four LF82 gene loci belonging to the clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated (Cas) genes region were identified in 4 to 6 AIEC and absent from all non-pathogenic bacteria. As previously reported, AIEC strains were enriched for pdu operon genes. One CDS, encoding an excisionase, was shared by 9 AIEC strains. Reverse

  11. Dating the age of admixture via wavelet transform analysis of genome-wide data

    NARCIS (Netherlands)

    I. Pugach (Irina); R. Matveyev (Rostislav); A. Wollstein (Andreas); M.H. Kayser (Manfred); M. Stoneking (Mark)

    2011-01-01

    We describe a PCA-based genome scan approach to analyze genome-wide admixture structure, and introduce wavelet transform analysis as a method for estimating the time of admixture. We test the wavelet transform method with simulations and apply it to genome-wide SNP data from eight admixe

  12. IMG 4 version of the integrated microbial genomes comparative analysis system

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Chen, I-Min A. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Palaniappan, Krishna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Chu, Ken [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Szeto, Ernest [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Pillay, Manoj [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Ratner, Anna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Huang, Jinghua [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Woyke, Tanja [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Huntemann, Marcel [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Anderson, Iain [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Billis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Varghese, Neha [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). 
Microbial Genome and Metagenome Program; Mavromatis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Pati, Amrita [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Ivanova, Natalia N. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Kyrpides, Nikos C. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  13. Comparative Genomics and Transcriptomic Analysis of Mycobacterium Kansasii

    KAUST Repository

    Alzahid, Yara

    2014-04-01

    The group of Mycobacteria is one of the most intensively studied bacterial taxa, as they cause two historical and globally known diseases: leprosy and tuberculosis. Mycobacteria not identified as belonging to the tuberculosis or leprosy complex have been referred to as ‘environmental mycobacteria’ or ‘nontuberculous mycobacteria’ (NTM). Mycobacterium kansasii (M. kansasii) is one of the most frequent NTM pathogens, as it causes pulmonary disease in immuno-competent patients and pulmonary and disseminated disease in patients with various immuno-deficiencies. Five subtypes of this bacterium have been documented by different molecular typing methods, showing that type I causes tuberculosis-like disease in healthy individuals, and type II in immune-compromised individuals. The remaining types are said to be environmental and therefore not to cause disease. The aim of this project was to conduct a comparative genomic study of M. kansasii types I-V and to investigate the gene expression levels of those types. Various comparative genomic analyses provided genomic evidence on why M. kansasii type I is considered pathogenic, by focusing on three key elements that are involved in the virulence of Mycobacteria: the ESX secretion system, phospholipase C (plcB) and the mammalian cell entry (Mce) operons. The results showed the lack of the espA operon in types II-V, which renders the ESX-1 operon dysfunctional, as espA is one of the key factors that control this secretion system. However, gene expression analysis showed this operon to be deleted in types II, III and IV. Furthermore, plcB was found to be truncated in types III and IV. Analysis of the Mce operons (1-4) shows that the mce-1 operon is duplicated, mce-2 is absent, and mce-3 and mce-4 are each present in one copy in M. kansasii types I-V. Gene expression profiles of types I-IV showed that the secreted proteins of ESX-1 were slightly upregulated in types II-IV when compared to type I and the secreted forms of ESX-5 were highly down

  14. Measure representation and multifractal analysis of complete genomes.

    Science.gov (United States)

    Yu, Z G; Anh, V; Lau, K S

    2001-09-01

    This paper introduces the notion of measure representation of DNA sequences. Spectral analysis and multifractal analysis are then performed on the measure representations of a large number of complete genomes. The main aim of this paper is to discuss the multifractal property of the measure representation and the classification of bacteria. From the measure representations and the values of the D(q) spectra and related C(q) curves, it is concluded that these complete genomes are not random sequences. In fact, spectral analyses performed indicate that these measure representations, considered as time series, exhibit strong long-range correlation. Here the long-range correlation is for the K-strings with dictionary ordering, and it is different from the base pair correlations introduced by other people. For substrings with length K=8, the D(q) spectra of all organisms studied are multifractal-like and sufficiently smooth for the C(q) curves to be meaningful. With the decreasing value of K, the multifractality lessens. The C(q) curves of all bacteria resemble a classical phase transition at a critical point. But the "analogous" phase transitions of chromosomes of nonbacteria organisms are different. Apart from chromosome 1 of C. elegans, they exhibit the shape of double-peaked specific heat function. A classification of genomes of bacteria by assigning to each sequence a point in two-dimensional space (D(-1),D1) and in three-dimensional space (D(-1),D1,D(-2)) was given. Bacteria that are close phylogenetically are almost close in the spaces (D(-1),D1) and (D(-1),D1,D(-2)). PMID:11580363
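
    One way to picture the measure representation mentioned in this abstract is sketched below. It assumes the measure assigns to each K-string, taken in dictionary order over the alphabet A, C, G, T, its observed relative frequency in the sequence; the abstract does not spell out the construction, so this ordering, the handling of ambiguous bases, and the function name measure_representation are assumptions for illustration.

```python
from itertools import product


def measure_representation(sequence, k):
    """Return the relative frequency of every K-string of length k,
    listed in dictionary order (AA..A, AA..C, ..., TT..T).

    Windows containing symbols other than A, C, G, T are skipped
    (an assumption; the source does not say how they are treated)."""
    alphabet = "ACGT"
    # itertools.product yields tuples in lexicographic order of the input
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    sequence = sequence.upper()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no valid K-strings")
    return [counts[m] / total for m in kmers]


if __name__ == "__main__":
    mu = measure_representation("ACGTAC", 2)
    print(len(mu), sum(mu))  # 4**2 values that sum to 1
```

    The resulting list of 4^K values can then be treated as a series indexed by the dictionary order of the K-strings, which is the object on which spectral and multifractal quantities such as D(q) would be computed.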

  15. A genome-wide 20 K citrus microarray for gene expression analysis

    OpenAIRE

    Gadea Jose; Forment Javier; Santiago Julia; Marques M Carmen; Juarez Jose; Mauri Nuria; Martinez-Godoy M Angeles

    2008-01-01

    Background: Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with a high genome coverage. In the last years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results: We have designed and constructed a publicly available genome-...

  16. A genome-wide 20 K citrus microarray for gene expression analysis

    OpenAIRE

    Martinez-Godoy, M Angeles; Mauri, Nuria; Juarez, Jose; Marques, M Carmen; Santiago, Julia; Forment, Javier; Gadea, Jose

    2008-01-01

    Background: Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with a high genome coverage. In the last years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results: We have designed and constructed a publicly available genome-wide cDNA...

  17. Genome-wide analysis of DNA methylation in hepatoblastoma tissues

    Science.gov (United States)

    Cui, Ximao; Liu, Baihui; Zheng, Shan; Dong, Kuiran; Dong, Rui

    2016-01-01

    DNA methylation has a crucial role in cancer biology. In the present study, a genome-wide analysis of DNA methylation in hepatoblastoma (HB) tissues was performed to verify differential methylation levels between HB and normal tissues. As alpha-fetoprotein (AFP) has a critical role in HB, AFP methylation levels were also detected using pyrosequencing. Normal and HB liver tissue samples (frozen tissue) were obtained from patients with HB. Genome-wide analysis of DNA methylation in these tissues was performed using an Infinium HumanMethylation450 BeadChip, and the results were confirmed with reverse transcription-quantitative polymerase chain reaction. The Infinium HumanMethylation450 BeadChip demonstrated distinctively less methylation in HB tissues than in non-tumor tissues. In addition, methylation enrichment was observed in positions near the transcription start site of AFP, which exhibited lower methylation levels in HB tissues than in non-tumor liver tissues. Lastly, a significant negative correlation was observed between AFP messenger RNA expression and DNA methylation percentage, using linear Pearson's R correlation coefficients. The present results demonstrate differential methylation levels between HB and normal tissues, and imply that aberrant methylation of AFP in HB could reflect HB development. Expansion of these findings could provide useful insight into HB biology.

  18. Aspects of the incorporation of spatial data into radioecological and restoration analysis

    International Nuclear Information System (INIS)

    In the last decade geographical information systems have been increasingly used to incorporate spatial data into radioecological analysis. This has allowed the development of models with spatially variable outputs. Two main approaches have been adopted in the development of spatial models. Empirical Tag-based models applied across a range of spatial scales utilize underlying soil type maps and readily available radioecological data. Soil processes can also be modelled to allow the dynamic prediction of radionuclide soil-to-plant transfer. We discuss a dynamic semi-mechanistic radiocaesium soil-to-plant transfer model, which utilizes readily available spatially variable soil parameters. Both approaches allow the identification of areas that may be vulnerable to radionuclide deposition, therefore enabling the targeting of intervention measures. Improved estimates of radionuclide fluxes and ingestion doses can be achieved by incorporating spatially varying inputs such as agricultural production and dietary habits into these models. In this paper, aspects of such models, including data requirements, implementation and outputs, are discussed and critically evaluated. The relative merits and disadvantages of the two spatial model approaches adopted within radioecology are discussed. We consider the usefulness of such models to aid decision-makers and assess the requirements and potential of further application within radiological protection. (author)

  19. G-language genome analysis environment with REST and SOAP web service interfaces

    OpenAIRE

    Arakawa, Kazuharu; Kido, Nobuhiro; Oshita, Kazuki; Tomita, Masaru

    2010-01-01

    G-language genome analysis environment (G-language GAE) contains more than 100 programs that focus on the analysis of bacterial genomes, including programs for the identification of binding sites by means of information theory, analysis of nucleotide composition bias and the distribution of particular oligonucleotides, calculation of codon bias and prediction of expression levels, and visualization of genomic information. We have provided a collection of web services for these programs by uti...

  20. A simplified computer method incorporating compartmental analysis with recycling for biokinetic studies of radionuclides

    International Nuclear Information System (INIS)

    In the past few years ICRP has made major revisions in its recommendations regarding protection from ionizing radiation. It has developed a series of biokinetic and dosimetric models for calculating radiation doses from intake of radionuclides in the body. It has also developed a new Human Respiratory Tract (HRT) model for this purpose. The new models have been developed to enable dose estimates for radiation workers as well as the general public, including children of all age groups. The new HRT model has been incorporated for dose estimations only in a few standard codes like LUDEP and GENMOD, whereas the new biokinetic models have not been employed in any of the available codes except InDose. ICRP has presented retention and excretion data for some selected radionuclides using these new models in its Publication 78, and dose coefficients for most radionuclides for 1 and 5 μm AMAD size in ICRP Publications 72 and 68 respectively. These data have been provided to ICRP by some leading laboratories, and the codes used for these data are generally not available to other laboratories. In this paper we describe a simplified computer method which incorporates compartmental analysis with recycling and can be used for biokinetic studies of various radionuclides for any plant-specific/non-standard input parameters, data for which cannot be obtained from any of the ICRP publications. The method incorporates the compartmentalised form of the new HRT model, the GI tract model and the new biokinetic model of 125Sb. It can calculate the amount of radioactivity at any future time t after the inhalation intake and the total number of disintegrations (Us) over any time interval of interest in any organ. By operating the SEE matrix it can calculate the equivalent dose and the effective dose along with the amount in excretion compartments for any given aerosol size. The method can be used for any radionuclide by incorporating its biokinetic model in compartmentalised form along with
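
    The idea of compartmental analysis with recycling can be sketched numerically. The example below is not the authors' method or any ICRP model: it assumes first-order transfer between compartments, integrates the resulting linear system with a fixed-step Runge-Kutta scheme, and uses a made-up three-compartment layout (blood, tissue, excreta) with hypothetical rate constants. The function name simulate_compartments is likewise hypothetical.

```python
def simulate_compartments(rates, initial, decay_const, t_end, steps=10000):
    """Integrate first-order compartment kinetics with radioactive decay.

    rates[i][j] is the transfer rate constant (per day) from compartment i
    to compartment j; a non-zero rates[j][i] as well gives recycling.
    Returns (activity in each compartment at t_end,
             time-integrated activity per compartment, in activity * days).
    Multiplying the integral of an activity in Bq by 86400 s/day gives the
    number of disintegrations."""
    n = len(initial)

    def deriv(q):
        dq = [-decay_const * q[i] for i in range(n)]  # physical decay
        for i in range(n):
            for j in range(n):
                if i != j and rates[i][j]:
                    flow = rates[i][j] * q[i]
                    dq[i] -= flow  # leaves compartment i
                    dq[j] += flow  # arrives in compartment j
        return dq

    h = t_end / steps
    q = list(initial)
    integral = [0.0] * n
    for _ in range(steps):
        # classical fourth-order Runge-Kutta step
        k1 = deriv(q)
        k2 = deriv([q[i] + 0.5 * h * k1[i] for i in range(n)])
        k3 = deriv([q[i] + 0.5 * h * k2[i] for i in range(n)])
        k4 = deriv([q[i] + h * k3[i] for i in range(n)])
        q_new = [q[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
                 for i in range(n)]
        # trapezoidal rule for the time integral of activity
        for i in range(n):
            integral[i] += 0.5 * h * (q[i] + q_new[i])
        q = q_new
    return q, integral


if __name__ == "__main__":
    # Hypothetical rate constants: blood <-> tissue recycling, blood -> excreta
    rates = [[0.0, 0.5, 0.2],
             [0.1, 0.0, 0.0],
             [0.0, 0.0, 0.0]]
    final, integral = simulate_compartments(rates, [1.0, 0.0, 0.0], 0.0, 10.0)
    print(final, integral)
```

    A real application would replace the rate matrix with the compartmentalised respiratory tract, GI tract and systemic models for the radionuclide of interest, and would then apply SEE values to the integrated activities to obtain doses.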

  1. Analysis of dinucleotide signatures in HIV-1 subtype B genomes

    Indian Academy of Sciences (India)

    Aridaman Pandit; Jyothirmayi Vadlamudi; Somdatta Sinha

    2013-12-01

    Dinucleotide usage is known to vary in the genomes of organisms. The dinucleotide usage profiles or genome signatures are similar for sequence samples taken from the same genome, but are different for taxonomically distant species. This concept of genome signatures has been used to study several organisms including viruses, to elucidate the signatures of evolutionary processes at the genome level. Genome signatures assume greater importance in the case of host–pathogen interactions, where molecular interactions between the two species take place continuously, and can influence their genomic composition. In this study, analyses of whole genome sequences of HIV-1 subtype B, a retrovirus that caused the global AIDS pandemic, have been carried out to analyse the variation in genome signatures of the virus from 1983 to 2007. We show statistically significant temporal variations in some dinucleotide patterns, highlighting the selective evolution of the dinucleotide profiles of HIV-1 subtype B, possibly a consequence of host-specific selection.
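
    A dinucleotide genome signature can be computed in a few lines. The sketch below assumes the common odds-ratio formulation, in which each dinucleotide frequency is divided by the product of its two mononucleotide frequencies; the abstract does not state which formulation the authors used, and the function names are hypothetical.

```python
import math
from collections import Counter


def dinucleotide_signature(seq):
    """Return {dinucleotide: f_xy / (f_x * f_y)} for a DNA sequence.

    Ratios whose expected frequency is zero are reported as NaN."""
    bases = "ACGT"
    seq = seq.upper()
    mono = Counter(b for b in seq if b in bases)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1)
                 if seq[i] in bases and seq[i + 1] in bases)
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence too short or contains no A/C/G/T")
    signature = {}
    for x in bases:
        for y in bases:
            expected = (mono[x] / n_mono) * (mono[y] / n_mono)
            observed = di[x + y] / n_di
            signature[x + y] = observed / expected if expected > 0 else math.nan
    return signature


def signature_distance(sig_a, sig_b):
    """Mean absolute difference over dinucleotides defined in both signatures."""
    diffs = [abs(sig_a[k] - sig_b[k]) for k in sig_a
             if not math.isnan(sig_a[k]) and not math.isnan(sig_b[k])]
    return sum(diffs) / len(diffs) if diffs else math.nan


if __name__ == "__main__":
    sig = dinucleotide_signature("ACGTACGT")
    print(sig["AC"], sig["AA"])
```

    Comparing signatures of sequences grouped by sampling year, for example with signature_distance, is one simple way to look for the kind of temporal variation the abstract reports; the statistical tests used in the study are not reproduced here.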

  2. Comparative genomic analysis of novel Acinetobacter symbionts: A combined systems biology and genomics approach

    Science.gov (United States)

    Gupta, Vipin; Haider, Shazia; Sood, Utkarsh; Gilbert, Jack A.; Ramjee, Meenakshi; Forbes, Ken; Singh, Yogendra; Lopes, Bruno S.; Lal, Rup

    2016-01-01

    The increasing trend of antibiotic resistance in Acinetobacter drastically limits the range of therapeutic agents required to treat multidrug resistant (MDR) infections. This study focused on analysis of novel Acinetobacter strains using a genomics and systems biology approach. Here we used a network theory method for pathogenic and non-pathogenic Acinetobacter spp. to identify the key regulatory proteins (hubs) in each strain. We identified nine key regulatory proteins, guaA, guaB, rpsB, rpsI, rpsL, rpsE, rpsC, rplM and trmD, which have functional roles as hubs in a hierarchical scale-free fractal protein-protein interaction network. Two key hubs (guaA and guaB) were important for insect-associated strains, and comparative analysis identified guaA as more important than guaB due to its role in effective module regulation. rpsI played a significant role in all the novel strains, while rplM was unique to sheep-associated strains. rpsM, rpsB and rpsI were involved in the regulation of overall network topology across all Acinetobacter strains analyzed in this study. Future analysis will investigate whether these hubs are useful as drug targets for treating Acinetobacter infections. PMID:27378055

  3. Analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering

    DEFF Research Database (Denmark)

    Ussery, David; Bohlin, Jon; Skjerve, Eystein

    2009-01-01

    Recently there has been an explosion in the availability of bacterial genomic sequences, making possible an analysis of genomic signatures across more than 800 different bacterial chromosomes from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 dif...... clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.

  4. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignments. The framework integrates the information derived from multiple sequence alignment and a phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation-related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied to protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.

  5. A new algorithm for grid-based hydrologic analysis by incorporating stormwater infrastructure

    Science.gov (United States)

    Choi, Yosoon; Yi, Huiuk; Park, Hyeong-Dong

    2011-08-01

    We developed a new algorithm, the Adaptive Stormwater Infrastructure (ASI) algorithm, to incorporate ancillary data sets related to stormwater infrastructure into the grid-based hydrologic analysis. The algorithm simultaneously considers the effects of the surface stormwater collector network (e.g., diversions, roadside ditches, and canals) and underground stormwater conveyance systems (e.g., waterway tunnels, collector pipes, and culverts). The surface drainage flows controlled by the surface runoff collector network are superimposed onto the flow directions derived from a DEM. After examining the connections between inlets and outfalls in the underground stormwater conveyance system, the flow accumulation and delineation of watersheds are calculated based on recursive computations. Application of the algorithm to the Sangdong tailings dam in Korea revealed superior performance to that of a conventional D8 single-flow algorithm in terms of providing reasonable hydrologic information on watersheds with stormwater infrastructure.
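    The conventional D8 single-flow baseline mentioned above can be sketched as follows. This is a minimal illustration of D8 only, not the ASI algorithm; the function name `d8_flow_direction`, the list-of-lists DEM layout, and the unit cell size are assumptions made for the example.

```python
import math

# Offsets (row, col) to the eight neighbours of a grid cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def d8_flow_direction(dem):
    """Return a grid of (drow, dcol) offsets pointing to the steepest
    downslope neighbour of each cell, or None for pits and flat cells.

    Assumes a rectangular DEM (list of equal-length rows) with unit cell size.
    """
    rows, cols = len(dem), len(dem[0])
    directions = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0  # only strictly downhill neighbours qualify
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    distance = math.hypot(dr, dc)  # 1 or sqrt(2)
                    slope = (dem[r][c] - dem[nr][nc]) / distance
                    if slope > best_slope:
                        best_slope = slope
                        directions[r][c] = (dr, dc)
    return directions


if __name__ == "__main__":
    # A plane sloping toward the right-hand edge.
    dem = [[3, 2, 1],
           [3, 2, 1],
           [3, 2, 1]]
    for row in d8_flow_direction(dem):
        print(row)
```

    In a scheme like the one described, flow directions imposed by ditches or canals would presumably overwrite entries of this grid before flow accumulation is computed; that step is not shown here.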

  6. Genome Scan Meta-Analysis of Schizophrenia and Bipolar Disorder, Part I: Methods and Power Analysis

    OpenAIRE

    Levinson, Douglas F.; Levinson, Matthew D.; Segurado, Ricardo; Lewis, Cathryn M.

    2003-01-01

    This is the first of three articles on a meta-analysis of genome scans of schizophrenia (SCZ) and bipolar disorder (BPD) that uses the rank-based genome scan meta-analysis (GSMA) method. Here we used simulation to determine the power of GSMA to detect linkage and to identify thresholds of significance. We simulated replicates resembling the SCZ data set (20 scans; 1,208 pedigrees) and two BPD data sets using very narrow (9 scans; 347 pedigrees) and narrow (14 scans; 512 pedigrees) diagnoses. ...
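    The rank-based idea behind GSMA can be sketched as below: bins are ranked within each scan, ranks are summed across scans, and significance is estimated by permutation. This is a simplified illustration, not the authors' implementation; the function names, the tie handling (ties broken by bin order rather than averaged), and the unweighted sum are assumptions.

```python
import random


def rank_bins(scores):
    """Rank bins within one scan; the highest score gets the highest rank.

    Ties are broken by bin order, a simplification for this sketch.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def summed_ranks(scans):
    """Sum the within-scan ranks of each bin across all scans."""
    per_scan = [rank_bins(scores) for scores in scans]
    return [sum(column) for column in zip(*per_scan)]


def permutation_pvalues(scans, n_perm=1000, seed=0):
    """Estimate, per bin, how often a random assignment of ranks gives a
    summed rank at least as large as the observed one."""
    rng = random.Random(seed)
    observed = summed_ranks(scans)
    n_bins = len(observed)
    exceed = [0] * n_bins
    for _ in range(n_perm):
        simulated = [0] * n_bins
        for _ in scans:
            # Under the null, a scan's ranks are a random permutation of 1..n_bins.
            perm = list(range(1, n_bins + 1))
            rng.shuffle(perm)
            for b in range(n_bins):
                simulated[b] += perm[b]
        for b in range(n_bins):
            if simulated[b] >= observed[b]:
                exceed[b] += 1
    return [(count + 1) / (n_perm + 1) for count in exceed]


if __name__ == "__main__":
    # Two hypothetical scans with one linkage score per bin.
    scans = [[0.1, 2.5, 0.3],
             [0.2, 3.0, 0.1]]
    print(summed_ranks(scans))
    print(permutation_pvalues(scans))
```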

  7. Transcriptome, methylome and genomic variations analysis of ectopic thyroid glands.

    Directory of Open Access Journals (Sweden)

    Rasha Abu-Khudir

    BACKGROUND: Congenital hypothyroidism from thyroid dysgenesis (CHTD) is predominantly a sporadic disease characterized by defects in the differentiation, migration or growth of thyroid tissue. Of these defects, incomplete migration resulting in ectopic thyroid tissue is the most common (up to 80%). Germinal mutations in the thyroid-related transcription factors NKX2.1, FOXE1, PAX-8, and NKX2.5 have been identified in only 3% of patients with sporadic CHTD. Moreover, a survey of monozygotic twins yielded a discordance rate of 92%, suggesting that somatic events, genetic or epigenetic, probably play an important role in the etiology of CHTD. METHODOLOGY/PRINCIPAL FINDINGS: To assess the role of somatic genetic or epigenetic processes in CHTD, we analyzed gene expression, genome-wide methylation, and structural genome variations in normal versus ectopic thyroid tissue. In total, 1011 genes were more than two-fold induced or repressed. Expression array was validated by quantitative real-time RT-PCR for 100 genes. After correction for differences in thyroid activation state, 19 genes were exclusively associated with thyroid ectopy, among which genes involved in embryonic development (e.g. TXNIP) and in the Wnt pathway (e.g. SFRP2 and FRZB) were observed. None of the thyroid related transcription factors (FOXE1, HHEX, NKX2.1, NKX2.5) showed decreased expression, whereas PAX8 expression was associated with thyroid activation state. Finally, the expression profile was independent of promoter and CpG island methylation and of structural genome variations. CONCLUSIONS/SIGNIFICANCE: This is the first integrative molecular analysis of ectopic thyroid tissue. Ectopic thyroids show a differential gene expression compared to that of normal thyroids, although the molecular basis could not be defined. Replication of this pilot study on a larger cohort could lead to unraveling the elusive cause of defective thyroid migration during embryogenesis.

  8. Microsporidian genome analysis reveals evolutionary strategies for obligate intracellular growth.

    Science.gov (United States)

    Cuomo, Christina A; Desjardins, Christopher A; Bakowski, Malina A; Goldberg, Jonathan; Ma, Amy T; Becnel, James J; Didier, Elizabeth S; Fan, Lin; Heiman, David I; Levin, Joshua Z; Young, Sarah; Zeng, Qiandong; Troemel, Emily R

    2012-12-01

    Microsporidia comprise a large phylum of obligate intracellular eukaryotes that are fungal-related parasites responsible for widespread disease, and here we address questions about microsporidia biology and evolution. We sequenced three microsporidian genomes from two species, Nematocida parisii and Nematocida sp1, which are natural pathogens of Caenorhabditis nematodes and provide model systems for studying microsporidian pathogenesis. We performed deep sequencing of transcripts from a time course of N. parisii infection. Examination of pathogen gene expression revealed compact transcripts and a dramatic takeover of host cells by Nematocida. We also performed phylogenomic analyses of Nematocida and other microsporidian genomes to refine microsporidian phylogeny and identify evolutionary events of gene loss, acquisition, and modification. In particular, we found that all microsporidia lost the tumor-suppressor gene retinoblastoma, which we speculate could accelerate the parasite cell cycle and increase the mutation rate. We also found that microsporidia acquired transporters that could import nucleosides to fuel rapid growth. In addition, microsporidian hexokinases gained secretion signal sequences, and in a functional assay these were sufficient to export proteins out of the cell; thus hexokinase may be targeted into the host cell to reprogram it toward biosynthesis. Similar molecular changes appear during formation of cancer cells and may be evolutionary strategies adopted independently by microsporidia to proliferate rapidly within host cells. Finally, analysis of genome polymorphisms revealed evidence for a sexual cycle that may provide genetic diversity to alleviate problems caused by clonal growth. Together these events may explain the emergence and success of these diverse intracellular parasites. PMID:22813931

  9. Recurrent parent genome recovery analysis in a marker-assisted backcrossing program of rice (Oryza sativa L.).

    Science.gov (United States)

    Miah, Gous; Rafii, Mohd Y; Ismail, Mohd R; Puteh, Adam B; Rahim, Harun A; Latif, Mohammad A

    2015-02-01

    Backcross breeding is the most commonly used method for incorporating a blast resistance gene into a rice cultivar. Linkage between the resistance gene and undesirable units can persist for many generations of backcrossing. Marker-assisted backcrossing (MABC) along with marker-assisted selection (MAS) contributes immensely to overcome the main limitation of the conventional breeding and accelerates recurrent parent genome (RPG) recovery. The MABC approach was employed to incorporate (a) blast resistance gene(s) from the donor parent Pongsu Seribu 1, the blast-resistant local variety in Malaysia, into the genetic background of MR219, a popular high-yielding rice variety that is blast susceptible, to develop a blast-resistant MR219 improved variety. In this perspective, the recurrent parent genome recovery was analyzed in early generations of backcrossing using simple sequence repeat (SSR) markers. Out of 375 SSR markers, 70 markers were found polymorphic between the parents, and these markers were used to evaluate the plants in subsequent generations. Background analysis revealed that the extent of RPG recovery ranged from 75.40% to 91.3% and from 80.40% to 96.70% in BC1F1 and BC2F1 generations, respectively. In this study, the recurrent parent genome content in the selected BC2F2 lines ranged from 92.7% to 97.7%. The average proportion of the recurrent parent in the selected improved line was 95.98%. MAS allowed identification of the plants that are more similar to the recurrent parent for the loci evaluated in backcross generations. The application of MAS with the MABC breeding program accelerated the recovery of the RP genome, reducing the number of generations and the time for incorporating resistance against rice blast. PMID:25553855
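    For context, the expected recurrent parent genome share without any marker selection follows from halving the donor contribution at each backcross. This is standard breeding theory rather than a formula stated in the abstract; the symbol n (number of backcross generations) is introduced here for illustration.

```latex
\mathbb{E}\!\left[\mathrm{RPG}_{\mathrm{BC}_n}\right] = 1 - \left(\tfrac{1}{2}\right)^{n+1},
\qquad
\mathrm{BC}_1:\ 1 - \tfrac{1}{4} = 75\%,
\qquad
\mathrm{BC}_2:\ 1 - \tfrac{1}{8} = 87.5\%
```

    Against these expectations, the reported upper bounds of 91.3% (BC1F1) and 96.70% (BC2F1) illustrate how selecting on background markers can identify plants well above the generation average.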

  10. Analysis of the bread wheat genome using whole-genome shotgun sequencing

    OpenAIRE

    Brenchley, Rachel; Spannagl, Manuel; Pfeifer, Matthias; Barker, Gary L. A.; D'Amore, Rosalinda; Allen, Alexandra M.; McKenzie, Neil; Kramer, Melissa

    2012-01-01

    Summary Bread wheat (Triticum aestivum) is a globally important crop, accounting for 20% of the calories consumed by mankind. We sequenced its large and challenging 17 Gb hexaploid genome using 454 pyrosequencing and compared this with the sequences of diploid ancestral and progenitor genomes. Between 94,000 and 96,000 genes were identified, and two-thirds were assigned to the A, B and D genomes. High-resolution synteny maps identified many small disruptions to conserved gene order. We show the h...

  11. Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes

    OpenAIRE

    Huang, Shengfeng; Chen, Zelin; Yan, Xinyu; Yu, Ting; Huang, Guangrui; Yan, Qingyu; Pontarotti, Pierre Antoine; Zhao, Hongchen; Li, Jie; Yang, Ping; Wang, Ruihua; Li, Rui; Tao, Xin; Deng, Ting; Wang, Yiquan

    2014-01-01

    Vertebrates diverged from other chordates ~500 Myr ago and experienced successful innovations and adaptations, but the genomic basis underlying vertebrate origins is not fully understood. Here we suggest, through comparison with multiple lancelet (amphioxus) genomes, that ancient vertebrates experienced high rates of protein evolution, genome rearrangement and domain shuffling and that these rates greatly slowed down after the divergence of jawed and jawless vertebrates. Compared with lancel...

  12. A SOLIDS ANALYSIS APPROACH INCORPORATING ARGON-ION MILLING TO COPPER AND LEAD PIPE SCALE ANALYSIS

    Science.gov (United States)

    Corrosion of copper and lead plumbing materials in water is complex and has been the subject of a number of studies (Lucey 1967; Edwards et al. 1994a; Edwards et al. 1994b; Duthil et al. 1996; Harrison et al. 2004). Solids analysis is one of the most convenient and nfo...

  13. Genome analysis of partial amphiploids by means of in situ hybridization

    International Nuclear Information System (INIS)

    A combination of genomic in situ hybridization on parental lines and meiotic pairing analysis of hybrids was employed to identify the genomic constitutions and relationships between partial amphiploids derived from wheat and wheatgrass crosses. Partial amphiploid TAF46 derived from the backcrossing of a hybrid between wheat and Thinopyrum intermedium was found to contain a synthetic alien genome composed of six S genome chromosomes and eight E genome chromosomes. The six disomic addition lines produced from TAF46 consisted of two with S genome additions and four with E genome additions. The seven additional partial amphiploids analysed were divided into three groups on the basis of similarities in their meiotic behaviour and genomic in situ hybridization patterns. (author). 23 refs, 1 fig., 4 tabs

  14. An aeroelastic analysis of helicopter rotor blades incorporating piezoelectric fiber composite twist actuation

    Science.gov (United States)

    Wilkie, W. Keats; Park, K. C.

    1996-01-01

    A simple aeroelastic analysis of a helicopter rotor blade incorporating embedded piezoelectric fiber composite, interdigitated electrode blade twist actuators is described. The analysis consists of a linear torsion and flapwise bending model coupled with a nonlinear ONERA based unsteady aerodynamics model. A modified Galerkin procedure is performed upon the rotor blade partial differential equations of motion to develop a system of ordinary differential equations suitable for numerical integration. The twist actuation responses for three conceptual full-scale blade designs with realistic constraints on blade mass are numerically evaluated using the analysis. Numerical results indicate that useful amplitudes of nonresonant elastic twist, on the order of one to two degrees, are achievable under one-g hovering flight conditions for interdigitated electrode poling configurations. Twist actuation for the interdigitated electrode blades is also compared with the twist actuation of a conventionally poled piezoelectric fiber composite blade. Elastic twist produced using the interdigitated electrode actuators was found to be four to five times larger than that obtained with the conventionally poled actuators.

  15. Aeroelastic Analysis of Helicopter Rotor Blades Incorporating Anisotropic Piezoelectric Twist Actuation

    Science.gov (United States)

    Wilkie, W. Keats; Belvin, W. Keith; Park, K. C.

    1996-01-01

    A simple aeroelastic analysis of a helicopter rotor blade incorporating embedded piezoelectric fiber composite, interdigitated electrode blade twist actuators is described. The analysis consists of a linear torsion and flapwise bending model coupled with a nonlinear ONERA based unsteady aerodynamics model. A modified Galerkin procedure is performed upon the rotor blade partial differential equations of motion to develop a system of ordinary differential equations suitable for dynamics simulation using numerical integration. The twist actuation responses for three conceptual full-scale blade designs with realistic constraints on blade mass are numerically evaluated using the analysis. Numerical results indicate that useful amplitudes of nonresonant elastic twist, on the order of one to two degrees, are achievable under one-g hovering flight conditions for interdigitated electrode poling configurations. Twist actuation for the interdigitated electrode blades is also compared with the twist actuation of a conventionally poled piezoelectric fiber composite blade. Elastic twist produced using the interdigitated electrode actuators was found to be four to five times larger than that obtained with the conventionally poled actuators.

  16. Comparative genomic analysis of Vibrio parahaemolyticus: serotype conversion and virulence

    Directory of Open Access Journals (Sweden)

    Gil Ana I

    2011-06-01

    Background Vibrio parahaemolyticus is a common cause of foodborne disease. Beginning in 1996, a more virulent strain having serotype O3:K6 caused major outbreaks in India and other parts of the world, resulting in the emergence of a pandemic. Other serovariants of this strain emerged during its dissemination and together with the original O3:K6 were termed strains of the pandemic clone. Two genomes, one of this virulent strain and one pre-pandemic strain, have been sequenced. We sequenced four additional genomes of V. parahaemolyticus in this study that were isolated from different geographical regions and time points. Comparative genomic analyses of six strains of V. parahaemolyticus isolated from Asia and Peru were performed in order to advance knowledge concerning the evolution of V. parahaemolyticus; specifically, the genetic changes contributing to serotype conversion and virulence. Two pre-pandemic strains and three pandemic strains, isolated from different geographical regions, were serotype O3:K6 with toxin profile either (tdh+, trh-) or (tdh-, trh+). The sixth pandemic strain sequenced in this study was serotype O4:K68. Results Genomic analyses revealed that the trh+ and tdh+ strains had different types of pathogenicity islands and mobile elements as well as major structural differences between the tdh pathogenicity islands of the pre-pandemic and pandemic strains. In addition, the results of single nucleotide polymorphism (SNP) analysis showed that 94% of the SNPs between O3:K6 and O4:K68 pandemic isolates were within a 141 kb region surrounding the O- and K-antigen-encoding gene clusters. The "core" genes of V. parahaemolyticus were also compared to those of V. cholerae and V. vulnificus, in order to delineate differences between these three pathogenic species. Approximately one-half (49-59%) of each species' core genes were conserved in all three species, and 14-24% of the core genes were species-specific and in different

  17. Genomic analysis of stress response against arsenic in Caenorhabditis elegans.

    Directory of Open Access Journals (Sweden)

    Surasri N Sahu

    Arsenic, a known human carcinogen, is widely distributed around the world and found in particularly high concentrations in certain regions including Southwestern US, Eastern Europe, India, China, Taiwan and Mexico. Chronic arsenic poisoning affects millions of people worldwide and is associated with increased risk of many diseases including arthrosclerosis, diabetes and cancer. In this study, we explored genome level global responses to high and low levels of arsenic exposure in Caenorhabditis elegans using Affymetrix expression microarrays. This experimental design allows us to do microarray analysis of dose-response relationships of global gene expression patterns. High dose (0.03%) exposure caused stronger global gene expression changes in comparison with low dose (0.003%) exposure, suggesting a positive dose-response correlation. Biological processes such as oxidative stress and iron metabolism, which were previously reported to be involved in arsenic toxicity studies using cultured cells, experimental animals, and humans, were found to be affected in C. elegans. We performed genome-wide gene expression comparisons between our microarray data and publicly available C. elegans microarray datasets of cadmium, and sediment exposure samples of German rivers Rhine and Elbe. Bioinformatics analysis of arsenic-responsive regulatory networks was done using the FastMEDUSA program. FastMEDUSA analysis identified cancer-related genes, particularly genes associated with leukemia, such as dnj-11, which encodes a protein orthologous to the mammalian ZRF1/MIDA1/MPP11/DNAJC2 family of ribosome-associated molecular chaperones. We analyzed the protective functions of several of the identified genes using RNAi. Our study indicates that C. elegans could be a substitute model to study the mechanism of metal toxicity using high-throughput expression data and bioinformatics tools such as FastMEDUSA.

  18. Preliminary analysis of the mitochondrial genome evolutionary pattern in primates

    Institute of Scientific and Technical Information of China (English)

    Liang ZHAO; Xingtao ZHANG; Xingkui TAO; Weiwei WANG; Ming LI

    2012-01-01

    Since the birth of molecular evolutionary analysis, primates have been a central focus of study and mitochondrial DNA is well suited to these endeavors because of its unique features. Surprisingly, to date no comprehensive evaluation of the nucleotide substitution patterns has been conducted on the mitochondrial genome of primates. Here, we analyzed the evolutionary patterns and evaluated selection and recombination in the mitochondrial genomes of 44 primate species downloaded from GenBank. The results revealed that a strong rate heterogeneity occurred among sites and genes in all comparisons. Likewise, an obvious decline in primate nucleotide diversity was noted in the subunit rRNAs and tRNAs as compared to the protein-coding genes. Within 13 protein-coding genes, the pattern of nonsynonymous divergence was similar to that of overall nucleotide divergence, while synonymous changes differed only for individual genes, indicating that the rate heterogeneity may result from the rate of change at nonsynonymous sites. Codon usage analysis revealed that there was intermediate codon usage bias in primate protein-coding genes, and supported the idea that GC mutation pressure might determine codon usage and that positive selection is not the driving force for the codon usage bias. Neutrality tests using site-specific positive selection from a Bayesian framework indicated no sites were under positive selection for any gene, consistent with near neutrality. Recombination tests based on the pairwise homoplasy test statistic supported complete linkage even for much older divergent primate species. Thus, with the exception of rate heterogeneity among mitochondrial genes, evaluating the validity of assumed complete linkage and selective neutrality in primates prior to phylogenetic or phylogeographic analysis seems unnecessary.

  19. Secure distributed genome analysis for GWAS and sequence comparison computation

    OpenAIRE

    Zhang, Yihua; Blanton, Marina; Almashaqbeh, Ghada

    2015-01-01

    Background The rapid increase in the availability and volume of genomic data makes significant advances in biomedical research possible, but sharing of genomic data poses challenges due to the highly sensitive nature of such data. To address the challenges, a competition for secure distributed processing of genomic data was organized by the iDASH research center. Methods In this work we propose techniques for securing computation with real-life genomic data for minor allele frequency and chi-...
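    As a point of reference, minor allele frequency for a single biallelic site can be computed in the clear as sketched below. This shows only the plain quantity, not the secure distributed protocol the abstract proposes; the function name and the two-character genotype encoding are assumptions made for the example.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency at one biallelic site.

    `genotypes` is a list of two-character strings such as "AG", one per
    individual (a hypothetical encoding chosen for this sketch).
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) > 2:
        raise ValueError("site is not biallelic")
    if len(counts) < 2:
        return 0.0  # monomorphic site: no minor allele observed
    return min(counts.values()) / sum(counts.values())


if __name__ == "__main__":
    # Four individuals: allele A appears 5 times, allele G 3 times.
    print(minor_allele_frequency(["AA", "AG", "GG", "AA"]))
```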

  20. Sequence motif discovery with computational genome-wide analysis

    OpenAIRE

    Akashi, Hirofumi; Aoki, Fumio; Toyota, Minoru; Maruyama, Reo; Sasaki, Yasushi; Mita, Hiroaki; Tokura, Hajime; Imai, Kohzoh; Tatsumi, Haruyuki

    2006-01-01

    As a result of the human genome project and advancements in DNA sequencing technology, we can utilize a huge amount of nucleotide sequence data and can search DNA sequence motifs in whole human genome. However, searching motifs with the naked eye is an enormous task and searching throughout the whole genome is absolutely impossible. Therefore, we have developed a computational genome-wide analyzing system for detecting DNA sequence motifs with biological significance. We used a multi-parallel...

  1. 13C metabolic flux analysis at a genome-scale.

    Science.gov (United States)

    Gopalakrishnan, Saratram; Maranas, Costas D

    2015-11-01

    Metabolic models used in 13C metabolic flux analysis generally include a limited number of reactions primarily from central metabolism. They typically omit degradation pathways, complete cofactor balances, and atom transition contributions for reactions outside central metabolism. This study addresses the impact on prediction fidelity of scaling-up mapping models to a genome-scale. The core mapping model employed in this study accounts for 75 reactions and 65 metabolites, primarily from central metabolism. The genome-scale metabolic mapping model (GSMM) (697 reactions and 595 metabolites) is constructed using the iAF1260 model as a basis, after eliminating reactions guaranteed not to carry flux based on growth and fermentation data for a minimal glucose growth medium. Labeling data for 17 amino acid fragments obtained from cells fed with glucose labeled at the second carbon were used to obtain fluxes and ranges. Metabolic fluxes and confidence intervals are estimated, for both core and genome-scale mapping models, by minimizing the sum of squared differences between predicted and experimentally measured labeling patterns using the EMU decomposition algorithm. Overall, we find that both topology and estimated values of the metabolic fluxes remain largely consistent between the core and GSM models. Stepping up to a genome-scale mapping model leads to wider flux inference ranges for 20 key reactions present in the core model. The glycolysis flux range doubles due to the possibility of active gluconeogenesis, the TCA flux range expands by 80% due to the availability of a bypass through arginine consistent with labeling data, and the transhydrogenase reaction flux is essentially unresolved due to the presence of as many as five routes for the inter-conversion of NADPH to NADH afforded by the genome-scale model. By globally accounting for ATP demands in the GSMM, the unused ATP decreased drastically, with the lower bound matching the maintenance ATP requirement. A non
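    The flux estimation step described in this abstract can be written schematically as a least-squares problem. The notation below is an assumption introduced for illustration and is not taken from the paper: v is the flux vector, S the stoichiometric matrix, x_i^meas the measured labeling of fragment i, and x_i^pred(v) the labeling predicted for that fragment.

```latex
\min_{v}\ \sum_{i=1}^{N} \left( x_i^{\mathrm{pred}}(v) - x_i^{\mathrm{meas}} \right)^{2}
\quad \text{subject to} \quad S\,v = 0, \qquad v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
```

    Under this reading, a wider flux range for a reaction means that a larger interval of values for that component of v fits the labeling data about equally well.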

  2. Genome-wide association analysis for quantitative trait loci influencing Warner–Bratzler shear force in five taurine cattle breeds

    Science.gov (United States)

    McClure, M C; Ramey, H R; Rolf, M M; McKay, S D; Decker, J E; Chapple, R H; Kim, J W; Taxis, T M; Weaber, R L; Schnabel, R D; Taylor, J F

    2012-01-01

    Summary We performed a genome-wide association study for Warner–Bratzler shear force (WBSF), a measure of meat tenderness, by genotyping 3360 animals from five breeds with 54 790 BovineSNP50 and 96 putative single-nucleotide polymorphisms (SNPs) within μ-calpain [HUGO nomenclature calpain 1, (mu/I) large subunit; CAPN1] and calpastatin (CAST). Within- and across-breed analyses estimated SNP allele substitution effects (ASEs) by genomic best linear unbiased prediction (GBLUP) and variance components by restricted maximum likelihood under an animal model incorporating a genomic relationship matrix. GBLUP estimates of ASEs from the across-breed analysis were moderately correlated (0.31–0.66) with those from the individual within-breed analyses, indicating that prediction equations for molecular estimates of breeding value developed from across-breed analyses should be effective for genomic selection within breeds. We identified 79 genomic regions associated with WBSF in at least three breeds, but only eight were detected in all five breeds, suggesting that the within-breed analyses were underpowered, that different quantitative trait loci (QTL) underlie variation between breeds or that the BovineSNP50 SNP density is insufficient to detect common QTL among breeds. In the across-breed analysis, CAPN1 was followed by CAST as the most strongly associated WBSF QTL genome-wide, and associations with both were detected in all five breeds. We show that none of the four commercialized CAST and CAPN1 SNP diagnostics are causal for associations with WBSF, and we putatively fine-map the CAPN1 causal mutation to a 4581-bp region. We estimate that variation in CAST and CAPN1 explains 1.02 and 1.85% of the phenotypic variation in WBSF respectively. PMID:22497286

  3. Technology-Driven and Evidence-Based Genomic Analysis for Integrated Pediatric and Prenatal Genetics Evaluation

    Institute of Scientific and Technical Information of China (English)

    Yuan Wei; Fang Xu; Peining Li

    2013-01-01

    The first decade since the completion of the Human Genome Project has been marked with rapid development of genomic technologies and their immediate clinical applications. Genomic analysis using oligonucleotide array comparative genomic hybridization (aCGH) or single nucleotide polymorphism (SNP) chips has been applied to pediatric patients with developmental and intellectual disabilities (DD/ID), multiple congenital anomalies (MCA) and autistic spectrum disorders (ASD). Evaluation of analytical and clinical validities of aCGH showed > 99% sensitivity and specificity and increased analytical resolution by higher density probe coverage. Reviews of case series, multi-center comparison and large patient-control studies demonstrated a diagnostic yield of 12%-20%; approximately 60% of these abnormalities were recurrent genomic disorders. This pediatric experience has been extended toward prenatal diagnosis. A series of reports indicated approximately 10% of pregnancies with ultrasound-detected structural anomalies and normal cytogenetic findings had genomic abnormalities, and 30% of these abnormalities were syndromic genomic disorders. Evidence-based practice guidelines and standards for implementing genomic analysis and web-delivered knowledge resources for interpreting genomic findings have been established. The progress from this technology-driven and evidence-based genomic analysis provides not only opportunities to dissect disease-causing mechanisms and develop rational therapeutic interventions but also important lessons for integrating genomic sequencing into pediatric and prenatal genetic evaluation.

  4. Research study on analysis/use technologies of genome information; Genome joho kaidoku riyo gijutsu no chosa kenkyu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-03-01

    For wide use of genome information in the industrial field, the required R and D was surveyed from the standpoints of biology and information science. To clarify the present state and issues of the international research on genome analysis, the genome map as well as sequence and function information are first surveyed. The current analysis/use technologies of genome information are analyzed, and the following are summarized: prediction and identification of gene regions in genome sequences, techniques for searching and selecting useful genes, and techniques for predicting the expression of gene functions and the gene-product structure and functions. It is recommended that R and D and data collection/interpretation necessary to clarify inter-gene interactions and information networks should be promoted by integrating Japanese advanced know-how and technologies. As examples of the impact of the research results on industry and society, the present state and future expected effect are summarized for medicines, diagnosis/analysis instruments, chemicals, foods, agriculture, fishery, animal husbandry, electronics, environment and information. 278 refs., 42 figs., 5 tabs.

  5. Genome-scale metabolic analysis of Clostridium thermocellum for bioethanol production

    Directory of Open Access Journals (Sweden)

    Brooks J Paul

    2010-03-01

    Background Microorganisms possess diverse metabolic capabilities that can potentially be leveraged for efficient production of biofuels. Clostridium thermocellum (ATCC 27405) is a thermophilic anaerobe that is both cellulolytic and ethanologenic, meaning that it can directly use the plant sugar, cellulose, and biochemically convert it to ethanol. A major challenge in using microorganisms for chemical production is the need to modify the organism to increase production efficiency. The process of properly engineering an organism is typically arduous. Results Here we present a genome-scale model of C. thermocellum metabolism, iSR432, for the purpose of establishing a computational tool to study the metabolic network of C. thermocellum and facilitate efforts to engineer C. thermocellum for biofuel production. The model consists of 577 reactions involving 525 intracellular metabolites, 432 genes, and a proteomic-based representation of a cellulosome. The process of constructing this metabolic model led to suggested annotation refinements for 27 genes and identification of areas of metabolism requiring further study. The accuracy of the iSR432 model was tested using experimental growth and by-product secretion data for growth on cellobiose and fructose. Analysis using this model captures the relationship between the reduction-oxidation state of the cell and ethanol secretion and allowed for prediction of gene deletions and environmental conditions that would increase ethanol production. Conclusions By incorporating genomic sequence data, network topology, and experimental measurements of enzyme activities and metabolite fluxes, we have generated a model that is reasonably accurate at predicting the cellular phenotype of C. thermocellum and established a strong foundation for rational strain design. In addition, we are able to draw some important conclusions regarding the underlying metabolic mechanisms for observed behaviors of C. thermocellum

  6. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    NARCIS (Netherlands)

    Tettelin, H; Masignani; Cieslewicz, MJ; Eisen, JA; Peterson, S; Paulsen, IT; Nelson, KE; Margarit; Read, TD; Madoff, LC; Beanan, MJ; Brinkac, LM; Daugherty, SC; DeBoy, RT; Durkin, AS; Kolonay, JF; Madupu, R; Lewis, MR; Radune, D; Fedorova, NB; Scanlan, D; Khouri, H; Mulligan, S; Carty, HA; Cline, RT; Van Aken, SE; Gill, J; Scarselli, M; Mora, M; Iacobini, ET; Brettoni, C; Galli, G; Mariani, M; Vegni, F; Maione, D; Rinaudo, D; Rappuoli, R; Telford, JL; Kasper, DL; Grandi, G; Fraser, CM

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the oth

  7. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae : Implications for the microbial "pan-genome"

    NARCIS (Netherlands)

    Tettelin, H; Masignani; Cieslewicz, MJ; Donati, C; Medini, D; Ward, NL; Angiuoli, SV; Crabtree, J; Jones, AL; Durkin, AS; DeBoy, RT; Davidsen, TM; Mora, M; Scarselli, M; Ros, IMY; Peterson, JD; Hauser, CR; Sundaram, JP; Nelson, WC; Madupu, R; Brinkac, LM; Dodson, RJ; Rosovitz, MJ; Sullivan, SA; Daugherty, SC; Haft, DH; Selengut, J; Gwinn, ML; Zhou, LW; Zafar, N; Khouri, H; Radune, D; Dimitrov, G; Watkins, K; O'Connor, KJB; Smith, S; Utterback, TR; White, O; Rubens, CE; Grandi, G; Madoff, LC; Kasper, DL; Telford, JL; Wessels, MR; Rappuoli, R; Fraser, CM

    2005-01-01

    The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and als

  8. Factor Analysis and Framework Development for Incorporating Public Trust on Nuclear Safety issues

    Energy Technology Data Exchange (ETDEWEB)

    Cho, Seongkyung; Lee, Gyebong [The Myongji Univ., Seoul (Korea, Republic of)]; Lee, Gihyung; Lee, Gyehwi; Jeong, Jina [Korea Institute of Nuclear Safety, Daejeon (Korea, Republic of)]

    2014-05-15

    The Korea Institute of Nuclear Safety (KINS), a regulatory expert organization in charge of nuclear safety in Korea, realized that a more fundamental and systematic analysis of activities is needed to actively meet the greater variety of concerns people have and increase the reliability of the results of regulation. Nuclear safety, a highly specialized field, has previously been discussed primarily from the viewpoint of the engineers who deal with the technology, but now 'public trust in nuclear safety' has to be viewed from the standpoint of the general public and from the socio-cultural perspective. Specific measures must be taken to examine which factors affect public trust and how we can secure and reproduce those factors to gain it. Also, an efficient system for incorporating public trust in nuclear safety must be established. In this study, various case studies were examined to identify the factors that affect public trust in nuclear safety. First, nuclear safety laws and information disclosure systems of major countries were examined by investigating data and conducting in-depth interviews. To explore a public framework concerning nuclear safety, big data of social media were analyzed. Also, Q methodology was used to analyze the risk schemata of the opinion leaders living in areas near nuclear power plants. Several surveys were conducted to analyze the amount of trust the public had in nuclear safety as well as their awareness of nuclear safety issues. Based on these analyses, factors affecting public trust in nuclear safety were extracted, and measures to build systems incorporating public trust in nuclear safety were proposed. This study addresses the public trust in nuclear safety on condition that the safety is ensured technically and mechanically.

  9. Factor Analysis and Framework Development for Incorporating Public Trust on Nuclear Safety issues

    International Nuclear Information System (INIS)

    The Korea Institute of Nuclear Safety (KINS), a regulatory expert organization in charge of nuclear safety in Korea, realized that a more fundamental and systematic analysis of activities is needed to actively meet the greater variety of concerns people have and increase the reliability of the results of regulation. Nuclear safety, a highly specialized field, has previously been discussed primarily from the viewpoint of the engineers who deal with the technology, but now 'public trust in nuclear safety' has to be viewed from the standpoint of the general public and from the socio-cultural perspective. Specific measures must be taken to examine which factors affect public trust and how we can secure and reproduce those factors to gain it. Also, an efficient system for incorporating public trust in nuclear safety must be established. In this study, various case studies were examined to identify the factors that affect public trust in nuclear safety. First, nuclear safety laws and information disclosure systems of major countries were examined by investigating data and conducting in-depth interviews. To explore a public framework concerning nuclear safety, big data of social media were analyzed. Also, Q methodology was used to analyze the risk schemata of the opinion leaders living in areas near nuclear power plants. Several surveys were conducted to analyze the amount of trust the public had in nuclear safety as well as their awareness of nuclear safety issues. Based on these analyses, factors affecting public trust in nuclear safety were extracted, and measures to build systems incorporating public trust in nuclear safety were proposed. This study addresses the public trust in nuclear safety on condition that the safety is ensured technically and mechanically

  10. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    Energy Technology Data Exchange (ETDEWEB)

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  11. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    Science.gov (United States)

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038

  12. Analysis of intra-genomic GC content homogeneity within prokaryotes

    DEFF Research Database (Denmark)

    Bohlin, J; Snipen, L; Hardy, S.P.;

    2010-01-01

    Bacterial genomes possess varying GC content (total guanines (Gs) and cytosines (Cs) per total of the four bases within the genome) but within a given genome, GC content can vary locally along the chromosome, with some regions significantly more or less GC rich than on average. We have examined how the GC content varies within microbial genomes to assess whether this property can be associated with certain biological functions related to the organism's environment and phylogeny. We utilize a new quantity GCVAR, the intra-genomic GC content variability with respect to the average GC content of...
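
    The snippet above is cut off before GCVAR is defined, so the following is only a minimal sketch under an assumed definition: the genome is split into non-overlapping windows, and variability is taken as the root-mean-square deviation of each window's GC fraction from the genome-wide GC fraction. The function names and the window size are hypothetical and are not taken from the paper.

```python
from math import sqrt


def gc_fraction(seq):
    """Fraction of G and C among the A/C/G/T bases of seq (0.0 if there are none)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def gc_variability(genome, window=100):
    """Assumed stand-in for GCVAR: RMS deviation of window GC from the genome average."""
    mean_gc = gc_fraction(genome)
    # Non-overlapping, full-length windows only; a trailing partial window is ignored.
    windows = [genome[i:i + window]
               for i in range(0, len(genome) - window + 1, window)]
    if not windows:
        raise ValueError("genome is shorter than one window")
    squared = sum((gc_fraction(w) - mean_gc) ** 2 for w in windows)
    return sqrt(squared / len(windows))
```

    A perfectly homogeneous sequence gives a variability of zero under this definition, while a genome made of one GC-only block and one AT-only block of equal length gives 0.5.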

  13. Genomic analysis suggests higher susceptibility of children to air pollution

    DEFF Research Database (Denmark)

    van Leeuwen, Danitsja M; Pedersen, Marie; Hendriksen, Peter J M;

    2008-01-01

    Differences in biological responses to exposure to hazardous airborne substances between children and adults have been reported, suggesting children to be more susceptible. Aim of this study was to improve our understanding of differences in susceptibility in cancer risk associated with air pollution by comparing genome-wide gene expression profiles in peripheral blood of children and their parents. Gene expression analysis was performed in blood from children and parents living in two different regions in the Czech Republic with different levels of air pollution. Data were analyzed by two... in relation to air pollution exposure at the transcriptome level. The findings underline the necessity of implementing environmental health policy measures specifically for protecting children's health.

  14. Stochastic modelling of landfill leachate and biogas production incorporating waste heterogeneity. Model formulation and uncertainty analysis

    International Nuclear Information System (INIS)

    A mathematical model simulating the hydrological and biochemical processes occurring in landfilled waste is presented and demonstrated. The model combines biochemical and hydrological models into an integrated representation of the landfill environment. Waste decomposition is modelled using traditional biochemical waste decomposition pathways combined with a simplified methodology for representing the rate of decomposition. Water flow through the waste is represented using a statistical velocity model capable of representing the effects of waste heterogeneity on leachate flow through the waste. Given the limitations in data capture from landfill sites, significant emphasis is placed on improving parameter identification and reducing parameter requirements. A sensitivity analysis is performed, highlighting the model's response to changes in input variables. A model test run is also presented, demonstrating the model capabilities. A parameter perturbation model sensitivity analysis was also performed. This has been able to show that although the model is sensitive to certain key parameters, its overall intuitive response provides a good basis for making reasonable predictions of the future state of the landfill system. Finally, due to the high uncertainty associated with landfill data, a tool for handling input data uncertainty is incorporated in the model's structure. It is concluded that the model can be used as a reasonable tool for modelling landfill processes and that further work should be undertaken to assess the model's performance

  15. The Korea Brassica Genome Project: a Glimpse of the Brassica Genome Based on Comparative Genome Analysis With Arabidopsis

    Directory of Open Access Journals (Sweden)

    Beom-Seok Park

    2006-04-01

    A complete genome sequence provides unlimited information in the sequenced organism as well as in related taxa. According to the guidance of the Multinational Brassica Genome Project (MBGP), the Korea Brassica Genome Project (KBGP) is sequencing chromosome 1 (cytogenetically oriented chromosome #1) of Brassica rapa. We have selected 48 seed BACs on chromosome 1 using EST genetic markers and FISH analyses. Among them, 30 BAC clones have been sequenced and 18 are on the way. Comparative genome analyses of the EST sequences and sequenced BAC clones from Brassica chromosome 1 revealed their homeologous partner regions on the Arabidopsis genome and a syntenic comparative map between Brassica chromosome 1 and Arabidopsis chromosomes. In silico chromosome walking and clone validation have been successfully applied to extending sequence contigs based on the comparative map and BAC end sequences. In addition, we have defined the (peri)centromeric heterochromatin blocks with centromeric tandem repeats, rDNA and centromeric retrotransposons. In-depth sequence analyses of five homeologous BAC clones and an Arabidopsis chromosomal region reveal overall co-linearity, with 82% sequence similarity. The data indicate that the Brassica genome has undergone triplication and subsequent gene losses after the divergence of Arabidopsis and Brassica. Based on in-depth comparative genome analyses, we propose a comparative genomics approach for conquering the Brassica genome. In 2005 we intend to construct an integrated physical map, including sequence information from 500 BAC clones and integration of fingerprinting data and end sequence data of more than 100 000 BAC clones. The sequences have been submitted to GenBank with accession numbers: 10 204 BAC ends of the KBrH library (CW978640–CW988843); KBrH138P04, AC155338; KBrH117N09, AC155337; KBrH097M21, AC155348; KBrH093K03, AC155347; KBrH081N08, AC155346; KBrH080L24, AC155345; KBrH077A05, AC155343; KBrH020D15

  16. Functional Analysis of Shewanella, a cross genome comparison.

    Energy Technology Data Exchange (ETDEWEB)

    Serres, Margrethe H.

    2009-05-15

    The bacterial genus Shewanella includes a group of highly versatile organisms that have successfully adapted to life in many environments ranging from aquatic (fresh and marine) to sedimentary (lake and marine sediments, subsurface sediments, sea vent). A unique respiratory capability of the Shewanellas, initially observed for Shewanella oneidensis MR-1, is the ability to use metals and metalloids, including radioactive compounds, as electron acceptors. Members of the Shewanella genus have also been shown to degrade environmental pollutants i.e. halogenated compounds, making this group highly applicable for the DOE mission. S. oneidensis MR-1 has in addition been found to utilize a diverse set of nutrients and to have a large set of genes dedicated to regulation and to sensing of the environment. The sequencing of the S. oneidensis MR-1 genome facilitated experimental and bioinformatics analyses by a group of collaborating researchers, the Shewanella Federation. Through the joint effort and with support from Department of Energy S. oneidensis MR-1 has become a model organism of study. Our work has been a functional analysis of S. oneidensis MR-1, both by itself and as part of a comparative study. We have improved the annotation of gene products, assigned metabolic functions, and analyzed protein families present in S. oneidensis MR-1. The data has been applied to analysis of experimental data (i.e. gene expression, proteome) generated for S. oneidensis MR-1. Further, this work has formed the basis for a comparative study of over 20 members of the Shewanella genus. The species and strains selected for genome sequencing represented an evolutionary gradient of DNA relatedness, ranging from close to intermediate, and to distant. The organisms selected have also adapted to a variety of ecological niches. Through our work we have been able to detect and interpret genome similarities and differences between members of the genus. We have in this way contributed to the

  17. Identification of conserved regulatory elements by comparative genome analysis

    Directory of Open Access Journals (Sweden)

    Jareborg Niclas

    2003-05-01

    Background: For genes that have been successfully delineated within the human genome sequence, most regulatory sequences remain to be elucidated. The annotation and interpretation process requires additional data resources and significant improvements in computational methods for the detection of regulatory regions. One approach of growing popularity is based on the preferential conservation of functional sequences over the course of evolution by selective pressure, termed 'phylogenetic footprinting'. Mutations are more likely to be disruptive if they appear in functional sites, resulting in a measurable difference in evolution rates between functional and non-functional genomic segments. Results: We have devised a flexible suite of methods for the identification and visualization of conserved transcription-factor-binding sites. The system reports those putative transcription-factor-binding sites that are both situated in conserved regions and located as pairs of sites in equivalent positions in alignments between two orthologous sequences. An underlying collection of metazoan transcription-factor-binding profiles was assembled to facilitate the study. This approach results in a significant improvement in the detection of transcription-factor-binding sites because of an increased signal-to-noise ratio, as demonstrated with two sets of promoter sequences. The method is implemented as a graphical web application, ConSite, which is at the disposal of the scientific community at http://www.phylofoot.org/. Conclusions: Phylogenetic footprinting dramatically improves the predictive selectivity of bioinformatic approaches to the analysis of promoter sequences. ConSite delivers unparalleled performance using a novel database of high-quality binding models for metazoan transcription factors. With a dynamic interface, this bioinformatics tool provides broad access to promoter analysis with phylogenetic footprinting.

  18. Clinical Omics Analysis of Colorectal Cancer Incorporating Copy Number Aberrations and Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Tsuyoshi Yoshida

    2010-07-01

    Background: Colorectal cancer (CRC) is one of the most frequently occurring cancers in Japan, and thus a wide range of methods have been deployed to study the molecular mechanisms of CRC. In this study, we performed a comprehensive analysis of CRC, incorporating copy number aberration (CNA) and gene expression data. For the last four years, we have been collecting data from CRC cases and organizing the information as an “omics” study by integrating many kinds of analysis into a single comprehensive investigation. In our previous studies, we had experienced difficulty in finding genes related to CRC, as we observed higher noise levels in the expression data than in the data for other cancers. Because chromosomal aberrations are often observed in CRC, here, we have performed a combination of CNA analysis and expression analysis in order to identify some new genes responsible for CRC. This study was performed as part of the Clinical Omics Database Project at Tokyo Medical and Dental University. The purpose of this study was to investigate the mechanism of genetic instability in CRC by this combination of expression analysis and CNA, and to establish a new method for the diagnosis and treatment of CRC. Materials and methods: Comprehensive gene expression analysis was performed on 79 CRC cases using an Affymetrix Gene Chip, and comprehensive CNA analysis was performed using an Affymetrix DNA Sty array. To avoid the contamination of cancer tissue with normal cells, laser micro-dissection was performed before DNA/RNA extraction. Data analysis was performed using original software written in the R language. Result: We observed a high percentage of CNA in colorectal cancer, including copy number gains at 7, 8q, 13 and 20q, and copy number losses at 8p, 17p and 18. Gene expression analysis provided many candidates for CRC-related genes, but their association with CRC did not reach the level of statistical significance. The combination of CNA and gene

  19. Fluorescence-based DNA minisequence analysis for detection of known single-base changes in genomic DNA.

    Science.gov (United States)

    Kobayashi, M; Rappaport, E; Blasband, A; Semeraro, A; Sartore, M; Surrey, S; Fortina, P

    1995-06-01

    We describe a rapid, automated method for direct detection of known single-base changes in genomic DNA. Fluorescence-based DNA minisequence analysis is employed in a template-dependent reaction which involves a single nucleotide extension of an oligonucleotide primer by the correct fluorescently-tagged dideoxynucleotide chain terminator. Detection following electrophoresis on denaturing acrylamide gels is facilitated by alkaline phosphatase treatment of reaction products after extension followed by isopropanol precipitation of the dye-tagged, single-base-extended primer to remove unincorporated deoxynucleotides. Fluorescence analysis of the incorporated dye tag reveals the identity of the template nucleotide immediately 3' to the primer site. This technique does not require radioactivity or biotinylated PCR product, relies on the incorporation of a single dideoxynucleotide terminator to extend the primer by one nucleotide and takes advantage of the sensitivity of fluorescent terminators developed for automated DNA sequence analysis. As a demonstration, we have applied the assay to human genomic DNA for detection of the sickle mutation in the beta-globin gene, and have also examined feasibility for simultaneous delineation using a multiplex-like strategy in a single gel-lane of some of the most common beta-thalassemia mutations in the Mediterranean basin. PMID:7477010

  20. Genome-wide DNA methylation analysis in hepatocellular carcinoma.

    Science.gov (United States)

    Yamada, Nobuhisa; Yasui, Kohichiroh; Dohi, Osamu; Gen, Yasuyuki; Tomie, Akira; Kitaichi, Tomoko; Iwai, Naoto; Mitsuyoshi, Hironori; Sumida, Yoshio; Moriguchi, Michihisa; Yamaguchi, Kanji; Nishikawa, Taichiro; Umemura, Atsushi; Naito, Yuji; Tanaka, Shinji; Arii, Shigeki; Itoh, Yoshito

    2016-04-01

    Epigenetic changes as well as genetic changes are mechanisms of tumorigenesis. We aimed to identify novel genes that are silenced by DNA hypermethylation in hepatocellular carcinoma (HCC). We screened for genes with promoter DNA hypermethylation using a genome-wide methylation microarray analysis in primary HCC (the discovery set). The microarray analysis revealed that there were 2,670 CpG sites that significantly differed in regards to the methylation level between the tumor and non-tumor liver tissues; 875 were significantly hypermethylated and 1,795 were significantly hypomethylated in the HCC tumors compared to the non‑tumor tissues. Further analyses using methylation-specific PCR, combined with expression analysis, in the validation set of primary HCC showed that, in addition to three known tumor-suppressor genes (APC, CDKN2A, and GSTP1), eight genes (AKR1B1, GRASP, MAP9, NXPE3, RSPH9, SPINT2, STEAP4, and ZNF154) were significantly hypermethylated and downregulated in the HCC tumors compared to the non-tumor liver tissues. Our results suggest that epigenetic silencing of these genes may be associated with HCC. PMID:26883180

  1. GENOME-WIDE ASSOCIATION ANALYSIS FOR FEED EFFICIENCY IN ANGUS CATTLE

    Science.gov (United States)

    Phenotypes for average daily feed intake (AFI; kg/d), residual feed intake (RFI; kg/d), average daily gain (ADG; kg/d) and predicted dry matter required (pDMR; kg/d) were estimated by correcting field records for effects of pen, year and season using a mixed linear model incorporating genomic relati...

  2. Comparative Analysis of CpG Islands in Four Fish Genomes

    Directory of Open Access Journals (Sweden)

    Leng Han

    2008-01-01

    There has been much interest in CpG islands (CGIs), clusters of CpG dinucleotides in GC-rich regions, because they are considered gene markers and involved in gene regulation. To date, there has been no genome-wide analysis of CGIs in the fish genome. We first evaluated the performance of three popular CGI identification algorithms in four fish genomes (tetraodon, stickleback, medaka, and zebrafish). Our results suggest that Takai and Jones' (2002) algorithm is most suitable for comparative analysis of CGIs in the fish genome. Then, we performed a systematic analysis of CGIs in the four fish genomes using Takai and Jones' algorithm, compared to other vertebrate genomes. We found that both the number of CGIs and the CGI density vary greatly among these genomes. Remarkably, each fish genome presents a distinct distribution of CGI density with some genomic factors (e.g., chromosome size and chromosome GC content). These findings are helpful for understanding evolution of fish genomes and the features of fish CGIs.
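
    The abstract names Takai and Jones' algorithm without listing its criteria. As a rough illustration, the sketch below checks one candidate segment against the thresholds commonly cited for that algorithm (length of at least 500 bp, GC content of at least 55%, and observed/expected CpG ratio of at least 0.65). These thresholds and the function names are assumptions supplied here, not details from the abstract, and the sketch does not reproduce the genome-scanning and window-merging steps of the full algorithm.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for one sequence."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc = (c_count + g_count) / n if n else 0.0
    # Observed/expected ratio: (#CpG * N) / (#C * #G); defined as 0 if C or G is absent.
    obs_exp = cpg_count * n / (c_count * g_count) if c_count and g_count else 0.0
    return gc, obs_exp


def meets_cgi_criteria(seq, min_len=500, min_gc=0.55, min_obs_exp=0.65):
    """True if seq passes the assumed length, GC and Obs/Exp thresholds."""
    gc, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc >= min_gc and obs_exp >= min_obs_exp
```

    Keeping the thresholds as keyword arguments makes it easy to compare stricter and looser criteria on the same segment, which is the kind of algorithm comparison the abstract describes.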

  3. Genome-wide analysis of alternative splicing in Chlamydomonas reinhardtii

    Directory of Open Access Journals (Sweden)

    Thomas Julie

    2010-02-01

    Background: Genome-wide computational analysis of alternative splicing (AS) in several flowering plants has revealed that pre-mRNAs from about 30% of genes undergo AS. Chlamydomonas, a simple unicellular green alga, is part of the lineage that includes land plants. However, it diverged from land plants about one billion years ago. Hence, it serves as a good model system to study alternative splicing in early photosynthetic eukaryotes, to obtain insights into the evolution of this process in plants, and to compare splicing in simple unicellular photosynthetic and non-photosynthetic eukaryotes. We performed a global analysis of alternative splicing in Chlamydomonas reinhardtii using its recently completed genome sequence and all available ESTs and cDNAs. Results: Our analysis of AS using BLAT and a modified version of the Sircah tool revealed AS of 498 transcriptional units with 611 events, representing about 3% of the total number of genes. As in land plants, intron retention is the most prevalent form of AS. Retained introns and skipped exons tend to be shorter than their counterparts in constitutively spliced genes. The splice site signals in all types of AS events are weaker than those in constitutively spliced genes. Furthermore, in alternatively spliced genes, the prevalent splice form has a stronger splice site signal than the non-prevalent form. Analysis of constitutively spliced introns revealed an over-abundance of motifs with simple repetitive elements in comparison to introns involved in intron retention. In almost all cases, AS results in a truncated ORF, leading to a coding sequence that is around 50% shorter than the prevalent splice form. Using RT-PCR we verified AS of two genes and show that they produce more isoforms than indicated by EST data. All cDNA/EST alignments and splice graphs are provided in a website at http://combi.cs.colostate.edu/as/chlamy. Conclusions: The extent of AS in Chlamydomonas that we observed is much

  4. PhyloSift: phylogenetic analysis of genomes and metagenomes

    Directory of Open Access Journals (Sweden)

    Aaron E. Darling

    2014-01-01

    Like all organisms on the planet, environmental microbes are subject to the forces of molecular evolution. Metagenomic sequencing provides a means to access the DNA sequence of uncultured microbes. By combining DNA sequencing of microbial communities with evolutionary modeling and phylogenetic analysis we might obtain new insights into microbiology and also provide a basis for practical tools such as forensic pathogen detection. In this work we present an approach to leverage phylogenetic analysis of metagenomic sequence data to conduct several types of analysis. First, we present a method to conduct phylogeny-driven Bayesian hypothesis tests for the presence of an organism in a sample. Second, we present a means to compare community structure across a collection of many samples and develop direct associations between the abundance of certain organisms and sample metadata. Third, we apply new tools to analyze the phylogenetic diversity of microbial communities and again demonstrate how this can be associated to sample metadata. These analyses are implemented in an open source software pipeline called PhyloSift. As a pipeline, PhyloSift incorporates several other programs including LAST, HMMER, and pplacer to automate phylogenetic analysis of protein coding and RNA sequences in metagenomic datasets generated by modern sequencing platforms (e.g., Illumina, 454).

  5. Intraspecific phylogenetic analysis of Siberian woolly mammoths using complete mitochondrial genomes

    DEFF Research Database (Denmark)

    Gilbert, M Thomas P; Drautz, Daniela I; Lesk, Arthur M;

    2008-01-01

    We report five new complete mitochondrial DNA (mtDNA) genomes of Siberian woolly mammoth (Mammuthus primigenius), sequenced with up to 73-fold coverage from DNA extracted from hair shaft material. Three of the sequences present the first complete mtDNA genomes of mammoth clade II. Analysis of these and 13 recently published mtDNA genomes demonstrates the existence of two apparently sympatric mtDNA clades that exhibit high interclade divergence. The analytical power afforded by the analysis of the complete mtDNA genomes reveals a surprisingly ancient coalescence age of the two clades...

  6. Incorporating uncertainty analysis into life cycle estimates of greenhouse gas emissions from biomass production

    International Nuclear Information System (INIS)

    Before further investments are made in utilizing biomass as a source of renewable energy, both policy makers and the energy industry need estimates of the net greenhouse gas (GHG) reductions expected from substituting biobased fuels for fossil fuels. Such GHG reductions depend greatly on how the biomass is cultivated, transported, processed, and converted into fuel or electricity. Any policy aiming to reduce GHGs with biomass-based energy must account for uncertainties in emissions at each stage of production, or else it risks yielding marginal reductions, if any, while potentially imposing great costs. This paper provides a framework for incorporating uncertainty analysis specifically into estimates of the life cycle GHG emissions from the production of biomass. We outline the sources of uncertainty, discuss the implications of uncertainty and variability on the limits of life cycle assessment (LCA) models, and provide a guide for practitioners to best practices in modeling these uncertainties. The suite of techniques described herein can be used to improve the understanding and the representation of the uncertainties associated with emissions estimates, thus enabling improved decision making with respect to the use of biomass for energy and fuel production. -- Highlights: → We describe key model, scenario and data uncertainties in LCAs of biobased fuels. → System boundaries and allocation choices should be consistent with study goals. → Scenarios should be designed around policy levers that can be controlled. → We describe a new way to analyze the importance of covariance between inputs.
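
    One common way to carry stage-level uncertainty through to a life cycle total is Monte Carlo sampling. The sketch below is illustrative only: the stage names, the triangular distributions, and every numeric range are hypothetical placeholders rather than values from the paper, and the stages are sampled independently, which ignores the covariance between inputs that the abstract highlights.

```python
import random


def simulate_total_emissions(stages, n_draws=10000, seed=0):
    """Monte Carlo total over stages given as name -> (low, most_likely, high).

    Returns (mean, 2.5th percentile, 97.5th percentile) of the summed emissions.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs are reproducible
    totals = sorted(
        sum(rng.triangular(low, high, mode) for (low, mode, high) in stages.values())
        for _ in range(n_draws)
    )
    mean = sum(totals) / n_draws
    lower = totals[int(0.025 * (n_draws - 1))]
    upper = totals[int(0.975 * (n_draws - 1))]
    return mean, lower, upper


# Hypothetical stage ranges in arbitrary emission units; NOT data from the paper.
EXAMPLE_STAGES = {
    "cultivation": (10.0, 15.0, 30.0),
    "transport": (2.0, 4.0, 8.0),
    "processing": (5.0, 9.0, 12.0),
    "conversion": (3.0, 6.0, 15.0),
}
```

    Calling simulate_total_emissions(EXAMPLE_STAGES) yields a mean and an approximate 95% interval for the total, which shows how wide stage-level ranges translate into uncertainty in the net result.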

  7. Sensitivity Analysis of Flutter Response of a Wing Incorporating Finite-Span Corrections

    Science.gov (United States)

    Issac, Jason Cherian; Kapania, Rakesh K.; Barthelemy, Jean-Francois M.

    1994-01-01

    Flutter analysis of a wing is performed in compressible flow using state-space representation of the unsteady aerodynamic behavior. Three different expressions are used to incorporate corrections due to the finite-span effects of the wing in estimating the lift-curve slope. The structural formulation is based on a Rayleigh-Ritz technique with Chebyshev polynomials used for the wing deflections. The aeroelastic equations are solved as an eigenvalue problem to determine the flutter speed of the wing. The flutter speeds are found to be higher in these cases, when compared to that obtained without accounting for the finite-span effects. The derivatives of the flutter speed with respect to the shape parameters, namely: aspect ratio, area, taper ratio and sweep angle, are calculated analytically. The shape sensitivity derivatives give a linear approximation to the flutter speed curves over a range of values of the shape parameter which is perturbed. Flutter and sensitivity calculations are performed on a wing using a lifting-surface unsteady aerodynamic theory using modules from a system of programs called FAST.

  8. Incorporating temporal variability to improve geostatistical analysis of satellite-observed CO2 in China

    Institute of Scientific and Technical Information of China (English)

    ZENG ZhaoCheng; LEI LiPing; GUO LiJie; ZHANG Li; ZHANG Bing

    2013-01-01

    Observations of atmospheric carbon dioxide (CO2) from satellites offer new data sources to understand global carbon cycling. The correlation structure of satellite-observed CO2 can be analyzed and modeled by geostatistical methods, and CO2 values at unsampled locations can be predicted with a correlation model. Conventional geostatistical analysis only investigates the spatial correlation of CO2, and does not consider temporal variation in the satellite-observed CO2 data. In this paper, a spatiotemporal geostatistical method that incorporates temporal variability is implemented and assessed for analyzing the spatiotemporal correlation structure and prediction of monthly CO2 in China. The spatiotemporal correlation is estimated and modeled by a product-sum variogram model with a global nugget component. The variogram result indicates a significant degree of temporal correlation within satellite-observed CO2 data sets in China. Prediction of monthly CO2 using the spatiotemporal variogram model and space-time kriging procedure is implemented. The prediction is compared with a spatial-only geostatistical prediction approach using a cross-validation technique. The spatiotemporal approach gives better results, with higher correlation coefficient (r2), and less mean absolute prediction error and root mean square error. Moreover, the monthly mapping result generated from the spatiotemporal approach has less prediction uncertainty and more detailed spatial variation of CO2 than those from the spatial-only approach.
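
    The three cross-validation scores named in this abstract (squared correlation coefficient, mean absolute error, root mean square error) can be sketched as follows. This is a minimal illustration only: the function name and the sample values are hypothetical and are not taken from the paper, and the kriging step that would produce the predictions is not shown.

```python
import math


def cross_validation_scores(observed, predicted):
    """Return MAE, RMSE and squared Pearson correlation for paired values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")

    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)  # undefined if either series is constant

    return {"mae": mae, "rmse": rmse, "r2": r * r}


# Hypothetical held-out CO2 values (ppm) and their predictions.
scores = cross_validation_scores([390.0, 392.0, 394.0, 396.0],
                                 [391.0, 392.0, 393.0, 397.0])
```

    Under this scheme, a better method would show a higher r2 together with lower MAE and RMSE on the same held-out observations.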

  9. Incorporation of a Wind Generator Model into a Dynamic Power Flow Analysis

    Directory of Open Access Journals (Sweden)

    Angeles-Camacho C.

    2011-07-01

    Wind energy is nowadays one of the most cost-effective and practical options for electric generation from renewable resources. However, increased penetration of wind generation causes the power networks to be more dependent on, and vulnerable to, the varying wind speed. Modeling is a tool which can provide valuable information about the interaction between wind farms and the power network to which they are connected. This paper develops a realistic characterization of a wind generator. The wind generator model is incorporated into an algorithm to investigate its contribution to the stability of the power network in the time domain. The tool obtained is termed dynamic power flow. The wind generator model takes into account the wind speed and the reactive power consumption by induction generators. Dynamic power flow analysis is carried out using real wind data at 10-minute time intervals collected for one meteorological station. The generation injected at one point into the network provides active power locally and is found to reduce global power losses. However, the power supplied is time-varying and causes fluctuations in voltage magnitude and power flows in transmission lines.

  10. Incorporating methane into ecological footprint analysis. A case study of Ireland

    International Nuclear Information System (INIS)

    Carbon dioxide (CO2) accounting is important to global ecological footprint analysis. However methane (CH4), with a global warming potential (GWP) 25 times that of CO2, should not be neglected as an environmental indicator for informed environmental management. While this is a significant component, the CH4 associated with imported embodied energy should also be included in national greenhouse gas (GHG) inventories. This study proposes an initial method for incorporating methane into ecological footprint analyses and hopes to inform future debate on its inclusion. In order to account for differences in methane intensities from exporting countries, methane intensities for OECD countries were calculated using emission and energy consumption estimates taken directly from National Inventory Reports (NIR), published in conjunction with the Intergovernmental Panel on Climate Change (IPCC). For other countries the methane intensities were estimated using energy balances published by the International Energy Agency (IEA) and IPCC default emission factors. In order to estimate embodied organic methane, material imports and exports were translated into units (such as live animals) capable of conversion into methane emissions. A significant increase in Ireland's footprint results when the GWP of methane is included within the footprint calculation. (author)
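
    The GWP weighting mentioned in this abstract amounts to a simple conversion of methane mass into CO2-equivalent mass. The sketch below uses the factor of 25 quoted in the abstract; the function name and the tonnages are hypothetical illustrations, not figures from the study.

```python
GWP_CH4 = 25.0  # global warming potential of CH4 relative to CO2, as quoted in the abstract


def co2_equivalent(co2_tonnes, ch4_tonnes, gwp_ch4=GWP_CH4):
    """Combine CO2 and CH4 emissions into tonnes of CO2-equivalent."""
    return co2_tonnes + gwp_ch4 * ch4_tonnes


# Hypothetical inventory: 100 t of CO2 plus 2 t of CH4.
total = co2_equivalent(100.0, 2.0)  # 100 + 25 * 2 = 150 t CO2e
```

    Even a small methane mass adds noticeably to the total, which is consistent with the footprint increase the abstract reports.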

  11. Whole-genome sequencing and analysis of the Malaysian cynomolgus macaque (Macaca fascicularis) genome

    OpenAIRE

    Higashino, Atsunori; Sakate, Ryuichi; Kameoka, Yosuke; Takahashi, Ichiro; Hirata, Makoto; Tanuma, Reiko; Masui, Tohru; Yasutomi, Yasuhiro; Osada, Naoki

    2012-01-01

    Background The genetic background of the cynomolgus macaque (Macaca fascicularis) is made complex by the high genetic diversity, population structure, and gene introgression from the closely related rhesus macaque (Macaca mulatta). Herein we report the whole-genome sequence of a Malaysian cynomolgus macaque male with more than 40-fold coverage, which was determined using a resequencing method based on the Indian rhesus macaque genome. Results We identified approximately 9.7 million single nuc...

  12. Tomato Functional Genomics Database: a comprehensive resource and analysis package for tomato functional genomics

    OpenAIRE

    Fei, Zhangjun; Joung, Je-Gun; Tang, Xuemei; Zheng, Yi; Huang, Mingyun; Lee, Je Min; McQuinn, Ryan; Tieman, Denise M.; Alba, Rob; Klee, Harry J.; Giovannoni, James J

    2010-01-01

    Tomato Functional Genomics Database (TFGD) provides a comprehensive resource to store, query, mine, analyze, visualize and integrate large-scale tomato functional genomics data sets. The database is functionally expanded from the previously described Tomato Expression Database by including metabolite profiles as well as large-scale tomato small RNA (sRNA) data sets. Computational pipelines have been developed to process microarray, metabolite and sRNA data sets archived in the database, respe...

  13. Group sparse canonical correlation analysis for genomic data integration

    OpenAIRE

    Lin, Dongdong; Zhang, Jigang; Li, Jingyao; Calhoun, Vince D.; Deng, Hong-Wen; Wang, Yu-Ping

    2013-01-01

    Background The emergence of high-throughput genomic datasets from different sources and platforms (e.g., gene expression, single nucleotide polymorphisms (SNP), and copy number variation (CNV)) has greatly enhanced our understanding of the interplay of these genomic factors as well as their influences on complex diseases. It is challenging to explore the relationship between these different types of genomic data sets. In this paper, we focus on a multivariate statistical method, canonica...

  14. Comparative Analysis of Codon Usage Bias Patterns in Microsporidian Genomes

    OpenAIRE

    Xiang, Heng; Zhang, Ruizhi; Butler, Robert R.; Liu, Tie; Zhang, Li; Pombert, Jean-François; Zhou, Zeyang

    2015-01-01

    The sub-3 Mbp genomes from microsporidian species of the Encephalitozoon genus are the smallest known among eukaryotes and paragons of genomic reduction and compaction in parasites. However, their diminutive stature is not characteristic of all Microsporidia, whose genome sizes vary by an order of magnitude. This large variability suggests that different evolutionary forces are applied on the group as a whole. In this study, we have compared the codon usage bias (CUB) between eight taxonomica...

  15. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby;

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome...... association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function....

  16. Analysis of the genome of leporid herpesvirus 4

    OpenAIRE

    Babra, Bobby; Watson, Gregory; Xu, Wayne; Jeffrey, Brendan; Xu, Jia-Rong; Rockey, Dan; Rohrmann, George; Jin, Ling

    2012-01-01

    The genome of a herpesvirus highly pathogenic to rabbits, leporid herpesvirus 4 (LHV-4), was analyzed using high-throughput DNA sequencing technology and primer walking. The assembled DNA sequences were further verified by restriction endonuclease digestion and Southern blot analyses. The total length of the LHV-4 genome was determined to be about 124 kb. Genes encoded in the LHV-4 genome are most closely related to herpesvirus of the Simplexvirus genus, including human herpesviruses (HHV -1 ...

  17. The Perennial Ryegrass GenomeZipper – Targeted Use of Genome Resources for Comparative Grass Genomics

    DEFF Research Database (Denmark)

    Pfeiffer, Matthias; Martis, Mihaela; Asp, Torben;

    2013-01-01

    (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to...... assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The...

  18. Genome-wide survey and analysis of microsatellites in the Pacific oyster genome: abundance, distribution, and potential for marker development

    Science.gov (United States)

    Wang, Jiafeng; Qi, Haigang; Li, Li; Zhang, Guofan

    2014-01-01

    Microsatellites are a ubiquitous component of the eukaryote genome and constitute one of the most popular sources of molecular markers for genetic studies. However, no data are currently available regarding microsatellites across the entire genome in oysters, despite their importance to the aquaculture industry. We present the first genome-wide investigation of microsatellites in the Pacific oyster Crassostrea gigas by analysis of the complete genome, resequencing, and expression data. The Pacific oyster genome is rich in microsatellites. A total of 604 653 repeats were identified, an average of one locus per 815 base pairs (bp). A total of 12 836 genes had coding repeats, and 7 332 were expressed normally, including genes with a wide range of molecular functions. Compared with 20 different species of animals, microsatellites in the oyster genome typically exhibited 1) an intermediate overall frequency; 2) relatively uniform contents of (A)n and (C)n repeats and abundant long (C)n repeats (≥24 bp); 3) large average length of (AG)n repeats; and 4) scarcity of trinucleotide repeats. The microsatellite-flanking regions exhibited a high degree of polymorphism with a heterozygosity rate of around 2.0%, but there was no correlation between heterozygosity and microsatellite abundance. A total of 19 462 polymorphic microsatellites were discovered, and dinucleotide repeats were the most active, with over 26% of loci found to harbor allelic variations. In all, 7 451 loci with high potential for marker development were identified. Better knowledge of the microsatellites in the oyster genome will provide information for the future design of a wide range of molecular markers and contribute to further advancements in the field of oyster genetics, particularly for molecular-based selection and breeding.
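
    A genome-wide microsatellite survey of this kind rests on scanning sequence for tandem copies of short units. The following is a minimal sketch of such a scan, not the pipeline used in the study: the minimum-copy thresholds, the function names and the test sequence are all illustrative assumptions.

```python
import re

# Hypothetical minimum copy numbers per unit length (1 = mono ... 6 = hexa).
MIN_COPIES = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter unit."""
    return unit not in (unit + unit)[1:-1]


def find_microsatellites(sequence, min_copies=MIN_COPIES):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats."""
    seq = sequence.upper()
    hits = []
    for unit_len, copies in sorted(min_copies.items()):
        # One captured unit followed by at least (copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_primitive(unit):  # skip e.g. "AA" reported as a dinucleotide
                hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence with an (AC)7, an (A)12 and an (AAG)5 repeat.
toy = "GGT" + "AC" * 7 + "TTG" + "A" * 12 + "CGT" + "AAG" * 5 + "C"
hits = find_microsatellites(toy)
```

    Only perfect repeats are reported here; handling of imperfect or compound repeats would need additional logic.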

  19. CoCoNUT: an efficient system for the comparison and analysis of genomes

    Directory of Open Access Journals (Sweden)

    Kurtz Stefan

    2008-11-01

    Background Comparative genomics is the analysis and comparison of genomes from different species. This area of research is driven by the large number of sequenced genomes and heavily relies on efficient algorithms and software to perform pairwise and multiple genome comparisons. Results Most of the software tools available are tailored for one specific task. In contrast, we have developed a novel system CoCoNUT (Computational Comparative geNomics Utility Toolkit) that allows solving several different tasks in a unified framework: (1) finding regions of high similarity among multiple genomic sequences and aligning them, (2) comparing two draft or multi-chromosomal genomes, (3) locating large segmental duplications in large genomic sequences, and (4) mapping cDNA/EST to genomic sequences. Conclusion CoCoNUT is competitive with other software tools w.r.t. the quality of the results. The use of state of the art algorithms and data structures allows CoCoNUT to solve comparative genomics tasks more efficiently than previous tools. With the improved user interface (including an interactive visualization component), CoCoNUT provides a unified, versatile, and easy-to-use software tool for large scale studies in comparative genomics.

  20. Sequence and comparative genomic analysis of actin-related proteins.

    Science.gov (United States)

    Muller, Jean; Oma, Yukako; Vallar, Laurent; Friederich, Evelyne; Poch, Olivier; Winsor, Barbara

    2005-12-01

    Actin-related proteins (ARPs) are key players in cytoskeleton activities and nuclear functions. Two complexes, ARP2/3 and ARP1/11, also known as dynactin, are implicated in actin dynamics and in microtubule-based trafficking, respectively. ARP4 to ARP9 are components of many chromatin-modulating complexes. Conventional actins and ARPs codefine a large family of homologous proteins, the actin superfamily, with a tertiary structure known as the actin fold. Because ARPs and actin share high sequence conservation, clear family definition requires distinct features to easily and systematically identify each subfamily. In this study we performed an in-depth sequence and comparative genomic analysis of ARP subfamilies. A high-quality multiple alignment of approximately 700 complete protein sequences homologous to actin, including 148 ARP sequences, allowed us to extend the ARP classification to new organisms. Sequence alignments revealed conserved residues, motifs, and inserted sequence signatures to define each ARP subfamily. These discriminative characteristics allowed us to develop ARPAnno (http://bips.u-strasbg.fr/ARPAnno), a new web server dedicated to the annotation of ARP sequences. Analyses of sequence conservation among actins and ARPs highlight part of the actin fold and suggest interactions between ARPs and actin-binding proteins. Finally, analysis of ARP distribution across eukaryotic phyla emphasizes the central importance of nuclear ARPs, particularly the multifunctional ARP4. PMID:16195354

  1. Cinteny: flexible analysis and visualization of synteny and genome rearrangements in multiple organisms

    Directory of Open Access Journals (Sweden)

    Meller Jaroslaw

    2007-03-01

    Background Identifying syntenic regions, i.e., blocks of genes or other markers with evolutionary conserved order, and quantifying evolutionary relatedness between genomes in terms of chromosomal rearrangements is one of the central goals in comparative genomics. However, the analysis of synteny and the resulting assessment of genome rearrangements are sensitive to the choice of a number of arbitrary parameters that affect the detection of synteny blocks. In particular, the choice of a set of markers and the effect of different aggregation strategies, which enable coarse graining of synteny blocks and exclusion of micro-rearrangements, need to be assessed. Therefore, existing tools and resources that facilitate identification, visualization and analysis of synteny need to be further improved to provide a flexible platform for such analysis, especially in the context of multiple genomes. Results We present a new tool, Cinteny, for fast identification and analysis of synteny with different sets of markers and various levels of coarse graining of syntenic blocks. Using the Hannenhalli-Pevzner approach and its extensions, Cinteny also enables interactive determination of evolutionary relationships between genomes in terms of the number of rearrangements (the reversal distance). In particular, Cinteny provides: (i) integration of synteny browsing with assessment of evolutionary distances for multiple genomes; (ii) flexibility to adjust the parameters and re-compute the results on-the-fly; (iii) ability to work with user provided data, such as orthologous genes, sequence tags or other conserved markers. In addition, Cinteny provides many annotated mammalian, invertebrate and fungal genomes that are pre-loaded and available for analysis at http://cinteny.cchmc.org. Conclusion Cinteny allows one to automatically compare multiple genomes and perform sensitivity analysis for synteny block detection and for the subsequent computation of reversal distances
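
    To give a feel for what a reversal distance measures, the sketch below counts breakpoints in a signed marker order and derives a lower bound on the number of reversals. This is a deliberately simpler technique than the Hannenhalli-Pevzner algorithm named in the abstract, which computes the exact distance; the function names and the example orders are hypothetical.

```python
def count_breakpoints(signed_perm):
    """Count adjacencies that differ from the identity order 1, 2, ..., n.

    signed_perm lists markers of one genome numbered by their order in the
    reference genome; a negative sign means reversed orientation.
    """
    n = len(signed_perm)
    if sorted(abs(x) for x in signed_perm) != list(range(1, n + 1)):
        raise ValueError("expected a signed permutation of 1..n")
    extended = [0] + list(signed_perm) + [n + 1]  # frame with fixed end markers
    return sum(1 for a, b in zip(extended, extended[1:]) if b - a != 1)


def reversal_distance_lower_bound(signed_perm):
    """A single reversal removes at most two breakpoints, so d >= ceil(b / 2)."""
    return (count_breakpoints(signed_perm) + 1) // 2


# Markers 2 and 3 inverted as one block relative to the reference.
bound = reversal_distance_lower_bound([1, -3, -2, 4])
```

    Here two breakpoints flank the inverted block, giving a bound of one reversal, which matches the single inversion that produced the order.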

  2. Comparative genome analysis across a kingdom of eukaryotic organisms: Specialization and diversification in the Fungi

    OpenAIRE

    Cornell, Michael J.; Alam, Intikhab; Soanes, Darren M.; Wong, Han Min; Hedeler, Cornelia; Paton, Norman W; Rattray, Magnus; Hubbard, Simon J; Talbot, Nicholas J.; Oliver, Stephen G

    2007-01-01

    The recent proliferation of genome sequencing in diverse fungal species has provided the first opportunity for comparative genome analysis across a eukaryotic kingdom. Here, we report a comparative study of 34 complete fungal genome sequences, representing a broad diversity of Ascomycete, Basidiomycete, and Zygomycete species. We have clustered all predicted protein-encoding gene sequences from these species to provide a means of investigating gene innovations, gene family expansions, protein...

  3. Genome Sequencing and Analysis of the Tasmanian Devil and Its Transmissible Cancer

    OpenAIRE

    Murchison, Elizabeth P.; Schulz-Trieglaff, Ole B.; Ning, Zemin; Alexandrov, Ludmil B.; Bauer, Markus J.; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber

    2012-01-01

    Summary The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genet...

  4. Dating the age of admixture via wavelet transform analysis of genome-wide data.

    Science.gov (United States)

    Pugach, Irina; Matveyev, Rostislav; Wollstein, Andreas; Kayser, Manfred; Stoneking, Mark

    2011-01-01

    We describe a PCA-based genome scan approach to analyze genome-wide admixture structure, and introduce wavelet transform analysis as a method for estimating the time of admixture. We test the wavelet transform method with simulations and apply it to genome-wide SNP data from eight admixed human populations. The wavelet transform method offers better resolution than existing methods for dating admixture, and can be applied to either SNP or sequence data from humans or other species. PMID:21352535

  5. Dating the age of admixture via wavelet transform analysis of genome-wide data

    OpenAIRE

    Pugach, Irina; Matveyev, Rostislav; Wollstein, Andreas; Kayser, Manfred; Stoneking, Mark

    2011-01-01

    We describe a PCA-based genome scan approach to analyze genome-wide admixture structure, and introduce wavelet transform analysis as a method for estimating the time of admixture. We test the wavelet transform method with simulations and apply it to genome-wide SNP data from eight admixed human populations. The wavelet transform method offers better resolution than existing methods for dating admixture, and can be applied to either SNP or sequence data from humans or other species.

  6. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    Science.gov (United States)

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger; Madsen, Lone; Espejo, Romilio

    2016-01-01

    Flavobacterium psychrophilum is a fish pathogen in salmonid aquaculture worldwide that causes cold water disease (CWD) and rainbow trout fry syndrome (RTFS). Comparative genome analyses of 11 F. psychrophilum isolates representing temporally and geographically distant populations were used to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences was complemented with selected phenotypic characteristics of the strains. The pan genome analysis showed that F. psychrophilum could hold at least 3373 genes, while the core genome contained 1743 genes. On average, 67 new genes were detected for every new genome added to the analysis, indicating that F. psychrophilum possesses an open pan genome. The putative virulence factors were equally distributed among isolates, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found which corresponded to the previously described prophage 6H, and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which only matched one sequence in the database, the temperate bacteriophage 6H. Genomic Islands (GIs) were identified in F. psychrophilum isolates 950106-1/1 and CSF 259–93, associated with toxins and antibiotic resistance. Finally, phenotypic characterization revealed a high degree of similarity among the strains with respect to biofilm formation and secretion of extracellular enzymes. Global scale dispersion of virulence factors in the genomes and the abilities for biofilm formation, hemolytic activity and secretion of extracellular enzymes among the strains suggested that F. psychrophilum isolates have a similar mode of action on adhesion, colonization and destruction of fish tissues across large spatial and temporal scales of occurrence. Overall, the genomic characterization and
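
    The pan-genome, core genome and "new genes per added genome" figures quoted in this abstract correspond to simple set operations on per-isolate gene-family content. The sketch below assumes gene families have already been clustered; the isolate names, gene labels and function names are hypothetical and do not come from the study.

```python
def pan_and_core(genomes):
    """Return (pan, core) gene sets from a mapping of isolate -> gene families."""
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        raise ValueError("at least one genome is required")
    pan = set().union(*gene_sets)         # families present in any isolate
    core = set.intersection(*gene_sets)   # families present in every isolate
    return pan, core


def new_genes_per_added_genome(genomes):
    """Count previously unseen families as each genome is added, in dict order."""
    seen = set()
    counts = []
    for name, genes in genomes.items():
        new = set(genes) - seen
        counts.append((name, len(new)))
        seen |= new
    return counts


# Toy presence data for three hypothetical isolates.
toy = {
    "iso1": {"a", "b", "c"},
    "iso2": {"a", "b", "d"},
    "iso3": {"a", "e"},
}
pan, core = pan_and_core(toy)
added = new_genes_per_added_genome(toy)
```

    A pan-genome is usually described as open when the count of new families per added genome does not fall toward zero; note that the counts depend on the order in which genomes are added, so studies often average over many orderings.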

  7. caBIG™ VISDA: Modeling, visualization, and discovery for cluster analysis of genomic data

    Directory of Open Access Journals (Sweden)

    Xuan Jianhua

    2008-09-01

    Background The main limitations of most existing clustering methods used in genomic data analysis include heuristic or random algorithm initialization, the potential of finding poor local optima, the lack of cluster number detection, an inability to incorporate prior/expert knowledge, black-box and non-adaptive designs, in addition to the curse of dimensionality and the discernment of uninformative, uninteresting cluster structure associated with confounding variables. Results In an effort to partially address these limitations, we develop the VIsual Statistical Data Analyzer (VISDA) for cluster modeling, visualization, and discovery in genomic data. VISDA performs progressive, coarse-to-fine (divisive) hierarchical clustering and visualization, supported by hierarchical mixture modeling, supervised/unsupervised informative gene selection, supervised/unsupervised data visualization, and user/prior knowledge guidance, to discover hidden clusters within complex, high-dimensional genomic data. The hierarchical visualization and clustering scheme of VISDA uses multiple local visualization subspaces (one at each node of the hierarchy) and consequent subspace data modeling to reveal both global and local cluster structures in a "divide and conquer" scenario. Multiple projection methods, each sensitive to a distinct type of clustering tendency, are used for data visualization, which increases the likelihood that cluster structures of interest are revealed. Initialization of the full dimensional model is based on first learning models with user/prior knowledge guidance on data projected into the low-dimensional visualization spaces. Model order selection for the high dimensional data is accomplished by Bayesian theoretic criteria and user justification applied via the hierarchy of low-dimensional visualization subspaces. Based on its complementary building blocks and flexible functionality, VISDA is generally applicable for gene clustering, sample

  8. Comparative genomic analysis of carbon and nitrogen assimilation mechanisms in three indigenous bioleaching bacteria: predictions and validations

    Directory of Open Access Journals (Sweden)

    Ehrenfeld Nicole

    2008-12-01

    Background Carbon and nitrogen fixation are essential pathways for autotrophic bacteria living in extreme environments. These bacteria can use carbon dioxide directly from the air as their sole carbon source and can use different sources of nitrogen such as ammonia, nitrate, nitrite, or even nitrogen from the air. To have a better understanding of how these processes occur and to determine how we can make them more efficient, a comparative genomic analysis of three bioleaching bacteria isolated from mine sites in Chile was performed. This study demonstrated that there are important differences in the carbon dioxide and nitrogen fixation mechanisms among bioleaching bacteria that coexist in mining environments. Results In this study, we showed that both Acidithiobacillus ferrooxidans and Acidithiobacillus thiooxidans incorporate CO2 via the Calvin-Benson-Bassham cycle; however, the former bacterium has two copies of the Rubisco type I gene whereas the latter has only one copy. In contrast, we demonstrated that Leptospirillum ferriphilum utilizes the reductive tricarboxylic acid cycle for carbon fixation. Although all the species analyzed in our study can incorporate ammonia by an ammonia transporter, we demonstrated that Acidithiobacillus thiooxidans could also assimilate nitrate and nitrite but only Acidithiobacillus ferrooxidans could fix nitrogen directly from the air. Conclusion The current study utilized genomic and molecular evidence to verify carbon and nitrogen fixation mechanisms for three bioleaching bacteria and provided an analysis of the potential regulatory pathways and functional networks that control carbon and nitrogen fixation in these microorganisms.

  9. Development and characterization of genomic and expressed SSRs in citrus by genome-wide analysis.

    Directory of Open Access Journals (Sweden)

    Sheng-Rui Liu

    Microsatellites or simple sequence repeats (SSRs) are one of the most popular sources of genetic markers and play a significant role in plant genetics and breeding. In this study, we identified citrus SSRs in the genome of Clementine mandarin and analyzed their frequency and distribution in different genomic regions. A total of 80,708 SSRs were detected in the genome with an overall density of 268 SSRs/Mb. While di-nucleotide repeats were the most frequent microsatellites in genomic DNA sequence, tetra-nucleotides, which had more repeat units than any other SSR types, had the highest cumulative sequence length. We identified 6,834 transcripts as containing 8,989 SSRs in 33,929 Clementine mandarin transcripts, among which tri-nucleotide motifs (36.0%) were the most common, followed by di-nucleotide (26.9%) and hexa-nucleotide motifs (15.1%). The motif AG (16.7%) was most abundant among these SSRs, while motifs AAG (6.6%), AAT (5.0%), and TAG (2.2%) were most common among tri-nucleotides. Functional categorization of transcripts containing SSRs revealed that 5,879 (86.0%) of such transcripts had homology with known proteins; GO and KEGG annotation revealed that transcripts containing SSRs were implicated in diverse biological processes in plants, including binding, development, transcription, and protein degradation. When 27 genomic and 78 randomly selected SSRs were tested on Clementine mandarin, 95 SSRs revealed polymorphism. These 95 SSRs were further deployed on 18 genotypes of the three genera of Rutaceae for genetic diversity assessment; genomic SSRs generally showed low transferability in comparison to SSRs developed from expressed sequences. These transcript-markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in citrus, such as diversity study, QTL mapping, molecular breeding, comparative mapping and other genetic analyses.
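
    The counts and percentages in this abstract can be related to each other with a little arithmetic. The sketch below uses only figures quoted in the abstract; the derived values (the implied length of sequence scanned and the share of transcripts carrying an SSR) are back-of-envelope inferences, not numbers reported by the authors, and the variable names are hypothetical.

```python
# Figures quoted in the abstract.
total_genomic_ssrs = 80_708
density_per_mb = 268            # SSRs per megabase
ssr_transcripts = 6_834
all_transcripts = 33_929
annotated_ssr_transcripts = 5_879

# Derived quantities (inferences only).
implied_scanned_mb = total_genomic_ssrs / density_per_mb                   # about 301 Mb
share_of_transcripts_with_ssr = ssr_transcripts / all_transcripts          # about 20%
share_with_known_homology = annotated_ssr_transcripts / ssr_transcripts    # about 86%
```

    The last ratio reproduces the 86.0% stated in the abstract, which suggests the quoted counts are internally consistent.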

  10. Full-length genomic analysis of korean porcine sapelovirus strains

    DEFF Research Database (Denmark)

    Son, Kyu-Yeol; Kim, Deok-Song; Kwon, Joseph;

    2014-01-01

    structural features of PSV genomes, the full-length nucleotide sequences of three Korean PSV strains were determined and analyzed using bioinformatic techniques in comparison with other known PSV strains. The Korean PSV genomes ranged from 7,542 to 7,566 nucleotides excluding the 3' poly(A) tail, and showed...

  11. Analysis of copy number variation in the bovine genome

    Science.gov (United States)

    We initiated a systematic study of the copy number variation (CNV) within the Bovine HapMap cattle population using array comparative genomic hybridization (array CGH). Oligonucleotide CGH arrays were designed and fabricated to provide a genome-wide coverage with an average interval of 6 kb using t...

  12. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri; Lapidus, Alla; Goltsman, Eugene; Chu, Lien; Fonstein, Michael; Ehrlich, S. Dusko; Overbeek, Ross; Kyrpides, Nikos; Ivanova, Natalia

    2005-09-14

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis subsp. israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-layer proteins suggesting differences in their phenotype were identified. The B. cereus group has signal transduction systems including a tyrosine kinase related to two-component system histidine kinases from B. subtilis. A model for regulation of the stress responsive sigma factor sigmaB in the B. cereus group different from the well studied regulation in B. subtilis has been proposed. Despite a high degree of chromosomal synteny among these genomes, significant differences in cell wall and spore coat proteins that contribute to the survival and adaptation in specific hosts have been identified.

  13. A comprehensive 1000 Genomes-based genome-wide association meta-analysis of coronary artery disease

    Science.gov (United States)

    Kyriakou, Theodosios; Nelson, Christopher P; Hopewell, Jemma C; Webb, Thomas R; Zeng, Lingyao; Dehghan, Abbas; Alver, Maris; Armasu, Sebastian M; Auro, Kirsi; Bjonnes, Andrew; Chasman, Daniel I; Chen, Shufeng; Ford, Ian; Franceschini, Nora; Gieger, Christian; Grace, Christopher; Gustafsson, Stefan; Huang, Jie; Hwang, Shih-Jen; Kim, Yun Kyoung; Kleber, Marcus E; Lau, King Wai; Lu, Xiangfeng; Lu, Yingchang; Lyytikäinen, Leo-Pekka; Mihailov, Evelin; Morrison, Alanna C; Pervjakova, Natalia; Qu, Liming; Rose, Lynda M; Salfati, Elias; Saxena, Richa; Scholz, Markus; Smith, Albert V; Tikkanen, Emmi; Uitterlinden, Andre; Yang, Xueli; Zhang, Weihua; Zhao, Wei; de Andrade, Mariza; de Vries, Paul S; van Zuydam, Natalie R; Anand, Sonia S; Bertram, Lars; Beutner, Frank; Dedoussis, George; Frossard, Philippe; Gauguier, Dominique; Goodall, Alison H; Gottesman, Omri; Haber, Marc; Han, Bok-Ghee; Huang, Jianfeng; Jalilzadeh, Shapour; Kessler, Thorsten; König, Inke R; Lannfelt, Lars; Lieb, Wolfgang; Lind, Lars; Lindgren, Cecilia M; Lokki, Marja-Liisa; Magnusson, Patrik K; Mallick, Nadeem H; Mehra, Narinder; Meitinger, Thomas; Memon, Fazal-ur-Rehman; Morris, Andrew P; Nieminen, Markku S; Pedersen, Nancy L; Peters, Annette; Rallidis, Loukianos S; Rasheed, Asif; Samuel, Maria; Shah, Svati H; Sinisalo, Juha; Stirrups, Kathleen E; Trompet, Stella; Wang, Laiyuan; Zaman, Khan S; Ardissino, Diego; Boerwinkle, Eric; Borecki, Ingrid B; Bottinger, Erwin P; Buring, Julie E; Chambers, John C; Collins, Rory; Cupples, L Adrienne; Danesh, John; Demuth, Ilja; Elosua, Roberto; Epstein, Stephen E; Esko, Tõnu; Feitosa, Mary F; Franco, Oscar H; Franzosi, Maria Grazia; Granger, Christopher B; Gu, Dongfeng; Gudnason, Vilmundur; Hall, Alistair S; Hamsten, Anders; Harris, Tamara B; Hazen, Stanley L; Hengstenberg, Christian; Hofman, Albert; Ingelsson, Erik; Iribarren, Carlos; Jukema, J Wouter; Karhunen, Pekka J; Kim, Bong-Jo; Kooner, Jaspal S; Kullo, Iftikhar J; Lehtimäki, Terho; Loos, Ruth J F; Melander, 
Olle; Metspalu, Andres; März, Winfried; Palmer, Colin N; Perola, Markus; Quertermous, Thomas; Rader, Daniel J; Ridker, Paul M; Ripatti, Samuli; Roberts, Robert; Salomaa, Veikko; Sanghera, Dharambir K; Schwartz, Stephen M; Seedorf, Udo; Stewart, Alexandre F; Stott, David J; Thiery, Joachim; Zalloua, Pierre A; O’Donnell, Christopher J; Reilly, Muredach P; Assimes, Themistocles L; Thompson, John R; Erdmann, Jeanette; Clarke, Robert; Watkins, Hugh; Kathiresan, Sekar; McPherson, Ruth; Deloukas, Panos; Schunkert, Heribert; Samani, Nilesh J; Farrall, Martin

    2015-01-01

    Existing knowledge of genetic variants affecting risk of coronary artery disease (CAD) is largely based on genome-wide association studies (GWAS) analysis of common SNPs. Leveraging phased haplotypes from the 1000 Genomes Project, we report a GWAS meta-analysis of 185 thousand CAD cases and controls, interrogating 6.7 million common (MAF>0.05) as well as 2.7 million low frequency (0.005<MAF<0.05) variants. ... analysis provides a comprehensive survey of the fine genetic architecture of CAD showing that genetic susceptibility to this common disease is largely determined by common SNPs of small effect size. PMID:26343387

  14. In silico comparative genomic analysis of GABAA receptor transcriptional regulation

    Directory of Open Access Journals (Sweden)

    Joyce Christopher J

    2007-06-01

    Full Text Available. Background: Subtypes of the GABAA receptor subunit exhibit diverse temporal and spatial expression patterns. In silico comparative analysis was used to predict transcriptional regulatory features in individual mammalian GABAA receptor subunit genes, and to identify potential transcriptional regulatory components involved in the coordinate regulation of the GABAA receptor gene clusters. Results: Previously unreported putative promoters were identified for the β2, γ1, γ3, ε, θ and π subunit genes. Putative core elements and proximal transcriptional factors were identified within these predicted promoters, and within the experimentally determined promoters of other subunit genes. Conserved intergenic regions of sequence in the mammalian GABAA receptor gene cluster comprising the α1, β2, γ2 and α6 subunits were identified as potential long range transcriptional regulatory components involved in the coordinate regulation of these genes. A region of predicted DNase I hypersensitive sites within the cluster may contain transcriptional regulatory features coordinating gene expression. A novel model is proposed for the coordinate control of the gene cluster and parallel expression of the α1 and β2 subunits, based upon the selective action of putative Scaffold/Matrix Attachment Regions (S/MARs). Conclusion: The putative regulatory features identified by genomic analysis of GABAA receptor genes were substantiated by cross-species comparative analysis and now require experimental verification. The proposed model for the coordinate regulation of genes in the cluster accounts for the head-to-head orientation and parallel expression of the α1 and β2 subunit genes, and for the disruption of transcription caused by insertion of a neomycin gene in the close vicinity of the α6 gene, which is proximal to a putative critical S/MAR.

  15. BALSA: integrated secondary analysis for whole-genome and whole-exome sequencing, accelerated by GPU

    Directory of Open Access Journals (Sweden)

    Ruibang Luo

    2014-06-01

    Full Text Available. This paper reports an integrated solution, called BALSA, for the secondary analysis of next generation sequencing data; it exploits the computational power of GPU and intricate memory management to give a fast and accurate analysis. From raw reads to variants (including SNPs and Indels), BALSA, using just a single computing node with a commodity GPU board, takes 5.5 h to process 50-fold whole genome sequencing (∼750 million 100 bp paired-end reads), or just 25 min for 210-fold whole exome sequencing. BALSA's speed is rooted in its parallel algorithms, which effectively exploit a GPU to speed up processes like alignment, realignment and statistical testing. BALSA incorporates a 16-genotype model to support the calling of SNPs and Indels and achieves competitive variant calling accuracy and sensitivity when compared to the ensemble of six popular variant callers. BALSA also supports efficient identification of somatic SNVs and CNVs; experiments showed that BALSA recovers all the previously validated somatic SNVs and CNVs, and it is more sensitive for somatic Indel detection. BALSA outputs variants in VCF format. A pileup-like SNAPSHOT format, while maintaining the same fidelity as BAM in variant calling, enables efficient storage and indexing, and facilitates app development for downstream analyses. BALSA is available at: http://sourceforge.net/p/balsa.
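
    The depth figure quoted in this abstract can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it assumes that "750 million paired-end reads" means 750 million read pairs and that the human genome is roughly 3.0e9 bp, neither of which is stated in the record. The function name is hypothetical.

```python
def mean_coverage(read_pairs: float, read_length_bp: int, target_size_bp: float) -> float:
    """Mean sequencing depth = total sequenced bases / size of the target."""
    # Each pair contributes two mates of read_length_bp bases.
    total_bases = read_pairs * 2 * read_length_bp
    return total_bases / target_size_bp


# Assumptions (not from the abstract): 750e6 is a count of pairs,
# and the genome size is about 3.0e9 bp.
depth = mean_coverage(read_pairs=750e6, read_length_bp=100, target_size_bp=3.0e9)
print(f"approximate depth: {depth:.0f}x")
```

    Under these assumptions the result agrees with the "50-fold" figure; reading 750 million as individual reads rather than pairs would halve it.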

  16. Genomic organization and sequence analysis of the vomeronasal receptor V2R genes in mouse genome

    Institute of Scientific and Technical Information of China (English)

    YANG Hui; Zhang YaPing

    2007-01-01

    Two multigene superfamilies, named V1R and V2R, encoding seven-transmembrane-domain G-protein coupled receptors (GPCRs) have been identified as pheromone receptors in mammals. Three V2R gene families have been described in mouse and rat. Here we screened the updated mouse genome sequence database and retrieved 63 putative functional V2R genes, including three newly identified genes that formed an additional family. We described the genomic organization of these genes and also characterized the conservation of mouse V2R protein sequences. This genomic and sequence information is useful as part of the evidence for inferring the functional domains of V2Rs and should aid future functional studies.

  17. Comparative Analysis of Fatty Acid Desaturases in Cyanobacterial Genomes

    Directory of Open Access Journals (Sweden)

    Xiaoyuan Chi

    2008-01-01

    Full Text Available. Fatty acid desaturases are enzymes that introduce double bonds into the hydrocarbon chains of fatty acids. The fatty acid desaturases from 37 cyanobacterial genomes were identified and classified based upon their conserved histidine-rich motifs and phylogenetic analysis, which helps to determine the amounts and distributions of desaturases in cyanobacterial species. The filamentous or N2-fixing cyanobacteria usually possess more types of fatty acid desaturases than unicellular species. The pathway of acyl-lipid desaturation for the unicellular marine cyanobacteria Synechococcus and Prochlorococcus differs from that of other cyanobacteria, indicating that the phylogenetic histories of the two genera differ from those of other cyanobacteria isolated from freshwater, soil, or symbionts. Strain Gloeobacter violaceus PCC 7421 was isolated from calcareous rock and lacks thylakoid membranes. The types and amounts of desaturases of this strain are distinct from those of other cyanobacteria, reflecting its earliest divergence from the cyanobacterial line. Three thermophilic unicellular strains, Thermosynechococcus elongatus BP-1 and two Synechococcus Yellowstone species, lack highly unsaturated fatty acids in lipids and contain only one Δ9 desaturase, in contrast with mesophilic strains, which is probably due to their thermal habitats. Thus, the amounts and types of fatty acid desaturases vary among cyanobacterial species, which may result from adaptation to environments during evolution.

  18. Genome-wide analysis of TCP family in tobacco.

    Science.gov (United States)

    Chen, L; Chen, Y Q; Ding, A M; Chen, H; Xia, F; Wang, W F; Sun, Y H

    2016-01-01

    The TCP family is a transcription factor family, members of which are extensively involved in plant growth and development as well as in signal transduction in the response against many physiological and biochemical stimuli. In the present study, 61 TCP genes were identified in the tobacco (Nicotiana tabacum) genome. Bioinformatic methods were employed to predict and analyze the gene structure, gene expression, phylogeny, and conserved domains of TCP proteins in tobacco. The 61 NtTCP genes were divided into three diverse groups, based on the division of TCP genes in tomato and Arabidopsis, and the results of the conserved domain and sequence analyses further confirmed the classification of the NtTCP genes. The expression pattern of the NtTCP genes also demonstrated that the majority of these genes play important roles in all tissues, while some genes exercise their functions only in specific tissues. In brief, the comprehensive and thorough study of the TCP family in other plants provides sufficient resources for studying the structure and functions of TCPs in tobacco. PMID:27323069

  19. Functional genomic analysis of cassava proteins with TIR domains

    International Nuclear Information System (INIS)

    Proteins containing a TIR (toll interleukin receptor) domain are involved in plant and animal immunity. The aim of this work was to carry out an overall genomic analysis of cassava proteins with a TIR domain and discern their possible role in resistance to cassava bacterial blight. In total, 46 proteins with a TIR domain were identified in the cassava proteome and were classed in four categories according to the presence or absence of other domains: TIR (T), TIR-NB (TN), TIR-LRR (TL) and TIR-NB-LRR (TNL). Of these 46 proteins, 56.6% have TIR, NB and LRR domains. Using multiple alignments, it was possible to demonstrate that not all cassava TIR domains contain the AE region, which is involved in dimerization and activation of immune responses. Three of the four protein categories (T, TNL and TN) presented a higher number of synonymous substitutions, suggesting that they are not involved in the recognition process. Two TIR domains lacking the AE region were analyzed by yeast two-hybrid assays and by agro-infiltration, which showed that both are able to form homo- and heterodimers but do not trigger defense responses. From this study it was possible to conclude that TIR domains can function as adaptors in signal transduction with other resistance proteins. In addition, it became clear that the AE region is not always important for TIR dimerization, but it seems necessary to activate defense response signals.

  20. Connecting Genomic Alterations to Cancer Biology with Proteomics: The NCI Clinical Proteomic Tumor Analysis Consortium

    Energy Technology Data Exchange (ETDEWEB)

    Ellis, Matthew; Gillette, Michael; Carr, Steven A.; Paulovich, Amanda G.; Smith, Richard D.; Rodland, Karin D.; Townsend, Reid; Kinsinger, Christopher; Mesri, Mehdi; Rodriguez, Henry; Liebler, Daniel

    2013-10-03

    The National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium is applying the latest generation of proteomic technologies to genomically annotated tumors from The Cancer Genome Atlas (TCGA) program, a joint initiative of the NCI and the National Human Genome Research Institute. By providing a fully integrated accounting of DNA, RNA, and protein abnormalities in individual tumors, these datasets will illuminate the complex relationship between genomic abnormalities and cancer phenotypes, thus producing biologic insights as well as a wave of novel candidate biomarkers and therapeutic targets amenable to verification using targeted mass spectrometry methods.

  1. Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index

    DEFF Research Database (Denmark)

    Minster, Ryan L; Sanders, Jason L; Singh, Jatinder;

    2015-01-01

    BACKGROUND: The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. METHODS: We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted for......: There were no genome-wide significant findings from the genome-wide association study; however, several single-nucleotide polymorphisms near ZNF704 on chromosome 8q21.13 were suggestively associated with the HAI in the Long Life Family Study (p < 10^-6) and nominally replicated in the Cardiovascular...

  2. Meta-Analysis of Genome-Wide Association Studies of Attention-Deficit/Hyperactivity Disorder

    Science.gov (United States)

    Neale, Benjamin M.; Medland, Sarah E.; Ripke, Stephan; Asherson, Philip; Franke, Barbara; Lesch, Klaus-Peter; Faraone, Stephen V.; Nguyen, Thuy Trang; Schafer, Helmut; Holmans, Peter; Daly, Mark; Steinhausen, Hans-Christoph; Freitag, Christine; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Walitza, Susanne; Warnke, Andreas; Meyer, Jobst; Palmason, Haukur; Buitelaar, Jan; Vasquez, Alejandro Arias; Lambregts-Rommelse, Nanda; Gill, Michael; Anney, Richard J. L.; Langely, Kate; O'Donovan, Michael; Williams, Nigel; Owen, Michael; Thapar, Anita; Kent, Lindsey; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph; Doyle, Alysa; Smalley, Susan; Loo, Sandra; Hakonarson, Hakon; Elia, Josephine; Todorov, Alexandre; Miranda, Ana; Mulas, Fernando; Ebstein, Richard P.; Rothenberger, Aribert; Banaschewski, Tobias; Oades, Robert D.; Sonuga-Barke, Edmund; McGough, James; Nisenbaum, Laura; Middleton, Frank; Hu, Xiaolan; Nelson, Stan

    2010-01-01

    Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. As prior genome-wide association studies (GWAS) have not yielded significant results, we conducted a meta-analysis of…

  3. Genome Sequence Analysis of Mycoplasma sp. HU2014, Isolated from Tissue Culture

    OpenAIRE

    Calcutt, Michael J.; Szikriszt, Bernadett; Póti, Ádám; Molnár, János; Gervai, Judit Z.; Tusnády, Gábor E.; Foecking, Mark F.; Szüts, Dávid

    2015-01-01

    The draft genome sequence of a novel Mycoplasma strain, designated Mycoplasma sp. HU2014, has been determined. The genome comprises 1,084,927 nucleotides and was obtained from a mycoplasma-infected culture of chicken DT40 cells. Phylogenetic analysis places this taxon in a group comprising the closely related species Mycoplasma yeatsii and Mycoplasma cottewii.

  4. Dissection of genomic correlation matrices of US Holsteins using multivariate factor analysis

    Science.gov (United States)

    The aim of the study was to compare correlation matrices between direct genomic predictions for 31 production, fitness and conformation traits at both the genomic and chromosomal level in US Holstein bulls. Multivariate factor analysis was used to quantify basic features of correlation matrices. Factor extr...

  5. Comparative genomics in chicken and Pekin duck using FISH mapping and microarray analysis

    Directory of Open Access Journals (Sweden)

    Fowler Katie E

    2009-08-01

    Full Text Available. Background: The availability of the complete chicken (Gallus gallus) genome sequence, as well as a large number of chicken probes for fluorescent in-situ hybridization (FISH) and microarray resources, facilitates comparative genomic studies between chicken and other bird species. In a previous study, we provided a comprehensive cytogenetic map for the turkey (Meleagris gallopavo) and the first analysis of copy number variants (CNVs) in birds. Here, we extend this approach to the Pekin duck (Anas platyrhynchos), an obvious target for comparative genomic studies due to its agricultural importance and resistance to avian flu. Results: We provide a detailed molecular cytogenetic map of the duck genome through FISH assignment of 155 chicken clones. We identified one inter- and six intrachromosomal rearrangements between chicken and duck macrochromosomes and demonstrated conserved synteny among all microchromosomes analysed. Array comparative genomic hybridisation revealed 32 CNVs, of which 5 overlap previously designated "hotspot" regions between chicken and turkey. Conclusion: Our results suggest extensive conservation of avian genomes across 90 million years of evolution in both macro- and microchromosomes. The data on CNVs between chicken and duck extend previous analyses in chicken and turkey and support the hypotheses that avian genomes contain fewer CNVs than mammalian genomes and that genomes of evolutionarily distant species share regions of copy number variation ("CNV hotspots"). Our results will expedite duck genomics, assist marker development and highlight areas of interest for future evolutionary and functional studies.

  6. Comparative analysis of catfish BAC end sequences with the zebrafish genome

    Directory of Open Access Journals (Sweden)

    Abernathy Jason

    2009-12-01

    Full Text Available. Background: Comparative mapping is a powerful tool to transfer genomic information from sequenced genomes to closely related species for which whole genome sequence data are not yet available. However, such an approach is still very limited in catfish, the most important aquaculture species in the United States. This project was initiated to generate additional BAC end sequences and demonstrate their applications in comparative mapping in catfish. Results: We reported the generation of 43,000 BAC end sequences and their applications for comparative genome analysis in catfish. Using these and the additional 20,000 existing BAC end sequences as a resource, along with linkage mapping and the existing physical map, conserved syntenic regions were identified between the catfish and zebrafish genomes. A total of 10,943 catfish BAC end sequences (17.3%) had significant BLAST hits to the zebrafish genome (cutoff value ≤ e-5), of which 3,221 were unique gene hits, providing a platform for comparative mapping based on locations of these genes in catfish and zebrafish. Genetic linkage mapping of microsatellites associated with contigs allowed identification of large conserved genomic segments and construction of super scaffolds. Conclusion: BAC end sequences and their associated polymorphic markers are great resources for comparative genome analysis in catfish. Highly conserved chromosomal regions were identified between catfish and zebrafish. However, it appears that the level of conservation at local genomic regions is high, while a high level of chromosomal shuffling and rearrangement exists between the catfish and zebrafish genomes. Orthologous regions established through comparative analysis should facilitate both structural and functional genome analysis in catfish.

  7. Secure distributed genome analysis for GWAS and sequence comparison computation

    Science.gov (United States)

    2015-01-01

    Background: The rapid increase in the availability and volume of genomic data makes significant advances in biomedical research possible, but sharing of genomic data poses challenges due to the highly sensitive nature of such data. To address the challenges, a competition for secure distributed processing of genomic data was organized by the iDASH research center. Methods: In this work we propose techniques for securing computation with real-life genomic data for minor allele frequency and chi-squared statistics computation, as well as distance computation between two genomic sequences, as specified by the iDASH competition tasks. We put forward novel optimizations, including a generalization of a version of mergesort, which might be of independent interest. Results: We provide implementation results of our techniques based on secret sharing that demonstrate practicality of the suggested protocols and also report on performance improvements due to our optimization techniques. Conclusions: This work describes our techniques, findings, and experimental results developed and obtained as part of the iDASH 2015 research competition to secure real-life genomic computations and shows feasibility of securely computing with genomic data in practice. PMID:26733307
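
    To make the idea of secret-sharing-based aggregation concrete, here is a minimal sketch. It is not the authors' protocol: it uses plain additive secret sharing over a prime field to pool allele counts from several data owners, then computes the minor allele frequency and a Pearson chi-squared statistic on the pooled totals in the clear. The modulus, function names, and example counts are all illustrative assumptions.

```python
import secrets

PRIME = 2**61 - 1  # illustrative field modulus, not taken from the paper


def share(value: int, n_parties: int) -> list[int]:
    """Split value into n additive shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(private_values: list[int]) -> int:
    """Pool private counts: only the total is ever reconstructed."""
    n = len(private_values)
    all_shares = [share(v, n) for v in private_values]
    # Party j adds up the j-th share it received from every owner.
    partial = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    return sum(partial) % PRIME


def minor_allele_frequency(alt_count: int, total_alleles: int) -> float:
    """Frequency of the less common allele at a biallelic site."""
    freq = alt_count / total_alleles
    return min(freq, 1.0 - freq)


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-squared for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


# Hypothetical example: three sites each hold 100 alleles at one SNP.
pooled_alt = secure_sum([30, 45, 25])
print(minor_allele_frequency(pooled_alt, 300))
# Hypothetical case/control table: alt and ref allele counts per group.
print(chi_squared_2x2(60, 40, 40, 60))
```

    A real protocol would also keep the division and the chi-squared computation inside the secure computation; this sketch protects only the per-site counts.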

  8. Carotenoid biosynthetic genes in Brassica rapa: comparative genomic analysis, phylogenetic analysis, and expression profiling

    OpenAIRE

    Li, Peirong; Zhang, Shujiang; Zhang, Shifan; Li, Fei; Zhang, Hui; Cheng, Feng; Wu, Jian; Wang, Xiaowu; Sun, Rifei

    2015-01-01

    Background Carotenoids are isoprenoid compounds synthesized by all photosynthetic organisms. Despite much research on carotenoid biosynthesis in the model plant Arabidopsis thaliana, there is a lack of information on the carotenoid pathway in Brassica rapa. To better understand its carotenoid biosynthetic pathway, we performed a systematic analysis of carotenoid biosynthetic genes at the genome level in B. rapa. Results We identified 67 carotenoid biosynthetic genes in B. rapa, which were ort...

  9. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  10. SmashCell: A software framework for the analysis of single-cell amplified genome sequences

    DEFF Research Database (Denmark)

    Harrington, Eoghan D; Arumugam, Manimozhiyan; Raes, Jeroen; Bork, Peer; Relman, David a

    2010-01-01

    SUMMARY: Recent advances in single-cell manipulation technology, whole genome amplification and high-throughput sequencing have now made it possible to sequence the genome of an individual cell. The bioinformatic analysis of these genomes, however, is far more complicated than the analysis of those...... - in a way that allows parameter and algorithm exploration at each step in the process. It also manages the data created by these analyses and provides visualisation methods to allow rapid analysis of the results. AVAILABILITY: The SmashCell source code and a comprehensive manual are available at http...

  11. An Alternative Methodological Approach for Cost-Effectiveness Analysis and Decision Making in Genomic Medicine.

    Science.gov (United States)

    Fragoulakis, Vasilios; Mitropoulou, Christina; van Schaik, Ron H; Maniadakis, Nikolaos; Patrinos, George P

    2016-05-01

    Genomic Medicine aims to improve therapeutic interventions and diagnostics, the quality of life of patients, but also to rationalize healthcare costs. To reach this goal, careful assessment and identification of evidence gaps for public health genomics priorities are required so that a more efficient healthcare environment is created. Here, we propose a public health genomics-driven approach to adjust the classical healthcare decision making process with an alternative methodological approach of cost-effectiveness analysis, which is particularly helpful for genomic medicine interventions. By combining classical cost-effectiveness analysis with budget constraints, social preferences, and patient ethics, we demonstrate the application of this model, the Genome Economics Model (GEM), based on a previously reported genome-guided intervention from a developing country environment. The model and the attendant rationale provide a practical guide by which all major healthcare stakeholders could ensure the sustainability of funding for genome-guided interventions, their adoption and coverage by health insurance funds, and prioritization of Genomic Medicine research, development, and innovation, given the restriction of budgets, particularly in developing countries and low-income healthcare settings in developed countries. The implications of the GEM for the policy makers interested in Genomic Medicine and new health technology and innovation assessment are also discussed. PMID:27096406

  12. CGUG: in silico proteome and genome parsing tool for the determination of "core" and unique genes in the analysis of genomes up to ca. 1.9 Mb

    Directory of Open Access Journals (Sweden)

    Mahadevan Padmanabhan

    2009-08-01

    Full Text Available. Background: Viruses and small-genome bacteria (~2 megabases and smaller) comprise a considerable population in the biosphere and are of interest to many researchers. These genomes are now sequenced at an unprecedented rate and require complementary computational tools to analyze. "CoreGenesUniqueGenes" (CGUG) is an in silico genome data mining tool that determines a "core" set of genes from two to five organisms with genomes in this size range. Core and unique genes may reflect similar niches and needs, and may be used in classifying organisms. Findings: CGUG is available at http://binf.gmu.edu/geneorder.html as a web-based on-the-fly tool that performs iterative BLASTP analyses using a reference genome and up to four query genomes to provide a table of genes common to these genomes. The result is an in silico display of genomes and their proteomes, allowing for further analysis. CGUG can be used for "genome annotation by homology", as demonstrated with Chlamydophila and Francisella genomes. Conclusion: CGUG is used to reanalyze the ICTV-based classifications of bacteriophages, to reconfirm long-standing relationships and to explore new classifications. These genomes have been problematic in the past, due largely to horizontal gene transfers. CGUG is validated as a tool for reannotating small genome bacteria using more up-to-date annotations by similarity or homology. These serve as an entry point for wet-bench experiments to confirm the functions of these "hypothetical" and "unknown" proteins.
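
    The core/unique distinction described in this abstract reduces to set operations once the similarity search has been run. The sketch below is an illustration, not CGUG's implementation: it assumes the BLASTP step has already produced, for each query genome, the set of reference genes with a significant hit. The function name, gene identifiers, and genome labels are hypothetical.

```python
def core_and_unique(
    reference_genes: set[str],
    hits_per_query: dict[str, set[str]],
) -> tuple[set[str], set[str]]:
    """Return (core, unique) reference genes given per-query hit sets.

    core   : reference genes with a significant hit in every query genome
    unique : reference genes with no significant hit in any query genome
    """
    # The abstract describes one reference plus up to four query genomes.
    if not 1 <= len(hits_per_query) <= 4:
        raise ValueError("expected between one and four query genomes")
    core = set(reference_genes)
    matched_anywhere: set[str] = set()
    for hits in hits_per_query.values():
        core &= hits               # keep only genes found in this query too
        matched_anywhere |= hits   # remember every gene found at least once
    unique = reference_genes - matched_anywhere
    return core, unique


# Hypothetical hit sets standing in for parsed BLASTP output.
reference = {"gA", "gB", "gC", "gD"}
hits = {"query1": {"gA", "gB", "gC"}, "query2": {"gA", "gC"}}
core, unique = core_and_unique(reference, hits)
print(sorted(core), sorted(unique))
```

    Genes that hit some but not all query genomes (here "gB") fall into neither set, which is why the two results are reported separately.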

  13. Comparative Genome Analysis of Lolium-Festuca Complex Species

    DEFF Research Database (Denmark)

    Czaban, Adrian; Byrne, Stephen; Sharma, Sapna;

    2015-01-01

    , winter hardiness, drought tolerance and resistance to grazing. In this study we have sequenced and assembled the low copy fraction of the genomes of Lolium westerwoldicum, Lolium multiflorum, Festuca pratensis and Lolium temulentum. We have also generated de-novo transcriptome assemblies for each species......, and these have aided in the annotation of the genomic sequence. Using this data we were able to generate annotated assemblies of the gene rich regions of the four species to complement the already sequenced Lolium perenne genome. Using these gene models we have identified orthologous genes between the species...

  14. Analysis of incorporation cadmium and chromium VI by yeast strains using neutron activation

    International Nuclear Information System (INIS)

    Industrial and agricultural activities discharge metals such as Cd, Cu, Ni, Co, Zn, Pb and Cr into the environment. These metals pollute the environment and contaminate foods which, when consumed, can damage animal and human health. Common physicochemical processes are not suitable for remediation of dilute effluents (metal concentrations up to 100 ppm) because of their high cost. However, bioremediation could be an alternative for treating these effluents. The success of a bioremediation process depends on improving the technology, based on an abundant, low-cost and effective raw material. Biological organisms remove metals through two processes: bioaccumulation (which depends on metabolism to incorporate the metal and therefore uses living cells) and biosorption (adhesion of metals to compounds present on the cell surface, so the cells can be dead). The objective of this research is to evaluate whether a yeast isolated from cachaça fermentation can be used for metal capture. Thus, we compared the capacity of this yeast, dead or alive, to incorporate cadmium and chromium VI. We used the neutron activation technique to determine the concentration of the metal incorporated by the cells. Neutron activation was an easy, rapid and suitable technique for these metal determinations in yeast cells. Living organisms need trace quantities of essential metals (Mg, Fe, Cu, Mn, Zn), but some heavy metals, such as cadmium and mercury, do not have any biological function. Heavy metal incorporation can cause cell damage such as oxidative stress. We evaluated one oxidative stress marker in live yeast that had incorporated metal: lipid peroxidation. (author)

  15. Large-scale genomic analysis suggests a neutral punctuated dynamics of transposable elements in bacterial genomes.

    Science.gov (United States)

    Iranzo, Jaime; Gómez, Manuel J; López de Saro, Francisco J; Manrubia, Susanna

    2014-06-01

    Insertion sequences (IS) are the simplest and most abundant form of transposable DNA found in bacterial genomes. When present in multiple copies, it is thought that they can promote genomic plasticity and genetic exchange, thus being a major force of evolutionary change. The main processes that determine IS content in genomes are, though, a matter of debate. In this work, we take advantage of the large amount of genomic data currently available and study the abundance distributions of 33 IS families in 1811 bacterial chromosomes. This allows us to test simple models of IS dynamics and estimate their key parameters by means of a maximum likelihood approach. We evaluate the roles played by duplication, lateral gene transfer, deletion and purifying selection. We find that the observed IS abundances are compatible with a neutral scenario where IS proliferation is controlled by deletions instead of purifying selection. Even if there may be some cases driven by selection, neutral behavior dominates over large evolutionary scales. According to this view, IS and hosts tend to coexist in a dynamic equilibrium state for most of the time. Our approach also allows for a detection of recent IS expansions, and supports the hypothesis that rapid expansions constitute transient events (punctuations) during which the state of coexistence of IS and host becomes perturbed. PMID:24967627

  16. Large-scale genomic analysis suggests a neutral punctuated dynamics of transposable elements in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Jaime Iranzo

    2014-06-01

    Insertion sequences (IS) are the simplest and most abundant form of transposable DNA found in bacterial genomes. When present in multiple copies, it is thought that they can promote genomic plasticity and genetic exchange, thus being a major force of evolutionary change. The main processes that determine IS content in genomes are, though, a matter of debate. In this work, we take advantage of the large amount of genomic data currently available and study the abundance distributions of 33 IS families in 1811 bacterial chromosomes. This allows us to test simple models of IS dynamics and estimate their key parameters by means of a maximum likelihood approach. We evaluate the roles played by duplication, lateral gene transfer, deletion and purifying selection. We find that the observed IS abundances are compatible with a neutral scenario where IS proliferation is controlled by deletions instead of purifying selection. Even if there may be some cases driven by selection, neutral behavior dominates over large evolutionary scales. According to this view, IS and hosts tend to coexist in a dynamic equilibrium state for most of the time. Our approach also allows for a detection of recent IS expansions, and supports the hypothesis that rapid expansions constitute transient events (punctuations) during which the state of coexistence of IS and host becomes perturbed.

  17. Comparative Genome Analysis Reveals Divergent Genome Size Evolution in a Carnivorous Plant Genus

    Czech Academy of Sciences Publication Activity Database

    Vu, G.T.H.; Schmutzer, T.; Bull, F.; Cao, H.X.; Fuchs, J.; Tran, T.D.; Jovtchev, G.; Pistrick, K.; Stein, N.; Pečinka, A.; Neumann, Pavel; Novák, Petr; Macas, Jiří; Dear, P.H.; Blattner, F.R.; Scholz, U.; Schubert, I.

    2015-01-01

    Vol. 8, No. 3 (2015). ISSN 1940-3372 R&D Projects: GA ČR GBP501/12/G090 Institutional support: RVO:60077344 Keywords: Genlisea; genome; repetitive sequences Subject RIV: EB - Genetics; Molecular Biology Impact factor: 3.933, year: 2014

  18. A reference genome for common bean and genome wide analysis of dual domestications

    Science.gov (United States)

    Common bean (Phaseolus vulgaris) is the single most important grain legume for human consumption and, due to its ability to fix atmospheric nitrogen via symbioses with soil-borne microorganisms, has a valuable place in sustainable agriculture. We assembled 473 Mb of the common bean genome and geneti...

  19. Analysis of high-identity segmental duplications in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Carelli Francesco N

    2011-08-01

    Background: Segmental duplications (SDs) are blocks of genomic sequence of 1-200 kb that map to different loci in a genome and share a sequence identity > 90%. SDs show at the sequence level the same characteristics as other regions of the human genome: they contain both high-copy repeats and gene sequences. SDs play an important role in genome plasticity by creating new genes and modeling genome structure. Although data is plentiful for mammals, not much was known about the representation of SDs in plant genomes. In this regard, we performed a genome-wide analysis of high-identity SDs on the sequenced grapevine (Vitis vinifera) genome (PN40024). Results: We demonstrate that recent SDs (> 94% identity and >= 10 kb in size) are a relevant component of the grapevine genome (85 Mb, 17% of the genome sequence). We detected mitochondrial and plastid DNA and genes (10% of gene annotation) in segmentally duplicated regions of the nuclear genome. In particular, the nine highest copy number genes have a copy in either or both organelle genomes. Further, we showed that several duplicated genes take part in the biosynthesis of compounds involved in plant response to environmental stress. Conclusions: These data show the great influence of SDs and organelle DNA transfers in modeling the Vitis vinifera nuclear DNA structure as well as the impact of SDs in contributing to the adaptive capacity of grapevine and the nutritional content of grape products through genome variation. This study represents a step forward in the full characterization of duplicated genes important for grapevine cultural needs and human health.

  20. Pyrosequencing-based comparative genome analysis of the nosocomial pathogen Enterococcus faecium and identification of a large transferable pathogenicity island

    Directory of Open Access Journals (Sweden)

    Bonten Marc JM

    2010-04-01

    Background: The Gram-positive bacterium Enterococcus faecium is an important cause of nosocomial infections in immunocompromised patients. Results: We present a pyrosequencing-based comparative genome analysis of seven E. faecium strains that were isolated from various sources. In the genomes of clinical isolates several antibiotic resistance genes were identified, including the vanA transposon that confers resistance to vancomycin in two strains. A functional comparison between E. faecium and the related opportunistic pathogen E. faecalis, based on differences in the presence of protein families, revealed divergence in plant carbohydrate metabolic pathways and oxidative stress defense mechanisms. The E. faecium pan-genome was estimated to be essentially unlimited in size, indicating that E. faecium can efficiently acquire and incorporate exogenous DNA in its gene pool. One of the most prominent sources of genomic diversity consists of bacteriophages that have integrated in the genome. The CRISPR-Cas system, which contributes to immunity against bacteriophage infection in prokaryotes, is not present in the sequenced strains. Three sequenced isolates carry the esp gene, which is involved in urinary tract infections and biofilm formation. The esp gene is located on a large pathogenicity island (PAI), which is between 64 and 104 kb in size. Conjugation experiments showed that the entire esp PAI can be transferred horizontally and inserts in a site-specific manner. Conclusions: Genes involved in environmental persistence, colonization and virulence can easily be acquired by E. faecium. This will make the development of successful treatment strategies targeted against this organism a challenge for years to come.

  1. Analysis by X-Ray images of EVA waste incorporated in Portland Cement

    International Nuclear Information System (INIS)

    EVA is a copolymer used by the Brazilian footwear industry. The material is cut for the manufacture of insoles, an operation that generates about 18% waste. EVA waste can be reused by incorporating it into Portland cement for construction without structural purposes. The aim of this work is to use X-ray images to assess the spatial distribution of the waste in the cement and to evaluate the use of this methodology. Cylindrical specimens were produced according to the ABNT NBR 5738 standard. The sand-to-cement volume ratio was 3:1, and 10% and 30% of waste was incorporated into the cement specimens. X-ray images of the cylindrical specimens were obtained in frontal projection. The images showed that the distribution of the waste is homogeneous, as intended for this type of incorporation, which can provide uniformity in compressive strength test results. (author)

  2. Microsporidian genome analysis reveals evolutionary strategies for obligate intracellular growth

    Science.gov (United States)

    Microsporidia comprise a large phylum of obligate intracellular eukaryotes that are fungal-related parasites responsible for widespread disease, and here we address questions about microsporidia biology and evolution. We sequenced three microsporidian genomes from two species, Nematocida parisii and...

  3. Sequence analysis of the complete mitochondrial genome of Youxian sheldrake.

    Science.gov (United States)

    He, Shao-Ping; Liu, Li-Li; Yu, Qi-Fang; Li, Si; He, Jian-Hua

    2016-01-01

    The Youxian sheldrake is an excellent native breed of Hunan province in China. The complete mitochondrial (mt) genome sequence plays an important role in the accurate determination of phylogenetic relationships among metazoans. This is the first study to determine the complete mitochondrial genome sequence of Youxian sheldrake using PCR-based amplification and Sanger sequencing. The characteristics of the entire mitochondrial genome were analyzed in detail. The total length of the mitogenome is 16,605 bp, with a base composition of 29.21% A, 22.18% T, 32.84% C and 15.77% G. It contains 2 ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and a major non-coding control region (D-loop region). The complete mitochondrial genome sequence of Youxian sheldrake provides important data for further study of the phylogenetics of poultry, as well as data for genetics and breeding. PMID:25090395
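
    As a small worked example of the base-composition figures in this abstract, the sketch below computes percentages from a sequence string and derives the GC and AT content from the four reported values. The function name and the handling of ambiguous bases (they are ignored) are illustrative assumptions, not part of the study.

```python
from collections import Counter

def base_composition(seq):
    """Return the percentage of A, T, C and G among unambiguous bases."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ATCG")
    if total == 0:
        raise ValueError("sequence contains no A/T/C/G bases")
    return {b: 100.0 * counts[b] / total for b in "ATCG"}

# Percentages reported in the abstract for the 16,605 bp mitogenome.
reported = {"A": 29.21, "T": 22.18, "C": 32.84, "G": 15.77}
gc_content = reported["G"] + reported["C"]   # about 48.61 %
at_content = reported["A"] + reported["T"]   # about 51.39 %
```

    The four reported percentages sum to 100%, and the AT content slightly exceeds the GC content.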

  4. Integrated proteomic and genomic analysis of colorectal cancer

    Science.gov (United States)

    Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro

  5. BATCH-GE: Batch analysis of Next-Generation Sequencing data for genome editing assessment

    Science.gov (United States)

    Boel, Annekatrien; Steyaert, Woutert; De Rocker, Nina; Menten, Björn; Callewaert, Bert; De Paepe, Anne; Coucke, Paul; Willaert, Andy

    2016-01-01

    Targeted mutagenesis by the CRISPR/Cas9 system is currently revolutionizing genetics. The ease of this technique has enabled genome engineering in vitro and in a range of model organisms and has pushed experimental dimensions to unprecedented proportions. Due to its tremendous progress in terms of speed, read length, throughput and cost, Next-Generation Sequencing (NGS) has been increasingly used for the analysis of CRISPR/Cas9 genome editing experiments. However, the current tools for genome editing assessment lack flexibility and fall short in the analysis of large amounts of NGS data. Therefore, we designed BATCH-GE, an easy-to-use bioinformatics tool for batch analysis of NGS-generated genome editing data, available from https://github.com/WouterSteyaert/BATCH-GE.git. BATCH-GE detects and reports indel mutations and other precise genome editing events and calculates the corresponding mutagenesis efficiencies for a large number of samples in parallel. Furthermore, this new tool provides flexibility by allowing the user to adapt a number of input variables. The performance of BATCH-GE was evaluated in two genome editing experiments, aiming to generate knock-out and knock-in zebrafish mutants. This tool will not only contribute to the evaluation of CRISPR/Cas9-based experiments, but will be of use in any genome editing experiment and has the ability to analyze data from any organism with a sequenced genome. PMID:27461955
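
    To make the notion of a mutagenesis efficiency concrete, here is a minimal sketch of one way such a figure could be computed. It is not BATCH-GE's code or algorithm. It assumes reads are already aligned, given as hypothetical `(position, CIGAR)` pairs with 1-based leftmost positions, and pre-filtered so that every read spans the target window; the efficiency is taken as the percentage of reads carrying an insertion or deletion that touches the window.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def has_indel_in_window(pos, cigar, win_start, win_end):
    """True if an aligned read (1-based leftmost position `pos`) carries an
    insertion or deletion touching the reference window [win_start, win_end]."""
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "I":
            # An insertion sits just before reference base `ref` and consumes none.
            if win_start <= ref <= win_end + 1:
                return True
        elif op == "D":
            # A deletion removes reference bases ref .. ref+length-1.
            if ref <= win_end and ref + length - 1 >= win_start:
                return True
            ref += length
        elif op in "MN=X":
            ref += length
        # S, H and P do not consume reference bases.
    return False

def mutagenesis_efficiency(reads, win_start, win_end):
    """Percentage of reads with an indel in the window (0.0 for no reads)."""
    if not reads:
        return 0.0
    edited = sum(has_indel_in_window(p, c, win_start, win_end) for p, c in reads)
    return 100.0 * edited / len(reads)
```

    For example, with the four hypothetical reads `[(100, "50M"), (100, "20M2D28M"), (100, "25M3I22M"), (100, "5M1D44M")]` and the window 115 to 130, two reads carry an indel inside the window, so the function returns 50.0.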

  7. First fungal genome sequence from Africa: A preliminary analysis

    Directory of Open Access Journals (Sweden)

    Rene Sutherland

    2012-01-01

    Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the 'tsunami' of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is fewer than in the related species Fusarium oxysporum, but more than in Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumonisin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists

  8. Rapid mass spectrometric analysis of 15N-Leu incorporation fidelity during preparation of specifically labeled NMR samples

    DEFF Research Database (Denmark)

    Truhlar, Stephanie M E; Cervantes, Carla F; Torpey, Justin W;

    2008-01-01

    Advances in NMR spectroscopy have enabled the study of larger proteins that typically have significant overlap in their spectra. Specific (15)N-amino acid incorporation is a powerful tool for reducing spectral overlap and attaining reliable sequential assignments. However, scrambling of the label during protein expression is a common problem. We describe a rapid method to evaluate the fidelity of specific (15)N-amino acid incorporation. The selectively labeled protein is proteolyzed, and the resulting peptides are analyzed using MALDI mass spectrometry. The (15)N incorporation is determined by analyzing the isotopic abundance of the peptides in the mass spectra using the program DEX. This analysis determined that expression with a 10-fold excess of unlabeled amino acids relative to the (15)N-amino acid prevents the scrambling of the (15)N label that is observed when equimolar amounts are used...

  9. Genome sequence and comparative analysis of Avibacterium paragallinarum

    OpenAIRE

    Requena, David; Chumbe, Ana; Torres, Michael; Alzamora, Ofelia; Ramirez, Manuel; Valdivia-Olarte, Hugo; Gutierrez, Andres Hazaet; Izquierdo-Lara, Ray; Saravia, Luis Enrique; Zavaleta, Milagros; Tataje-Lavanda, Luis; Best, Ivan; Fernández-Sánchez, Manolo; Icochea, Eliana; Zimic, Mirko

    2013-01-01

    Background: Avibacterium paragallinarum is the causative agent of infectious coryza, a highly contagious acute respiratory disease of poultry that affects commercial chickens, laying hens and broilers worldwide. Methodology: In this study, we performed the whole genome sequencing, assembly and annotation of a Peruvian isolate of A. paragallinarum. The genome was sequenced on a 454 GS FLX Titanium system. De novo assembly was performed and annotation was completed with GS De Novo Assembler 2.6 ...

  10. Genome-Scale Thermodynamic Analysis of Escherichia coli Metabolism

    OpenAIRE

    Christopher S Henry; Jankowski, Matthew D.; Broadbelt, Linda J.; Hatzimanikatis, Vassily

    2005-01-01

    Genome-scale metabolic models are an invaluable tool for analyzing metabolic systems as they provide a more complete picture of the processes of metabolism. We have constructed a genome-scale metabolic model of Escherichia coli based on the iJR904 model developed by the Palsson Laboratory at the University of California at San Diego. Group contribution methods were utilized to estimate the standard Gibbs free energy change of every reaction in the constructed model. Reactions in the model wer...

  11. Systems-Level Analysis of Genome-Wide Association Data

    OpenAIRE

    Farber, Charles R

    2013-01-01

    Genome-wide association studies (GWAS) have emerged as the method of choice for identifying common variants affecting complex disease. In a GWAS, particular attention is placed, for obvious reasons, on single-nucleotide polymorphisms (SNPs) that exceed stringent genome-wide significance thresholds. However, it is expected that many SNPs with only nominal evidence of association (e.g., P < 0.05) truly influence disease. Efforts to extract additional biological information from entire GWAS data...

  12. Phenome-wide analysis of genome-wide polygenic scores

    OpenAIRE

    Krapohl, E; Euesden, J.; Zabaneh, D.; Pingault, J-B; Rimfeld, K; von Stumm, Sophie; Dale, P.S.; Breen, G.; O'Reilly, P. F.; Plomin, R

    2015-01-01

    Genome-wide polygenic scores (GPS), which aggregate the effects of thousands of DNA variants from genome-wide association studies (GWAS), have the potential to make genetic predictions for individuals. We conducted a systematic investigation of associations between GPS and many behavioral traits, the behavioral phenome. For 3152 unrelated 16-year-old individuals representative of the United Kingdom, we created 13 GPS from the largest GWAS for psychiatric disorders (for example, schizophrenia,...
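
    The aggregation step behind a polygenic score can be sketched as a weighted sum of allele dosages. This is a generic illustration, not the study's pipeline: the variant identifiers, effect sizes and genotype dosages below are hypothetical, and variants missing from either input are simply skipped.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2) for one individual.

    dosages: variant id -> number of effect alleles carried
    weights: variant id -> per-allele effect size from a GWAS
    Variants absent from either mapping are ignored.
    """
    shared = dosages.keys() & weights.keys()
    return sum(dosages[v] * weights[v] for v in shared)

weights = {"rs_a": 0.12, "rs_b": -0.05, "rs_c": 0.30}  # hypothetical effect sizes
person = {"rs_a": 2, "rs_b": 1, "rs_c": 0, "rs_d": 1}  # hypothetical dosages
score = polygenic_score(person, weights)               # 2*0.12 - 0.05 + 0 = 0.19
```

    In practice the set of variants is usually restricted by a P-value threshold and pruned for linkage disequilibrium before summing; those steps are omitted here.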

  13. Structural Characterization of Genomes by Large Scale Sequence-Structure Threading: Application of Reliability Analysis in Structural Genomics

    OpenAIRE

    Brunham Robert C; Ho Sui Shannan J; Cherkasov Artem; Jones Steven JM

    2004-01-01

    Background: We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results: We have found that the Weibull f...

  14. Analysis of human accelerated DNA regions using archaic hominin genomes.

    Science.gov (United States)

    Burbano, Hernán A; Green, Richard E; Maricic, Tomislav; Lalueza-Fox, Carles; de la Rasilla, Marco; Rosas, Antonio; Kelso, Janet; Pollard, Katherine S; Lachmann, Michael; Pääbo, Svante

    2012-01-01

    Several previous comparisons of the human genome with other primate and vertebrate genomes identified genomic regions that are highly conserved in vertebrate evolution but fast-evolving on the human lineage. These human accelerated regions (HARs) may be regions of past adaptive evolution in humans. Alternatively, they may be the result of non-adaptive processes, such as biased gene conversion. We captured and sequenced DNA from a collection of previously published HARs using DNA from an Iberian Neandertal. Combining these new data with shotgun sequence from the Neandertal and Denisova draft genomes, we determine at least one archaic hominin allele for 84% of all positions within HARs. We find that 8% of HAR substitutions are not observed in the archaic hominins and are thus recent in the sense that the derived allele had not come to fixation in the common ancestor of modern humans and archaic hominins. Further, we find that recent substitutions in HARs tend to have come to fixation faster than substitutions elsewhere in the genome and that substitutions in HARs tend to cluster in time, consistent with an episodic rather than a clock-like process underlying HAR evolution. Our catalog of sequence changes in HARs will help prioritize them for functional studies of genomic elements potentially responsible for modern human adaptations. PMID:22412940

  15. Assembly, Annotation, and Analysis of Multiple Mycorrhizal Fungal Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Initiative Consortium, Mycorrhizal Genomics; Kuo, Alan; Grigoriev, Igor; Kohler, Annegret; Martin, Francis

    2013-03-08

    Mycorrhizal fungi play critical roles in host plant health, soil community structure and chemistry, and carbon and nutrient cycling, all areas of intense interest to the US Dept. of Energy (DOE) Joint Genome Institute (JGI). To this end we are building on our earlier sequencing of the Laccaria bicolor genome by partnering with INRA-Nancy and the mycorrhizal research community in the MGI to sequence and analyze dozens of mycorrhizal genomes of all Basidiomycota and Ascomycota orders and multiple ecological types (ericoid, orchid, and ectomycorrhizal). JGI has developed and deployed high-throughput sequencing techniques, and Assembly, RNASeq, and Annotation Pipelines. In 2012 alone we sequenced, assembled, and annotated 12 draft or improved genomes of mycorrhizae, and predicted ~232831 genes and ~15011 multigene families. All of this data is publicly available on JGI MycoCosm (http://jgi.doe.gov/fungi/), which provides access to both the genome data and tools with which to analyze the data. Preliminary comparisons of the current total of 14 public mycorrhizal genomes suggest that 1) short secreted proteins potentially involved in symbiosis are more enriched in some orders than in others amongst the mycorrhizal Agaricomycetes, 2) there are wide ranges of numbers of genes involved in certain functional categories, such as signal transduction and post-translational modification, and 3) novel gene families are specific to some ecological types.

  16. Analysis of the genome of leporid herpesvirus 4.

    Science.gov (United States)

    Babra, Bobby; Watson, Gregory; Xu, Wayne; Jeffrey, Brendan M; Xu, Jia-Rong; Rockey, Daniel D; Rohrmann, George F; Jin, Ling

    2012-11-10

    The genome of a herpesvirus highly pathogenic to rabbits, leporid herpesvirus 4 (LHV-4), was analyzed using high-throughput DNA sequencing technology and primer walking. The assembled DNA sequences were further verified by restriction endonuclease digestion and Southern blot analyses. The total length of the LHV-4 genome was determined to be about 124 kb. Genes encoded in the LHV-4 genome are most closely related to those of herpesviruses of the genus Simplexvirus, including human herpesviruses (HHV-1 and HHV-2), monkey herpesviruses including cercopithecine (CeHV-2 and CeHV-16) and macacine (McHV-1) herpesviruses, bovine herpesvirus 2 (BHV-2), and a lineage of wallaby (macropodid) herpesviruses (MaHV-1 and -2). Similar to other simplexvirus genomes, LHV-4 has a high overall G+C content of 65-70% in the unique regions and 75-77% in the inverted repeat regions. Orthologs of ICP34.5 and US5 were not identified in the LHV-4 genome. This study shows that LHV-4 has the smallest simplexvirus genome characterized to date. PMID:22921533

  17. Integrative analysis of transcriptome and genome indicates two potential genomic islands are associated with pathogenesis of Mycobacterium tuberculosis.

    Science.gov (United States)

    Yu, Guohua; Fu, Xuping; Jin, Ke; Zhang, Lu; Wu, Wei; Cui, Zhenling; Hu, Zhongyi; Li, Yao

    2011-12-01

    Mycobacterium tuberculosis (M.tb) is a successful human pathogen that is prevalent throughout the world. Genomic islands (GIs) are thought to be related to pathogenicity. In this study, we predicted two potential genomic islands in the M.tb genome, named GI-1 and GI-2. Our results indicate that the genes belonging to the PE_PGRS family in GI-1 and the genes involved in sulfolipid-1 (SL-1) synthesis in GI-2 are strongly associated with M.tb pathogenesis. Sequence analysis revealed that the five PGRS genes are more polymorphic than other PGRS members in fully virulent M.tb complex strains at a significance level of 0.01, but not in attenuated strains. Expression analysis of microarray data collected from the literature showed that GI-1 genes, especially Rv3508, might be correlated with the response to the inhibition of aerobic respiration. Microarray analysis also showed that SL-1 cluster genes are drastically downregulated in attenuated strains relative to fully virulent strains. We speculate that the effect of SL-1 on M.tb pathogenicity could be associated with long-term survival and the establishment of persistence during infection. Additionally, the gene Rv3508 in GI-1 was under positive selection. Rv3508 may be involved in the response of M.tb to the inhibition of aerobic respiration by low oxygen or the drug PA-824, and this may be a common feature of genes in GI-1. These findings may provide novel insights into M.tb physiology and pathogenesis. PMID:21924330

  18. Flow cytometric analysis of oil palm: a preliminary analysis for cultivars and genomic DNA alteration

    Directory of Open Access Journals (Sweden)

    Warawut Chuthammathat

    2005-12-01

    DNA contents of oil palm (Elaeis guineensis Jacq.) cultivars were analyzed by flow cytometry using different external reference plant species. Analysis using corn (Zea mays line CE-777) as a reference plant gave the highest DNA content of oil palm (4.72±0.23 pg 2C^-1), whereas the DNA content was found to be lower when using soybean (Glycine max cv. Polanka) (3.77±0.09 pg 2C^-1) or tomato (Lycopersicon esculentum cv. Stupicke) (4.25±0.09 pg 2C^-1) as a reference. The nuclear DNA contents of Dura (D109), Pisifera (P168) and Tenera (T38) cultivars were 3.46±0.04, 3.24±0.03 and 3.76±0.04 pg 2C^-1 nuclei, respectively, using soybean as a reference. One haploid genome of oil palm therefore ranged from 1.56 to 1.81 × 10^9 base pairs. DNA contents from one-year-old calli and cell suspensions of oil palm were found to be significantly different from those of seedlings. It thus should be noted that genomic DNA alteration occurred in these cultured tissues. We therefore confirm that flow cytometric analysis could verify cultivars, DNA content and genomic DNA alteration of oil palm using soybean as an external reference standard.
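
    The step from the 2C DNA contents to the haploid genome size can be reproduced with a short calculation. The abstract does not state which conversion factor was used; the sketch below assumes 1 pg = 0.965 × 10^9 bp, a commonly cited older value that happens to reproduce the reported range of 1.56 to 1.81 (in units of 10^9 bp). A more recent, widely used factor is 0.978 × 10^9 bp per pg, which gives slightly larger sizes. The function name is illustrative.

```python
def haploid_genome_bp(pg_2c, bp_per_pg=0.965e9):
    """Convert a 2C nuclear DNA content (pg) to the 1C genome size in bp.

    bp_per_pg is an assumed conversion factor; 0.978e9 is another common choice.
    """
    return (pg_2c / 2.0) * bp_per_pg

# 2C values (pg) reported in the abstract, soybean as reference.
cultivars = {"Dura D109": 3.46, "Pisifera P168": 3.24, "Tenera T38": 3.76}
for name, pg in cultivars.items():
    print(f"{name}: {haploid_genome_bp(pg) / 1e9:.2f} x 10^9 bp")
# Dura D109: 1.67 x 10^9 bp
# Pisifera P168: 1.56 x 10^9 bp
# Tenera T38: 1.81 x 10^9 bp
```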

  19. Comparative genomic analysis of mitochondrial protein-coding genes in Veneroida clams: Analysis of superfamily-specific genomic and evolutionary features.

    Science.gov (United States)

    Hwang, Jae Yeon; Lee, Chang-Kyu; Kim, Heebal; Nam, Bo-Hye; An, Cheul Min; Park, Jung Youn; Park, Kyu-Hyun; Huh, Chul-Sung; Kim, Eun Bae

    2015-12-01

    Veneroida is the largest order of bivalves, and these clams are commercially important in Asian countries. Although numerous studies have focused on the genomic characters of individual species or genera in Veneroida, superfamily-specific genomic characters have not been determined. In this study, we performed a comparative genomic analysis of 12 mitochondrial protein-coding genes (PCGs) from 25 clams in six Veneroida superfamilies to determine genomic and evolutionary features of each superfamily. Length and distribution of nucleotides encoding the PCGs were too variable to define superfamily-specific genomic characters. Phylogenetic analysis revealed that PCGs are suitable for classification of species in three superfamilies: Cardioidea, Mactroidea, and Veneroidea. However, one species classified in Tellinoidea, Sinonovacula constricta, was evolutionarily closer to Solenoidea clams than Tellinoidea clams. dN/dS analysis showed that positively selected sites in the NADH dehydrogenase subunit nd4 and the ATP synthase subunit atp6 were present in Mactroidea. Differences in selected sites in nd4 and atp6 could be caused by superfamily-level differences in sodium transport or ATP synthesis functions, respectively. These differences in selected sites in NADH may have given these animals, which have low motility and do not generally move, increased flexibility to maintain homeostasis in the face of osmotic pressure. Our study provides insight into evolutionary traits and facilitates identification of veneroids. PMID:26343338

  20. Micro and nanofluidic structures for cell sorting and genomic analysis

    Science.gov (United States)

    Morton, Keith J.

    Microfluidic systems promise rapid analysis of small samples in a compact and inexpensive format. But direct scaling of lab bench protocols on-chip is challenging because laminar flows in typical microfluidic devices are characterized by non-mixing streamlines. Common microfluidic mixers and sorters work by diffusion, limiting application to objects that diffuse slowly such as cells and DNA. Recently Huang et al. developed a passive microfluidic element to continuously separate bio-particles deterministically. In Deterministic Lateral Displacement (DLD), objects are sorted by size as they transit an asymmetric array of microfabricated posts. This thesis further develops DLD arrays with applications in three broad new areas. First, the arrays are used not simply to sort particles but to move streams of cells through functional flows for chemical treatment, such as on-chip immunofluorescent labeling of blood cells with washing, and on-chip E. coli cell lysis with simultaneous chromosome extraction. Secondly, modular tiling of the basic DLD element is used to construct complex particle handling modes that include beam steering for jets of cells and beads. Thirdly, nanostructured DLD arrays are built using Nanoimprint Lithography (NIL) and continuous-flow separation of 100 nm and 200 nm size particles is demonstrated. Finally, a number of ancillary nanofabrication techniques were developed in support of these overall goals, including methods to interface nanofluidic structures with standard microfluidic components such as inlet channels and reservoirs, precision etching of ultra-high aspect ratio (>50:1) silicon nanostructures, and fabrication of narrow (~35 nm) channels used to stretch genomic length DNA.

  1. Genome sequencing and analysis of the first complete genome of Lactobacillus kunkeei strain MP2, an Apis mellifera gut isolate

    Science.gov (United States)

    Asenjo, Freddy; Olmos, Alejandro; Henríquez-Piskulich, Patricia; Polanco, Victor; Aldea, Patricia

    2016-01-01

    Background. The honey bee (Apis mellifera) is the most important pollinator in agriculture worldwide. However, the number of honey bees has fallen significantly since 2006, becoming a huge ecological problem. The principal cause is CCD, or Colony Collapse Disorder, characterized by the seemingly spontaneous abandonment of hives by their workers. One of the characteristics of CCD in honey bees is the alteration of the bacterial communities in their gastrointestinal tract, mainly due to the decrease of Firmicutes populations, such as the Lactobacilli. At this time, the causes of these alterations remain unknown. We recently isolated a strain of Lactobacillus kunkeei (L. kunkeei strain MP2) from the gut of Chilean honey bees. L. kunkeei is one of the bacteria most commonly isolated from the honey bee gut and is highly versatile in different ecological niches. In this study, we aimed to elucidate in detail the L. kunkeei genetic background and perform a comparative genome analysis with other Lactobacillus species. Methods. L. kunkeei MP2 was originally isolated from the guts of Chilean A. mellifera individuals. Genome sequencing was done using Pacific Biosciences single-molecule real-time sequencing technology. De novo assembly was performed using Celera assembler. The genome was annotated using Prokka, and functional information was added using the EggNOG 3.1 database. In addition, genomic islands were predicted using IslandViewer, and prophage sequences using PHAST. Comparisons of L. kunkeei MP2 with other L. kunkeei and Lactobacillus strains were done using Roary. Results. The complete genome of L. kunkeei MP2 comprises one circular chromosome of 1,614,522 nt with a GC content of 36.9%. Pangenome analysis with 16 L. kunkeei strains identified 113 unique genes, most of them related to phage insertions. A large and unique region of the L. kunkeei MP2 genome contains several genes that encode phage structural proteins and replication components

  2. Analysis of pan-genome to identify the core genes and essential genes of Brucella spp.

    Science.gov (United States)

    Yang, Xiaowen; Li, Yajie; Zang, Juan; Li, Yexia; Bie, Pengfei; Lu, Yanli; Wu, Qingmin

    2016-04-01

    Brucella spp. are facultative intracellular pathogens that cause a contagious zoonotic disease, which can result in abortion or sterility in susceptible animal hosts and grave, debilitating illness in humans. To decipher the survival mechanism of Brucella spp. in vivo, 42 complete Brucella genomes from NCBI were analyzed for the pan-genome and core genome by identifying their composition and function. The results showed that the total of 132,143 protein-coding genes in these genomes were divided into 5369 clusters. Among these, 1710 clusters were associated with the core genome, 1182 clusters with strain-specific genes and 2477 clusters with dispensable genomes. COG analysis indicated that 44 % of the core genes were devoted to metabolism, mainly energy production and conversion (COG category C) and amino acid transport and metabolism (COG category E). Meanwhile, approximately 35 % of the core genes were under positive selection. In addition, 1252 potential essential genes were predicted in the core genome by comparison with a prokaryote database of essential genes. The results suggested that the core genes in Brucella genomes are relatively conserved, and that energy and amino acid metabolism play a more important role in the growth and reproduction of Brucella spp. This study might help us to better understand the mechanisms of persistent Brucella infection and provide some clues for further exploring the gene modules of intracellular survival in Brucella spp. PMID:26724943
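The three-way split of gene clusters described in this record (core, dispensable, strain-specific) can be illustrated with a minimal sketch. It assumes the usual pan-genome convention, which the abstract does not spell out: a cluster is core if it occurs in every genome, strain-specific if it occurs in exactly one, and dispensable otherwise. The function name and the toy presence table are hypothetical and are not taken from the study.

```python
from collections import Counter


def classify_clusters(presence, n_genomes):
    """Label each gene cluster by how many genomes contain it.

    presence: dict mapping a cluster id to the set of genome ids in which
    the cluster occurs (hypothetical input format).
    Assumed convention: all genomes -> core, one genome -> strain-specific,
    anything in between -> dispensable.
    """
    labels = {}
    for cluster, genomes in presence.items():
        k = len(genomes)
        if k == n_genomes:
            labels[cluster] = "core"
        elif k == 1:
            labels[cluster] = "strain-specific"
        else:
            labels[cluster] = "dispensable"
    return labels


# Toy data: three genomes, four clusters (illustrative only).
toy_presence = {
    "c1": {"g1", "g2", "g3"},
    "c2": {"g1", "g2"},
    "c3": {"g3"},
    "c4": {"g1", "g2", "g3"},
}
toy_labels = classify_clusters(toy_presence, n_genomes=3)
toy_counts = Counter(toy_labels.values())

# Consistency check on the counts reported in the abstract.
reported = {"core": 1710, "strain-specific": 1182, "dispensable": 2477}
reported_total = sum(reported.values())
```

The reported category counts add up to the 5369 clusters stated in the abstract, which is what `reported_total` checks.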

  3. The genome sequence of E. coli W (ATCC 9637): comparative genome analysis and an improved genome-scale reconstruction of E. coli

    Directory of Open Access Journals (Sweden)

    Lee Sang

    2011-01-01

    Full Text Available. Background. Escherichia coli is a model prokaryote, an important pathogen, and a key organism for industrial biotechnology. E. coli W (ATCC 9637), one of four strains designated as safe for laboratory purposes, has not been sequenced. E. coli W is a fast-growing strain and is the only safe strain that can utilize sucrose as a carbon source. Lifecycle analysis has demonstrated that sucrose from sugarcane is a preferred carbon source for industrial bioprocesses. Results. We have sequenced and annotated the genome of E. coli W. The chromosome is 4,900,968 bp and encodes 4,764 ORFs. Two plasmids, pRK1 (102,536 bp) and pRK2 (5,360 bp), are also present. W has unique features relative to other sequenced laboratory strains (K-12, B and Crooks): it has a larger genome and belongs to phylogroup B1 rather than A. W also grows on a much broader range of carbon sources than does K-12. A genome-scale reconstruction was developed and validated in order to interrogate metabolic properties. Conclusions. The genome of W is more similar to commensal and pathogenic B1 strains than phylogroup A strains, and therefore has greater utility for comparative analyses with these strains. W should therefore be the strain of choice, or 'type strain', for group B1 comparative analyses. The genome annotation and tools created here are expected to allow further utilization and development of E. coli W as an industrial organism for sucrose-based bioprocesses. Refinements in our E. coli metabolic reconstruction allow it to more accurately define E. coli metabolism relative to previous models.

  4. Gene context analysis in the Integrated Microbial Genomes (IMG) data management system.

    Directory of Open Access Journals (Sweden)

    Konstantinos Mavromatis

    Full Text Available. Computational methods for determining the function of genes in newly sequenced genomes have traditionally been based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context, and it relies on the identification of such events across a phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.

  5. The complete mitochondrial genome of rabbit pinworm Passalurus ambiguus: genome characterization and phylogenetic analysis.

    Science.gov (United States)

    Liu, Guo-Hua; Li, Sheng; Zou, Feng-Cai; Wang, Chun-Ren; Zhu, Xing-Quan

    2016-01-01

    Passalurus ambiguus (Nematoda: Oxyuridae) is a common pinworm that parasitizes the caecum and colon of rabbits. Despite its significance as a pathogen, the epidemiology, genetics, systematics, and biology of this pinworm remain poorly understood. In the present study, we sequenced the complete mitochondrial (mt) genome of P. ambiguus. The circular mt genome is 14,023 bp in size and encodes 36 genes, including 12 protein-coding, two ribosomal RNA, and 22 transfer RNA genes. The mt gene order of P. ambiguus is the same as that of Wellcomia siamensis, but distinct from that of Enterobius vermicularis. Phylogenetic analyses based on concatenated amino acid sequences of 12 protein-coding genes by Bayesian inference (BI) showed that P. ambiguus was more closely related to W. siamensis than to E. vermicularis. This mt genome provides novel genetic markers for studying the molecular epidemiology, population genetics, and systematics of pinworms of animals and humans, and should have implications for the diagnosis, prevention, and control of passaluriasis in rabbits and other animals. PMID:26472717

  6. EDGAR: A software framework for the comparative analysis of prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Vorhölter Frank-Jörg

    2009-05-01

    Full Text Available. Background. The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results. To support these studies EDGAR – "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" – was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion. EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface, http://edgar.cebitec.uni-bielefeld.de, where the precomputed data sets can be browsed.
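The abstract names BLAST score ratios as the basis for EDGAR's orthology calls but gives no formula. The sketch below shows one common way such a ratio is used, and it is an assumption rather than EDGAR's actual implementation: the bit score of a hit is divided by the bit score of the query against itself, and two genes are paired when each is the other's best hit and both ratios reach a cutoff. The cutoff value of 0.3, the function names and the toy scores are all hypothetical.

```python
def score_ratio(hit_bits, self_bits):
    """Bit score of a hit, normalised by the query's self-hit bit score."""
    if self_bits <= 0:
        raise ValueError("self-hit bit score must be positive")
    return hit_bits / self_bits


def best_hits(scores, self_scores):
    """For each query gene, keep the subject with the highest score ratio.

    scores: dict mapping (query, subject) -> bit score.
    self_scores: dict mapping query -> bit score of the query against itself.
    """
    best = {}
    for (query, subject), bits in scores.items():
        ratio = score_ratio(bits, self_scores[query])
        if query not in best or ratio > best[query][1]:
            best[query] = (subject, ratio)
    return best


def reciprocal_pairs(scores_ab, scores_ba, self_a, self_b, cutoff=0.3):
    """Gene pairs that are mutual best hits with both ratios >= cutoff.

    The cutoff is a placeholder chosen for this illustration.
    """
    best_ab = best_hits(scores_ab, self_a)
    best_ba = best_hits(scores_ba, self_b)
    pairs = []
    for a, (b, ratio_ab) in best_ab.items():
        back = best_ba.get(b)
        if back is None:
            continue
        back_gene, ratio_ba = back
        if back_gene == a and ratio_ab >= cutoff and ratio_ba >= cutoff:
            pairs.append((a, b))
    return sorted(pairs)


# Toy bit scores for two genomes A and B (illustrative only).
self_a = {"a1": 200.0, "a2": 100.0}
self_b = {"b1": 210.0, "b2": 90.0}
scores_ab = {("a1", "b1"): 180.0, ("a1", "b2"): 20.0, ("a2", "b2"): 15.0}
scores_ba = {("b1", "a1"): 182.0, ("b2", "a2"): 14.0}

pairs = reciprocal_pairs(scores_ab, scores_ba, self_a, self_b)
```

In the toy data, a2 and b2 are mutual best hits but fall below the cutoff, so only the a1/b1 pair is kept. Genes left unpaired in every comparison would be candidates for the "singleton" class mentioned in the abstract.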

  7. The Methanosarcina barkeri genome: comparative analysis with Methanosarcina acetivorans and Methanosarcina mazei reveals extensive rearrangement within methanosarcinal genomes

    Energy Technology Data Exchange (ETDEWEB)

    Maeder, Dennis L.; Anderson, Iain; Brettin, Thomas S.; Bruce, David C.; Gilna, Paul; Han, Cliff S.; Lapidus, Alla; Metcalf, William W.; Saunders, Elizabeth; Tapia, Roxanne; Sowers, Kevin R.

    2006-05-19

    We report here a comparative analysis of the genome sequence of Methanosarcina barkeri with those of Methanosarcina acetivorans and Methanosarcina mazei. All three genomes share a conserved double origin of replication and many gene clusters. M. barkeri is distinguished by having an organization that is well conserved with respect to the other Methanosarcinae in the region proximal to the origin of replication, with interspecies gene similarities as high as 95%. However, it is disordered and marked by increased transposase frequency and decreased gene synteny and gene density in the proximal semi-genome. Of the 3680 open reading frames (ORFs) in M. barkeri, 678 had paralogs with better than 80% similarity to both M. acetivorans and M. mazei, while 128 nonhypothetical ORFs were unique (non-paralogous) amongst these species, including a complete formate dehydrogenase operon, two genes required for N-acetylmuramic acid synthesis, a 14 gene gas vesicle cluster and a bacterial P450-specific ferredoxin reductase cluster not previously observed or characterized in this genus. A cryptic 36 kbp plasmid sequence was detected in M. barkeri that contains an orc1 gene flanked by a presumptive origin of replication consisting of 38 tandem repeats of a 143 nt motif. Three-way comparison of these genomes reveals differing mechanisms for the accrual of changes. Elongation of the large M. acetivorans genome is the result of multiple gene-scale insertions and duplications uniformly distributed in that genome, while M. barkeri is characterized by localized inversions associated with the loss of gene content. In contrast, the relatively short M. mazei genome most closely approximates the ancestral organizational state.

  8. Beginner’s guide to comparative bacterial genome analysis using next-generation sequence data

    OpenAIRE

    Edwards, David J.; Holt, Kathryn E.

    2013-01-01

    High throughput sequencing is now fast and cheap enough to be considered part of the toolbox for investigating bacteria, and there are thousands of bacterial genome sequences available for comparison in the public domain. Bacterial genome analysis is increasingly being performed by diverse groups in research, clinical and public health labs alike, who are interested in a wide array of topics related to bacterial genetics and evolution. Examples include outbreak analysis and the study of patho...

  9. Ten years of maintaining and expanding a microbial genome and metagenome analysis system.

    Science.gov (United States)

    Markowitz, Victor M; Chen, I-Min A; Chu, Ken; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2015-11-01

    Launched in March 2005, the Integrated Microbial Genomes (IMG) system is a comprehensive data management system that supports multidimensional comparative analysis of genomic data. At the core of the IMG system is a data warehouse that contains genome and metagenome datasets sequenced at the Joint Genome Institute or provided by scientific users, as well as public genome datasets available at the National Center for Biotechnology Information GenBank sequence data archive. Genomes and metagenome datasets are processed using IMG's microbial genome and metagenome sequence data processing pipelines and are integrated into the data warehouse using IMG's data integration toolkits. Microbial genome and metagenome application-specific data marts and user interfaces provide access to different subsets of IMG's data and analysis toolkits. This review article revisits IMG's original aims, highlights key milestones reached by the system during the past 10 years, and discusses the main challenges faced by a rapidly expanding system, in particular the complexity of maintaining such a system in an academic setting with limited budgets and computing and data management infrastructure. PMID:26439299

  10. A complete mitochondrial genome sequence of Ogura-type male-sterile cytoplasm and its comparative analysis with that of normal cytoplasm in radish (Raphanus sativus L.)

    Directory of Open Access Journals (Sweden)

    Tanaka Yoshiyuki

    2012-07-01

    Full Text Available. Background. Plant mitochondrial genomes have unique features such as large size, frequent recombination and incorporation of foreign DNA. Cytoplasmic male sterility (CMS) is caused by rearrangement of the mitochondrial genome, and a novel chimeric open reading frame (ORF) created by shuffling of endogenous sequences is often responsible for CMS. The Ogura-type male-sterile cytoplasm is one of the most extensively studied cytoplasms in Brassicaceae. Although the gene orf138 has been isolated as a determinant of Ogura-type CMS, no sequence homologous to orf138 has been found in public databases. Therefore, how the orf138 sequence was created is a mystery. In this study, we determined the complete nucleotide sequences of two radish mitochondrial genomes, namely, the Ogura- and normal-type genomes, and analyzed them to reveal the origin of the gene orf138. Results. The Ogura- and normal-type mitochondrial genomes were assembled into 258,426-bp and 244,036-bp circular sequences, respectively. The normal-type mitochondrial genome contained 33 protein-coding and three rRNA genes, which are well conserved with the reported mitochondrial genome of rapeseed. The Ogura-type genome contained the same genes and an additional atp9. As for tRNA, the normal-type genome contained 17 tRNAs, while the Ogura-type genome contained 17 tRNAs and one additional trnfM. The gene orf138 was specific to the Ogura-type mitochondrial genome, and no sequence homologous to it was found in the normal-type genome. Comparative analysis of the two genomes revealed that the radish mitochondrial genome consists of 11 syntenic regions (length >3 kb, similarity >99.9%). It was shown that short repeats and overlapped repeats present at the edges of syntenic regions were involved in recombination events during evolution that interconvert the two types of mitochondrial genome. The Ogura-type mitochondrial genome has four unique regions (2,803 bp, 1,601 bp, 451 bp and 15,255 bp in size) that are non-syntenic to the normal-type genome, and the gene orf138

  11. Funding Opportunity: Genomic Data Centers

    Science.gov (United States)

    Funding Opportunity CCG, Funding Opportunity Center for Cancer Genomics, CCG, Center for Cancer Genomics, CCG RFA, Center for cancer genomics rfa, genomic data analysis network, genomic data analysis network centers,

  12. Functional Genomic Analysis of Systemic Cell Division Regulation in Legumes

    International Nuclear Information System (INIS)

    associated with nitrogen fertilization [1]. Nodulation involves the production of a new organ capable of nitrogen fixation [2] and as such is an excellent system to study plant-microbe interaction, plant development, long-distance signaling and functional genomics of stem cell proliferation [3, 4]. Concerted international effort over the last 20 years, using a combination of induced mutagenesis followed by gene discovery (forward genetics) and molecular/biochemical approaches, revealed a complex developmental pathway that 'loans' genetic programs from various sources and orchestrates these into a novel contribution. We report our laboratory's contribution to the present analysis in the field. (author)

  13. Insights from the GC content analysis of 76 genome survey sequences (GSS) from Elaeis oleifera ψ

    Science.gov (United States)

    Bhore, Subhash J; Kassim, Amelia; Shah, Farida H

    2010-01-01

    South American oil-palm (Elaeis oleifera) is not cultivated on a large scale in tropical countries like Malaysia because of the low yield of palm oil derived from its fruit mesocarp. However, its fruit mesocarp oil contains about 68.6 % oleic acid (C18:1), which is more than double that of the commercially cultivated oil palm, E. guineensis Jacq Tenera (hybrid of Dura (♀) x Pisifera (♂)). It is also known that E. oleifera is a good source of tocotrienols and carotenoids. Therefore, it is of interest to know the genome sequence of E. oleifera. The objective of this study is to generate genome survey sequences (GSS) to gain insight into the GC content of the E. oleifera genome. The nuclear genomic DNA isolated from young leaf tissues was digested with EcoRI and NdeI/DraI restriction enzymes, and three genomic DNA libraries were constructed using Lambda ZAP-II, pGEM®-T Easy, and pDONR 222™ as cloning vectors. The 76 generated GSSs were analyzed using bioinformatics tools. The analysis indicates that the adenine, cytosine, guanine and thymine contents of the generated GSSs are 30%, 20%, 20%, and 30% respectively. In conclusion, based on the GC content analysis of the 76 randomly isolated GSSs, we hypothesize that the GC content of the E. oleifera genome is 40%. This hypothesized value is expected to remain close to the GC content determined from a whole genome analysis. ψ The nucleotide sequence data reported in this paper have been submitted to the dbGSS division of the international DNA database (GenBank/DDBJ/EMBL) under accession numbers: DX575945-DX575972 and EI798032-EI798079. Abbreviations: gDNA - nuclear genomic DNA, GSSs - genome survey sequences, SAOP - South American oil-palm. PMID:21364775
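The 40% figure in this record follows directly from the reported base composition, since GC content is the fraction of G plus the fraction of C. A minimal sketch of that calculation is shown here; the function names and the toy sequences are hypothetical and are not the study's data.

```python
from collections import Counter


def base_composition(sequences):
    """Fraction of A, C, G and T pooled over a collection of DNA sequences.

    Characters other than A, C, G, T (for example N) are ignored.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return {base: counts[base] / total for base in "ACGT"}


def gc_content(sequences):
    """GC content as a fraction: (G + C) / (A + C + G + T)."""
    composition = base_composition(sequences)
    return composition["G"] + composition["C"]


# Toy sequences (illustrative only).
toy_sequences = ["AATTGC", "ATATGGCCAT"]
toy_gc = gc_content(toy_sequences)

# The composition reported in the abstract: A 30%, C 20%, G 20%, T 30%.
reported_composition = {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30}
reported_gc = reported_composition["G"] + reported_composition["C"]
```

With the reported composition, `reported_gc` comes to 0.40, matching the hypothesized genome-wide value.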

  14. A Mathematical Framework for Incorporating Anatomical Knowledge in DT-MRI Analysis

    OpenAIRE

    Maddah, Mahnaz; Zöllei, Lilla; Grimson, W. Eric L.; Westin, Carl-Fredrik; Wells, William M.

    2008-01-01

    We propose a Bayesian approach to incorporate anatomical information in the clustering of fiber trajectories. An expectation-maximization (EM) algorithm is used to cluster the trajectories, in which an atlas serves as the prior on the labels. The atlas guides the clustering algorithm and makes the resulting bundles anatomically meaningful. In addition, it provides the seed points for the tractography and initial settings of the EM algorithm. The proposed approach provides a robust and automat...

  15. Between Economic and Legal Analysis of Incorporated Things: a Critical "NO" to Aedilitian Remedies

    Directory of Open Access Journals (Sweden)

    CG Kilian

    2006-01-01

    Full Text Available. This article analyses the dictum in Phame v Paizes 1973 3 397 (A) within economic and legal principles to determine whether incorporeal things could possess characteristics of value or quality as corporeal things do. The author uses practical economic examples to argue for the development of common law. The author identifies relevant Roman law principles which justify the legal nature of incorporeal things. It is demonstrated that the value of incorporeal things depends greatly on future circumstances. It is argued in this article that the courts' willingness to extend the Aedilitian remedies and the wide interpretation of a dictum et promissum create an open door for any unsatisfied buyer with no entrepreneurial skills to claim a reduced price if the business is unable to achieve financial results similar to those achieved before the conclusion of the contract. Currently the seller of a business has no clear or enforceable defense under these circumstances. The author subsequently suggests that relevant Roman law principles should be revisited with the aim of developing an appropriate defense for the seller.

  16. Analysis on n-gram statistics and linguistic features of whole genome protein sequences

    Institute of Scientific and Technical Information of China (English)

    DONG Qi-wen; WANG Xiao-long; LIN Lei

    2008-01-01

    To obtain a statistical sequence analysis of the large number of genomic and proteomic sequences available for different organisms, the n-grams of whole genome protein sequences from 20 organisms were extracted. Their linguistic features were analyzed by two tests, the Zipf power law and Shannon entropy, which were developed for the analysis of natural languages and symbolic sequences. The natural genome proteins and the artificial genome proteins were compared with each other, and some statistical features of n-grams were discovered. The results show that: the n-grams of whole genome protein sequences approximately follow the Zipf law when n is larger than 4; the Shannon n-gram entropy of natural genome proteins is lower than that of artificial proteins; a simple unigram model can distinguish different organisms; there exist organism-specific usages of "phrases" in protein sequences. It is suggested that further detailed analysis of the n-grams of whole genome protein sequences will result in a powerful model for mapping the relationship of protein sequence, structure and function.
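The two tests named in this record can be sketched in a few lines. The example below assumes overlapping n-grams and base-2 Shannon entropy over the n-gram frequency distribution, which are common choices but are not specified in the abstract; the function names and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(sequences, n):
    """Count overlapping n-grams over a collection of protein sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the n-gram frequency distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rank_frequency(counts):
    """(rank, frequency) pairs, most frequent first, as used in a Zipf plot.

    Under the Zipf law, log(frequency) falls roughly linearly with log(rank).
    """
    return [(rank, freq)
            for rank, (_, freq) in enumerate(counts.most_common(), start=1)]


# Toy "proteins" (illustrative only).
toy_proteins = ["MKMK", "MKKM"]
bigrams = ngram_counts(toy_proteins, n=2)
entropy_bits = shannon_entropy(bigrams)
zipf_points = rank_frequency(bigrams)
```

Comparing `entropy_bits` for real sequences against the value for shuffled or randomly generated sequences of the same composition is one way to reproduce the natural-versus-artificial comparison described above.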

  17. Metabolites production improvement by identifying minimal genomes and essential genes using flux balance analysis.

    Science.gov (United States)

    Salleh, Abdul Hakim Mohamed; Mohamad, Mohd Saberi; Deris, Safaai; Illias, Rosli Md

    2015-01-01

    With the advancement of metabolic engineering technologies, the genome of host organisms can be reconstructed to achieve desired phenotypes. However, due to the complexity and size of the genome-scale metabolic network, significant components tend to be invisible. We proposed an approach to improve metabolite production that consists of two steps. First, we find the essential genes and identify the minimal genome by a single gene deletion process using Flux Balance Analysis (FBA); second, we identify the significant pathway for the metabolite production using gene expression data. A genome-scale model of Saccharomyces cerevisiae for production of vanillin and acetate is used to test this approach. The results show that this approach can reliably find essential genes, reduce genome size and identify a production pathway that can further optimise the production yield. The identified genes and pathways may be extended to other applications, especially strain optimisation. PMID:26489144

  18. Genome sequence, comparative analysis and haplotype structure of the domestic dog.

    Science.gov (United States)

    Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, 
Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S

    2005-12-01

    Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health. PMID:16341006

  19. Meta-analysis of genome-wide linkage scans of attention deficit hyperactivity disorder.

    Science.gov (United States)

    Zhou, Kaixin; Dempfle, Astrid; Arcos-Burgos, Mauricio; Bakker, Steven C; Banaschewski, Tobias; Biederman, Joseph; Buitelaar, Jan; Castellanos, F Xavier; Doyle, Alysa; Ebstein, Richard P; Ekholm, Jenny; Forabosco, Paola; Franke, Barbara; Freitag, Christine; Friedel, Susann; Gill, Michael; Hebebrand, Johannes; Hinney, Anke; Jacob, Christian; Lesch, Klaus Peter; Loo, Sandra K; Lopera, Francisco; McCracken, James T; McGough, James J; Meyer, Jobst; Mick, Eric; Miranda, Ana; Muenke, Maximilian; Mulas, Fernando; Nelson, Stanley F; Nguyen, T Trang; Oades, Robert D; Ogdie, Matthew N; Palacio, Juan David; Pineda, David; Reif, Andreas; Renner, Tobias J; Roeyers, Herbert; Romanos, Marcel; Rothenberger, Aribert; Schäfer, Helmut; Sergeant, Joseph; Sinke, Richard J; Smalley, Susan L; Sonuga-Barke, Edmund; Steinhausen, Hans-Christoph; van der Meulen, Emma; Walitza, Susanne; Warnke, Andreas; Lewis, Cathryn M; Faraone, Stephen V; Asherson, Philip

    2008-12-01

    The genetic contribution to the development of attention deficit hyperactivity disorder (ADHD) is well established. Seven independent genome-wide linkage scans have been performed to map loci that increase the risk for ADHD. Although significant linkage signals were identified in some of the studies, there has been limited replication between the various independent datasets. The current study gathered the results from all seven of the ADHD linkage scans and performed a Genome Scan Meta Analysis (GSMA) to identify the genomic region with the most consistent linkage evidence across the studies. Genome-wide significant linkage (P(SR) = 0.00034, P(OR) = 0.04) was identified on chromosome 16 between 64 and 83 Mb. In addition, there are nine other genomic regions from the GSMA showing nominal or suggestive evidence of linkage. All these linkage results may be informative and focus the search for novel ADHD susceptibility genes. PMID:18988193

  20. Complete genome sequence and comparative genome analysis of a new special Yersinia enterocolitica.

    Science.gov (United States)

    Shi, Guoxiang; Su, Mingming; Liang, Junrong; Duan, Ran; Gu, Wenpeng; Xiao, Yuchun; Zhang, Zhewen; Qiu, Haiyan; Zhang, Zheng; Li, Yi; Zhang, Xiaohe; Ling, Yunchao; Song, Lai; Chen, Meili; Zhao, Yongbing; Wu, Jiayan; Jing, Huaiqi; Xiao, Jingfa; Wang, Xin

    2016-09-01

    Yersinia enterocolitica is the most diverse species in the genus Yersinia and shows more polymorphism, especially among the non-pathogenic strains. Individual non-pathogenic Y. enterocolitica strains are wrongly identified because of atypical phenotypes. In this study, we isolated an unusual Y. enterocolitica strain, LC20, from Rattus norvegicus. The strain did not utilize urea and could not be assigned to a biotype. API 20E identified it as Escherichia coli; however, it grew well at 25 °C, whereas E. coli grows well at 37 °C. We analyzed the genome of LC20 and found that the whole chromosome of LC20 was collinear with that of Y. enterocolitica 8081, and that the urease gene was absent from the genome, which is consistent with the result of API 20E. Also, the 16S and 23S rRNA genes of LC20 lay on a branch of Y. enterocolitica. Furthermore, the core-based and pan-based phylogenetic trees showed that LC20 was classified into the Y. enterocolitica cluster. Two plasmids (80 and 50 k) from LC20 shared low genetic homology with pYV from the Yersinia genus; one was an ancestral Yersinia plasmid and the other was a novel plasmid encoding a number of transposases. Some pathogenic and non-pathogenic Y. enterocolitica-specific genes coexisted in LC20. Thus, although it could not be classified into any Y. enterocolitica biotype due to its special biochemical metabolism, we concluded that LC20 was a Y. enterocolitica strain because its genome was similar to those of other Y. enterocolitica strains, and that it might be a strain in which many mutations and recombinations emerged in the course of its evolution. PMID:27129539

  1. Genome-wide array comparative genomic hybridization analysis reveals distinct amplifications in osteosarcoma

    International Nuclear Information System (INIS)

    Osteosarcoma is a highly malignant bone neoplasm of children and young adults. It is characterized by extremely complex karyotypes and a high frequency of chromosomal amplifications. Currently, only the histological response (degree of necrosis) to therapy represents the gold standard for predicting the outcome in a patient with non-metastatic osteosarcoma at the time of definitive surgery. Patients with a lower degree of necrosis have a higher risk of relapse and poor outcome even after chemotherapy and complete resection of the primary tumor. Therefore, a better understanding of the underlying molecular genetic events leading to tumor initiation and progression could result in the identification of potential diagnostic and therapeutic targets. We used a genome-wide screening method – array based comparative genomic hybridization (array-CGH) – to identify DNA copy number changes in 48 patients with osteosarcoma. We applied fluorescence in situ hybridization (FISH) to validate some of the amplified clones in this study. Clones showing gains (79%) were more frequent than losses (66%). High-level amplifications and homozygous deletions constitute 28.6% and 3.8% of the tumor genome respectively. High-level amplifications were present in 238 clones, of which about 37% showed recurrent amplification. The most frequently amplified clones were mapped to 1p36.32 (PRDM16), 6p21.1 (CDC5L, HSPCB, NFKBIE), 8q24, 12q14.3 (IFNG), 16p13 (MGRN1), and 17p11.2 (PMP22, MYCD, SOX1, ELAC27). We validated some of the amplified clones by FISH from the 6p12-p21, 8q23-q24, and 17p11.2 amplicons. Homozygous deletions were noted for 32 clones, and only 7 clones showed deletions in more than one case. These 7 clones were mapped to 1q25.1 (4 cases), 3p14.1 (4 cases), 13q12.2 (2 cases), 4p15.1 (2 cases), 6q12 (2 cases), 6q12 (2 cases) and 6q16.3 (2 cases). This study clearly demonstrates the utility of array CGH in defining high-resolution DNA copy number changes and refining amplifications. The resolution of array CGH

  2. Manual annotation and analysis of the defensin gene cluster in the C57BL/6J mouse reference genome

    Directory of Open Access Journals (Sweden)

    Dougan Gordon

    2009-12-01

    Background: Host defense peptides are a critical component of the innate immune system. Human alpha- and beta-defensin genes are subject to copy number variation (CNV), and historically the organization of mouse alpha-defensin genes has been poorly defined. Here we present the first full manual genomic annotation of the mouse defensin region on Chromosome 8 of the reference strain C57BL/6J, and the analysis of the orthologous regions of the human and rat genomes. Problems were identified with the reference assemblies of all three genomes. Defensins have been studied for over two decades and their naming has become a critical issue due to incorrect identification of defensin genes derived from different mouse strains and the duplicated nature of this region. Results: The defensin gene cluster region on mouse Chromosome 8 A2 contains 98 gene loci: 53 are likely active defensin genes and 22 defensin pseudogenes. Several TATA box motifs were found for human and mouse defensin genes that likely impact gene expression. Three novel defensin genes belonging to the Cryptdin Related Sequences (CRS) family were identified. All additional mouse defensin loci on Chromosomes 1, 2 and 14 were annotated and unusual splice variants identified. Comparison of the mouse alpha-defensins in the three main mouse reference gene sets, Ensembl, Mouse Genome Informatics (MGI), and NCBI RefSeq, reveals significant inconsistencies in annotation and nomenclature. We are collaborating with the Mouse Genome Nomenclature Committee (MGNC) to establish a standardized naming scheme for alpha-defensins. Conclusions: Prior to this analysis, there was no reliable reference gene set available for the mouse strain C57BL/6J defensin genes, demonstrating that manual intervention is still critical for the annotation of complex gene families and heavily duplicated regions. Accurate gene annotation is facilitated by the annotation of pseudogenes and regulatory elements.
Manually curated gene

  3. Rapid mass spectrometric analysis of 15N-Leu incorporation fidelity during preparation of specifically labeled NMR samples

    Science.gov (United States)

    Truhlar, Stephanie M.E.; Cervantes, Carla F.; Torpey, Justin W.; Kjaergaard, Magnus; Komives, Elizabeth A.

    2008-01-01

    Advances in NMR spectroscopy have enabled the study of larger proteins that typically have significant overlap in their spectra. Specific 15N-amino acid incorporation is a powerful tool for reducing spectral overlap and attaining reliable sequential assignments. However, scrambling of the label during protein expression is a common problem. We describe a rapid method to evaluate the fidelity of specific 15N-amino acid incorporation. The selectively labeled protein is proteolyzed, and the resulting peptides are analyzed using MALDI mass spectrometry. The 15N incorporation is determined by analyzing the isotopic abundance of the peptides in the mass spectra using the program DEX. This analysis determined that expression with a 10-fold excess of unlabeled amino acids relative to the 15N-amino acid prevents the scrambling of the 15N label that is observed when equimolar amounts are used. MALDI TOF-TOF MS/MS data provide additional information that shows where the “extra” 15N labels are incorporated, which can be useful in confirming ambiguous assignments. The described procedure provides a rapid technique to monitor the fidelity of selective labeling that does not require a lot of protein. These advantages make it an ideal way of determining optimal expression conditions for selectively labeled NMR samples. PMID:18567787

  4. Initial analysis of copy number variation in the cow genome

    Science.gov (United States)

    As a complement to the Bovine HapMap Consortium project, we initiated a systematic study of the CNV within the same cattle population using array comparative genomic hybridization (array CGH). Oligonucleotide CGH arrays were designed and fabricated to cover all chromosomes with an average interval ...

  5. Analysis of Aspergillus nidulans metabolism at the genome-scale

    DEFF Research Database (Denmark)

    David, Helga; Ozcelik, İlknur Ş; Hofmann, Gerald;

    2008-01-01

    function. Results: In this work, we have manually assigned functions to 472 orphan genes in the metabolism of A. nidulans, by using a pathway-driven approach and by employing comparative genomics tools based on sequence similarity. The central metabolism of A. nidulans, as well as biosynthetic pathways of...

  6. Genomic Analysis of Secondary Metabolite Production by Pseudomonas fluorescens

    Science.gov (United States)

    Pseudomonas fluorescens is a diverse bacterial species known for its ubiquity in natural habitats and its production of secondary metabolites. The high degree of ecological and metabolic diversity represented in P. fluorescens is reflected in the genomic diversity displayed among strains. Certain st...

  7. Transcriptome and genome size analysis of the venus flytrap

    DEFF Research Database (Denmark)

    Jensen, Michael Krogh; Vogt, Josef Korbinian; Bressendorff, Simon;

    2015-01-01

    The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin's studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D...

  8. Sequencing and analysis of an Irish human genome.

    LENUS (Irish Health Repository)

    Tong, Pin

    2010-01-01

    Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.

  9. Online Genome Analysis Resources for Educators, a Comparative Review

    OpenAIRE

    Sarah Grace Prescott

    2012-01-01

    A comparative review of several companies that offer similar kits or services that allow students to isolate DNA (human and others), amplify it by PCR, and in some cases sequence the resulting sample. The companies include: Carolina® Biological Supply Company, Bio-Rad®, Edvotek® Inc., Hiram Genomics Store, and 23andMe.

  10. Nuclear genome size analysis of Agave tequilana Weber

    Czech Academy of Sciences Publication Activity Database

    Palomino, G.; Doležel, Jaroslav; Méndez, I.; Rubluo, A.

    2003-01-01

    Vol. 56, No. 1 (2003), pp. 37-46. ISSN 0008-7114 Other grants: Italy (IT) Z5038910 Institutional research plan: CEZ:AV0Z5038910 Keywords: Flow cytometry * nuclear genome size * Agave tequilana Subject RIV: EB - Genetics; Molecular Biology Impact factor: 0.337, year: 2003

  11. Complete sequence of the mitochondrial genome of Odontamblyopus rubicundus (Perciformes: Gobiidae): genome characterization and phylogenetic analysis

    Indian Academy of Sciences (India)

    Tianxing Liu; Xiaoxiao Jin; Rixin Wang; Tianjun Xu

    2013-12-01

    Odontamblyopus rubicundus is a gobiid fish species that inhabits muddy-bottomed coastal waters. In this paper, the first complete mitochondrial genome sequence of O. rubicundus is reported. The complete mitochondrial genome sequence is 17119 bp in length and contains 13 protein-coding genes, two rRNA genes, 22 tRNA genes, a control region and an L-strand origin, as in other teleosts. Most mitochondrial genes are encoded on the H-strand, except for ND6 and seven tRNA genes. Some overlaps, ranging from 1 to 7 bp, occur in protein-coding genes and tRNAs. The possibly nonfunctional L-strand origin folded into a typical stem-loop secondary structure, and a conserved motif (5′-GCCGG-3′) was found at the base of the stem within the $tRNA^{Cys}$ gene. The TAS, CSB-2 and CSB-3 could be detected in the control region. However, in contrast to most other fishes, the central conserved sequence block domain and the CSB-1 could not be recognized in O. rubicundus, which is consistent with Acanthogobius hasta (Gobiidae). In addition, phylogenetic analyses based on different sequences of species of Gobiidae and different methods showed that the classification of O. rubicundus into Odontamblyopus on the basis of morphology is debatable.

  12. Comparative Analysis of Codon Usage Bias Patterns in Microsporidian Genomes.

    Directory of Open Access Journals (Sweden)

    Heng Xiang

    The sub-3 Mbp genomes from microsporidian species of the Encephalitozoon genus are the smallest known among eukaryotes and paragons of genomic reduction and compaction in parasites. However, their diminutive stature is not characteristic of all Microsporidia, whose genome sizes vary by an order of magnitude. This large variability suggests that different evolutionary forces are applied on the group as a whole. In this study, we have compared the codon usage bias (CUB) between eight taxonomically distinct microsporidian genomes: Encephalitozoon intestinalis, Encephalitozoon cuniculi, Spraguea lophii, Trachipleistophora hominis, Enterocytozoon bieneusi, Nematocida parisii, Nosema bombycis and Nosema ceranae. While the CUB was found to be weak in all eight Microsporidia, nearly all (98%) of the optimal codons in S. lophii, T. hominis, E. bieneusi, N. parisii, N. bombycis and N. ceranae favor A/U in the third position, whereas most (64.6%) optimal codons in the Encephalitozoon species E. intestinalis and E. cuniculi are biased towards G/C. Although nucleotide composition biases are likely the main factor driving the CUB in Microsporidia according to correlation analyses, directed mutational pressure also likely affects the CUB as suggested by ENc-plots, correspondence and neutrality analyses. Overall, the Encephalitozoon genomes were found to be markedly different from the other microsporidians and, despite being the first sequenced representatives of this lineage, are uncharacteristic of the group as a whole. The disparities observed cannot be attributed solely to differences in host specificity and we hypothesize that other forces are at play in the lineage leading to Encephalitozoon species.

  13. Comparative Analysis of Codon Usage Bias Patterns in Microsporidian Genomes.

    Science.gov (United States)

    Xiang, Heng; Zhang, Ruizhi; Butler, Robert R; Liu, Tie; Zhang, Li; Pombert, Jean-François; Zhou, Zeyang

    2015-01-01

    The sub-3 Mbp genomes from microsporidian species of the Encephalitozoon genus are the smallest known among eukaryotes and paragons of genomic reduction and compaction in parasites. However, their diminutive stature is not characteristic of all Microsporidia, whose genome sizes vary by an order of magnitude. This large variability suggests that different evolutionary forces are applied on the group as a whole. In this study, we have compared the codon usage bias (CUB) between eight taxonomically distinct microsporidian genomes: Encephalitozoon intestinalis, Encephalitozoon cuniculi, Spraguea lophii, Trachipleistophora hominis, Enterocytozoon bieneusi, Nematocida parisii, Nosema bombycis and Nosema ceranae. While the CUB was found to be weak in all eight Microsporidia, nearly all (98%) of the optimal codons in S. lophii, T. hominis, E. bieneusi, N. parisii, N. bombycis and N. ceranae favor A/U in the third position, whereas most (64.6%) optimal codons in the Encephalitozoon species E. intestinalis and E. cuniculi are biased towards G/C. Although nucleotide composition biases are likely the main factor driving the CUB in Microsporidia according to correlation analyses, directed mutational pressure also likely affects the CUB as suggested by ENc-plots, correspondence and neutrality analyses. Overall, the Encephalitozoon genomes were found to be markedly different from the other microsporidians and, despite being the first sequenced representatives of this lineage, are uncharacteristic of the group as a whole. The disparities observed cannot be attributed solely to differences in host specificity and we hypothesize that other forces are at play in the lineage leading to Encephalitozoon species. PMID:26057384
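    The contrast drawn above between A/U-ending and G/C-ending codons can be illustrated by measuring GC3, the fraction of codons in a coding sequence whose third base is G or C. The following is a minimal sketch only, not the authors' pipeline: the function name `gc3_fraction` and the example sequences are hypothetical, and the abstract does not state how the statistics were actually computed.

```python
def gc3_fraction(cds: str) -> float:
    """Return the fraction of codons whose third base is G or C.

    Hypothetical helper for illustration. Accepts DNA or RNA letters;
    trailing bases that do not complete a codon are ignored.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop an incomplete final codon
    third_bases = [seq[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        return 0.0
    return sum(base in "GC" for base in third_bases) / len(third_bases)


if __name__ == "__main__":
    # Made-up sequences: the first ends most codons in G/C, the second in A/T.
    print(gc3_fraction("ATGGCCAAG"))
    print(gc3_fraction("ATGGCTAAA"))
```

    A genome whose optimal codons favor A/U in the third position would tend to show low GC3 values across its coding sequences, and a G/C-biased genome would show high ones.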

  14. Evidence and evolutionary analysis of ancient whole-genome duplication in barley predating the divergence from rice

    Directory of Open Access Journals (Sweden)

    Grosse Ivo

    2009-08-01

    Background: Well preserved genomic colinearity among agronomically important grass species such as rice, maize, Sorghum, wheat and barley provides access to whole-genome structure information even in species lacking a reference genome sequence. We investigated footprints of whole-genome duplication (WGD) in barley that shaped the cereal ancestor genome by analyzing shared synteny with rice using a ~2000 gene-based barley genetic map and the rice genome reference sequence. Results: Based on a recent annotation of the rice genome, we reviewed the WGD in rice and identified 24 pairs of duplicated genomic segments involving 70% of the rice genome. Using 968 putative orthologous gene pairs, synteny covered 89% of the barley genetic map and 63% of the rice genome. We found strong evidence for seven shared segmental genome duplications, corresponding to more than 50% of the segmental genome duplications previously determined in rice. Analysis of synonymous substitution rates (Ks) suggested that shared duplications originated before the divergence of these two species. While major genome rearrangements affected the ancestral genome of both species, small paracentric inversions were found to be species specific. Conclusion: We provide a thorough analysis of comparative genome evolution between barley and rice. A barley genetic map of approximately 2000 non-redundant EST sequences provided sufficient density to allow a detailed view of shared synteny with the rice genome. Using an indirect approach that included the localization of WGD-derived duplicated genome segments in the rice genome, we determined the current extent of shared WGD-derived genome duplications that occurred prior to species divergence.

  15. A genome-wide association study of autism incorporating autism diagnostic interview-revised, autism diagnostic observation schedule, and social responsiveness scale.

    Science.gov (United States)

    Connolly, John J; Glessner, Joseph T; Hakonarson, Hakon

    2013-01-01

    Efforts to understand the causes of autism spectrum disorders (ASDs) have been hampered by genetic complexity and heterogeneity among individuals. One strategy for reducing complexity is to target endophenotypes, simpler biologically based measures that may involve fewer genes and constitute a more homogenous sample. A genome-wide association study of 2,165 participants (mean age = 8.95 years) examined associations between genomic loci and individual assessment items from the Autism Diagnostic Interview-Revised, Autism Diagnostic Observation Schedule, and Social Responsiveness Scale. Significant associations with a number of loci were identified, including KCND2 (overly serious facial expressions), NOS2A (loss of motor skills), and NELL1 (faints, fits, or blackouts). These findings may help prioritize directions for future genomic efforts. PMID:22935194

  16. Comparative genome analysis of Spiroplasma melliferum IPMB4A, a honeybee-associated bacterium

    Directory of Open Access Journals (Sweden)

    Lo Wen-Sui

    2013-01-01

    Background: The genus Spiroplasma contains a group of helical, motile, and wall-less bacteria in the class Mollicutes. Similar to other members of this class, such as the animal-pathogenic Mycoplasma and the plant-pathogenic ‘Candidatus Phytoplasma’, all characterized Spiroplasma species were found to be associated with eukaryotic hosts. While most of the Spiroplasma species appeared to be harmless commensals of insects, a small number of species have evolved pathogenicity toward various arthropods and plants. In this study, we isolated a novel strain of honeybee-associated S. melliferum and investigated its genetic composition and evolutionary history by whole-genome shotgun sequencing and comparative analysis with other Mollicutes genomes. Results: The whole-genome shotgun sequencing of S. melliferum IPMB4A produced a draft assembly that was ~1.1 Mb in size and covered ~80% of the chromosome. Similar to other Spiroplasma genomes that have been studied to date, we found that this genome contains abundant repetitive sequences that originated from plectrovirus insertions. These phage fragments represented a major obstacle in obtaining a complete genome sequence of Spiroplasma with the current sequencing technology. Comparative analysis of S. melliferum IPMB4A with other Spiroplasma genomes revealed that these phages may have facilitated extensive genome rearrangements in these bacteria and contributed to horizontal gene transfers that led to species-specific adaptation to different eukaryotic hosts. In addition, comparison of gene content with other Mollicutes suggested that the common ancestor of the SEM (Spiroplasma, Entomoplasma, and Mycoplasma) clade may have had a relatively large genome and flexible metabolic capacity; the extremely reduced genomes of present day Mycoplasma and ‘Candidatus Phytoplasma’ species are likely to be the result of independent gene losses in these lineages. Conclusions: The findings in this study

  17. Organization and comparative analysis of the mitochondrial genomes of bioluminescent Elateroidea (Coleoptera: Polyphaga).

    Science.gov (United States)

    Amaral, Danilo T; Mitani, Yasuo; Ohmiya, Yoshihiro; Viviani, Vadim R

    2016-07-25

    Mitochondrial genome organization in the Elateroidea superfamily (Coleoptera), which includes the main families of bioluminescent beetles, has been poorly studied, and information about the family Phengodidae is lacking. We sequenced the mitochondrial genomes of Neotropical Lampyridae (Bicellonycha lividipennis), Phengodidae (Brasilocerus sp.2 and Phrixothrix hirtus) and Elateridae (Pyrearinus termitilluminans, Hapsodrilus ignifer and Teslasena femoralis). All species had a typical insect mitochondrial genome except for the following: in the elaterid T. femoralis genome there is a non-coding region between NADH2 and tRNA-Trp; in the genomes of the phengodids Brasilocerus sp.2 and P. hirtus we did not find the tRNA-Ile and tRNA-Gln. The P. hirtus genome showed a ~1.6kb non-coding region, the rearrangement of tRNA-Tyr, a new tRNA-Leu copy, and several regions with higher AT contents. Phylogenetic analysis using Bayesian and ML models indicated that the Phengodidae+Rhagophthalmidae are closely related to the family Lampyridae, and included Drilus flavescens (Drilidae) as an internal clade within Elateridae. This is the first report that compares the mitochondrial genome organization of the three main families of bioluminescent Elateroidea, including the first Neotropical Lampyridae and Phengodidae. The losses of tRNAs, and the translocation and duplication events found in Phengodidae mt genomes, mainly in P. hirtus, may indicate different evolutionary rates in these mitochondrial genomes. The mitophylogenomic analysis indicates the monophyly of the three bioluminescent families and a closer relationship between Lampyridae and Phengodidae/Rhagophthalmidae, in contrast with previous molecular analyses. PMID:27060405

  18. Functional analysis of gapped microbial genomes: Amino acid metabolism of Thiobacillus ferrooxidans

    OpenAIRE

    Selkov, Evgeni; Overbeek, Ross; Kogan, Yakov; Chu, Lien; Vonstein, Veronika; Holmes, David; Silver, Simon; Haselkorn, Robert; Fonstein, Michael

    2000-01-01

    A gapped genome sequence of the biomining bacterium Thiobacillus ferrooxidans strain ATCC23270 was assembled from sheared DNA fragments (3.2-times coverage) into 1,912 contigs. A total of 2,712 potential genes (ORFs) were identified in 2.6 Mbp (megabase pairs) of Thiobacillus genomic sequence. Of these genes, 2,159 could be assigned functions by using the WIT-Pro/EMP genome analysis system, most with a high degree of certainty. Nine hundred of the genes have been assigned roles in metabolic p...

  19. VIBA-LAB2: a virtual ion beam analysis laboratory software package incorporating elemental map simulations

    International Nuclear Information System (INIS)

    The software package VIBA-lab1, which incorporates PIXE and RBS energy spectra simulation, has now been extended to include the simulation of elemental maps from 3D structures. VIBA-lab1 allows the user to define a wide variety of experimental parameters, e.g. energy and species of incident ions, excitation and detection geometry, etc. When the relevant experimental parameters as well as target composition are defined, the program can then simulate the corresponding PIXE and RBS spectra. VIBA-LAB2 has been written with applications in nuclear microscopy in mind. A set of drag-and-drop tools has been incorporated to allow the user to define a three-dimensional sample object of mixed elemental composition. PIXE energy spectra simulations are then carried out on a pixel-by-pixel basis and the corresponding intensity distributions or elemental maps can be computed. Several simulated intensity distributions for some 3D objects are demonstrated, and simulations obtained from a simple IC are compared with experimental results.

  20. Whole genome sequence analysis of unidentified genetically modified papaya for development of a specific detection method.

    Science.gov (United States)

    Nakamura, Kosuke; Kondo, Kazunari; Akiyama, Hiroshi; Ishigaki, Takumi; Noguchi, Akio; Katsumata, Hiroshi; Takasaki, Kazuto; Futo, Satoshi; Sakata, Kozue; Fukuda, Nozomi; Mano, Junichi; Kitta, Kazumi; Tanaka, Hidenori; Akashi, Ryo; Nishimaki-Mogami, Tomoko

    2016-08-15

    Identification of transgenic sequences in an unknown genetically modified (GM) papaya (Carica papaya L.) by whole genome sequence analysis was demonstrated. Whole genome sequence data were generated for a GM-positive fresh papaya fruit commodity detected in monitoring using real-time polymerase chain reaction (PCR). The sequences obtained were mapped against an open database for papaya genome sequence. Transgenic construct- and event-specific sequences were identified as a GM papaya developed to resist infection from a Papaya ringspot virus. Based on the transgenic sequences, a specific real-time PCR detection method for GM papaya applicable to various food commodities was developed. Whole genome sequence analysis enabled identifying unknown transgenic construct- and event-specific sequences in GM papaya and development of a reliable method for detecting them in papaya food commodities. PMID:27006240

  1. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India

    Directory of Open Access Journals (Sweden)

    Sarwar Azam

    2016-01-01

    Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis.

  2. Quantitative analysis of polycomb response elements (PREs) at identical genomic locations distinguishes contributions of PRE sequence and genomic environment

    Directory of Open Access Journals (Sweden)

    Okulski Helena

    2011-03-01

    Background: Polycomb/Trithorax response elements (PREs) are cis-regulatory elements essential for the regulation of several hundred developmentally important genes. However, the precise sequence requirements for PRE function are not fully understood, and it is also unclear whether these elements all function in a similar manner. Drosophila PRE reporter assays typically rely on random integration by P-element insertion, but PREs are extremely sensitive to genomic position. Results: We adapted the ΦC31 site-specific integration tool to enable systematic quantitative comparison of PREs and sequence variants at identical genomic locations. In this adaptation, a miniwhite (mw) reporter in combination with eye-pigment analysis gives a quantitative readout of PRE function. We compared the Hox PRE Frontabdominal-7 (Fab-7) with a PRE from the vestigial (vg) gene at four landing sites. The analysis revealed that the Fab-7 and vg PREs have fundamentally different properties, both in terms of their interaction with the genomic environment at each site and their inherent silencing abilities. Furthermore, we used the ΦC31 tool to examine the effect of deletions and mutations in the vg PRE, identifying a 106 bp region containing a previously predicted motif (GTGT) that is essential for silencing. Conclusions: This analysis showed that different PREs have quantifiably different properties, and that changes in as few as four base pairs have profound effects on PRE function, thus illustrating the power and sensitivity of ΦC31 site-specific integration as a tool for the rapid and quantitative dissection of elements of PRE design.

  3. Registered plant list - PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods Registered plant... simple search page) Genome analysis methods Presence or absence of Genome analysis methods information in t...his DB (link to the Genome analysis methods information in simple search page)

  4. A genome-wide 20 K citrus microarray for gene expression analysis

    OpenAIRE

    Martínez-Godoy, M. Ángeles; Mauri, Nuria; Juárez, José; Marqués, M. Carmen; Santiago, Julia; Forment, Javier; Gadea Vacas, José

    2008-01-01

    Background: Understanding the genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with high genome coverage. In recent years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results: We have designed and constructed a publicly available ...

  5. Meta-analysis of 32 genome-wide linkage studies of schizophrenia

    OpenAIRE

    Ng, M.Y.M.; Levinson, D. F.; Faraone, S.V.; Suarez, B.K.; DeLisi, L.E.; Arinami, T; Riley, B; Paunio, T; Pulver, A E; Irmansyah; Holmans, P. A.; M. Escamilla; Wildenauer, D. B.; Williams, N. M.; Laurent, C.

    2009-01-01

    A genome scan meta-analysis (GSMA) was carried out on 32 independent genome-wide linkage scan analyses that included 3255 pedigrees with 7413 genotyped cases affected with schizophrenia (SCZ) or related disorders. The primary GSMA divided the autosomes into 120 bins, rank-ordered the bins within each study according to the most positive linkage result in each bin, summed these ranks (weighted for study size) for each bin across studies and determined the empirical probability of a given summe...
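    The rank-and-sum step described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the study's own code: the function name `weighted_summed_ranks`, the input layout, and in particular the weighting (square root of study size, normalized to a mean of 1) are hypothetical, since the abstract says only that ranks were "weighted for study size". Ties are broken by bin order for simplicity, and the empirical-probability step is omitted because the abstract is truncated before describing it.

```python
import numpy as np


def weighted_summed_ranks(scores, study_sizes):
    """Rank bins within each study, weight by study size, sum across studies.

    scores: array of shape (n_studies, n_bins) holding the most positive
        linkage result observed in each bin for each study.
    study_sizes: array of shape (n_studies,), e.g. genotyped cases per study.
    Returns an array of shape (n_bins,); larger values mean stronger
    cross-study evidence for that bin.
    """
    scores = np.asarray(scores, dtype=float)
    sizes = np.asarray(study_sizes, dtype=float)
    # Within-study ranks: 1 = weakest bin, n_bins = strongest bin.
    ranks = scores.argsort(axis=1).argsort(axis=1) + 1
    # ASSUMPTION: sqrt(size) weights scaled to mean 1 (not given in the abstract).
    weights = np.sqrt(sizes)
    weights = weights / weights.mean()
    return (ranks * weights[:, None]).sum(axis=0)


if __name__ == "__main__":
    # Toy example with 2 studies and 3 bins (the study used 32 scans, 120 bins).
    toy_scores = [[0.1, 2.0, 1.0],
                  [0.5, 0.2, 3.0]]
    print(weighted_summed_ranks(toy_scores, [100, 100]))
```

    Ranking within each study before combining puts scans that report different linkage statistics on a common scale, which is why the method can pool heterogeneous studies.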

  6. Meta-analysis of genome-wide association studies of attention deficit/hyperactivity disorder

    OpenAIRE

    Neale, Benjamin M.; Medland, Sarah E.; Ripke, Stephan; Asherson, Philip; Franke, Barbara; Lesch, Klaus-Peter; Faraone, Stephen V.; Nguyen, Thuy Trang; Schaefer, Helmut; Holmans, Peter; Daly, Mark; Steinhausen, Hans-Christoph; Freitag, Christine; Reif, Andreas; Tobias J Renner

    2010-01-01

    Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. As prior genome-wide association studies (GWAS) have not yielded significant results, we conducted a meta-analysis of existing studies to boost statistical power. Method: We used data from four projects: a) the Children's Hospital of Philadelphia (CHOP); b) phase I of ...

  7. What Makes a Bacterial Species Pathogenic?: Comparative Genomic Analysis of the Genus Leptospira

    OpenAIRE

    Fouts, Derrick E.; Matthias, Michael A.; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E.; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L.; Haake, David A.; Haft, Daniel H.; Hartskeerl, Rudy; Ko, Albert I.; Levett, Paul N

    2016-01-01

    Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field....

  8. Comparative genomic analysis of two-component regulatory proteins in Pseudomonas syringae

    Directory of Open Access Journals (Sweden)

    Ussery David W

    2007-10-01

    Background: Pseudomonas syringae is a widespread bacterial plant pathogen, and strains of P. syringae may be assigned to different pathovars based on host specificity among different plant species. The genomes of P. syringae pv. syringae (Psy) B728a, pv. tomato (Pto) DC3000 and pv. phaseolicola (Pph) 1448A have been recently sequenced, providing a major resource for comparative genomic analysis. A mechanism commonly found in bacteria for signal transduction is the two-component system (TCS), which typically consists of a sensor histidine kinase (HK) and a response regulator (RR). P. syringae requires a complex array of TCS proteins to cope with diverse plant hosts, host responses, and environmental conditions. Results: Based on the genomic data, pattern searches with Hidden Markov Model (HMM) profiles have been used to identify putative HKs and RRs. The genomes of Psy B728a, Pto DC3000 and Pph 1448A were found to contain a large number of genes encoding TCS proteins, and a core of complete TCS proteins was shared between these genomes: 30 putative TCS clusters, 11 orphan HKs, 33 orphan RRs, and 16 hybrid HKs. A close analysis of the distribution of genes encoding TCS proteins revealed important differences in TCS proteins among the three P. syringae pathovars. Conclusion: In this article we present a thorough analysis of the identification and distribution of TCS proteins among the sequenced genomes of P. syringae. We have identified differences in TCS proteins among the three P. syringae pathovars that may contribute to their diverse host ranges and association with plant hosts. The identification and analysis of the repertoire of TCS proteins in the genomes of P. syringae pathovars constitute a basis for future functional genomic studies of the signal transduction pathways in this important bacterial phytopathogen.

  9. Systematic Pharmacogenomics Analysis of a Malay Whole Genome: Proof of Concept for Personalized Medicine

    OpenAIRE

    Salleh, Mohd Zaki; Teh, Lay Kek; Lee, Lian Shien; Ismet, Rose Iszati; Patowary, Ashok; Joshi, Kandarp; Pasha, Ayesha; Ahmed, Azni Zain; Janor, Roziah Mohd; Hamzah, Ahmad Sazali; Adam, Aishah; Yusoff, Khalid; Hoh, Boon Peng; Hatta, Fazleen Haslinda Mohd; Ismail, Mohamad Izwan

    2013-01-01

    Background With higher throughput and lower cost in sequencing, second generation sequencing technology has immense potential for translation into clinical practice and for the realization of pharmacogenomics-based patient care. The systematic analysis of whole genome sequences to assess patient-to-patient variability in pharmacokinetic and pharmacodynamic responses to drugs would be the next step in future medicine, in line with the vision of personalizing medicine. Methods Genomic DN...

  10. VISPA: a computational pipeline for the identification and analysis of genomic vector integration sites

    OpenAIRE

    Calabria, A.; Leo, S.; Benedicenti, F; Cesana, D.; Spinozzi, G; Orsini, M.(Laboratori Nazionali del Gran Sasso, Assergi, AQ, 67010, Italy); Merella, S; Stupka, E.; G. Zanetti; Montini, E

    2014-01-01

    The analysis of the genomic distribution of viral vector integration sites is a key step in hematopoietic stem cell-based gene therapy applications, making it possible to assess both the safety and the efficacy of the treatment and to study the basic aspects of hematopoiesis and stem cell biology. Identifying vector integration sites requires ad-hoc bioinformatics tools with stringent requirements in terms of computational efficiency, flexibility, and usability. We developed VISPA (Vector Integ...

  11. Multivariate genomic model improves analysis of oil palm (Elaeis guineensis Jacq.) progeny tests

    OpenAIRE

    Marchal, Alexandre; Legarra Albizu, Andres; Tisne, Sebastien; Carasco-Lacombe, Catherine; Manez, Aurore; Suryana, Edyana; Omoré, Alphonse; Nouy, Bruno; Durand-Gasselin, Tristan; Sanchez, Leopoldo; Bouvet, Jean-Marc; Cros, David

    2016-01-01

    Genomic selection is promising for plant breeding, particularly for perennial crops. Multivariate analysis, which considers several traits jointly, takes advantage of the genetic correlations to increase accuracy. The aim of this study was to empirically evaluate the potential of a univariate and multivariate genomic mixed model (G-BLUP) compared to the traditional univariate pedigree-based BLUP (T-BLUP) when analyzing progeny tests of oil palm, the world’s major oil crop. The dataset compris...

  12. IslandViewer 3: more flexible, interactive genomic island discovery, visualization and analysis

    OpenAIRE

    Dhillon, Bhavjinder K.; Laird, Matthew R.; Shay, Julie A.; Winsor, Geoffrey L; Lo, Raymond; Nizam, Fazmin; Pereira, Sheldon K.; Waglechner, Nicholas; McArthur, Andrew G.; Langille, Morgan G I; Brinkman, Fiona S. L.

    2015-01-01

    IslandViewer (http://pathogenomics.sfu.ca/islandviewer) is a widely used web-based resource for the prediction and analysis of genomic islands (GIs) in bacterial and archaeal genomes. GIs are clusters of genes of probable horizontal origin, and are of high interest since they disproportionately encode genes involved in medically and environmentally important adaptations, including antimicrobial resistance and virulence. We now report a major new release of IslandViewer, since the last release...

  13. MultiPipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences

    OpenAIRE

    Schwartz, Scott; Elnitski, Laura; Li, Mei; Weirauch, Matt; Riemer, Cathy; Smit, Arian; Green, Eric D; Hardison, Ross C.; Miller, Webb

    2003-01-01

    Analysis of multiple sequence alignments can generate important, testable hypotheses about the phylogenetic history and cellular function of genomic sequences. We describe the MultiPipMaker server, which aligns multiple, long genomic DNA sequences quickly and with good sensitivity (available at http://bio.cse.psu.edu/ since May 2001). Alignments are computed between a contiguous reference sequence and one or more secondary sequences, which can be finished or draft sequence. The outputs includ...

  14. Structural analysis of electrophoretic variation in the genome profiles of rotavirus field isolates.

    OpenAIRE

    Clarke, I. N.; McCrae, M A

    1982-01-01

    Detailed structural studies were undertaken on five isolates of bovine rotavirus which showed variability in the migration patterns of their genome segments on electrophoresis in polyacrylamide gels. The individual genome segments of each isolate were characterized by partial digestion of terminally radiolabeled RNA with a base-specific nuclease. This analysis showed that whereas mobility variations were always associated with detectable changes in nucleotide sequence, sequence changes at lea...

  15. Comparative genomic analysis of four representative plant growth-promoting rhizobacteria in Pseudomonas

    OpenAIRE

    Shen, Xuemei; Hu, Hongbo; Peng, Huasong; Wang, Wei; Zhang, Xuehong

    2013-01-01

    Background Some Pseudomonas strains function as predominant plant growth-promoting rhizobacteria (PGPR). Within this group, Pseudomonas chlororaphis and Pseudomonas fluorescens are non-pathogenic biocontrol agents, and some Pseudomonas aeruginosa and Pseudomonas stutzeri strains are PGPR. P. chlororaphis GP72 is a plant growth-promoting rhizobacterium with a fully sequenced genome. We conducted a genomic analysis comparing GP72 with three other pseudomonad PGPR: P. fluorescens Pf-5, P. aerugi...

  16. A comprehensive 1,000 Genomes-based genome-wide association meta-analysis of coronary artery disease.

    Science.gov (United States)

    Nikpay, Majid; Goel, Anuj; Won, Hong-Hee; Hall, Leanne M; Willenborg, Christina; Kanoni, Stavroula; Saleheen, Danish; Kyriakou, Theodosios; Nelson, Christopher P; Hopewell, Jemma C; Webb, Thomas R; Zeng, Lingyao; Dehghan, Abbas; Alver, Maris; Armasu, Sebastian M; Auro, Kirsi; Bjonnes, Andrew; Chasman, Daniel I; Chen, Shufeng; Ford, Ian; Franceschini, Nora; Gieger, Christian; Grace, Christopher; Gustafsson, Stefan; Huang, Jie; Hwang, Shih-Jen; Kim, Yun Kyoung; Kleber, Marcus E; Lau, King Wai; Lu, Xiangfeng; Lu, Yingchang; Lyytikäinen, Leo-Pekka; Mihailov, Evelin; Morrison, Alanna C; Pervjakova, Natalia; Qu, Liming; Rose, Lynda M; Salfati, Elias; Saxena, Richa; Scholz, Markus; Smith, Albert V; Tikkanen, Emmi; Uitterlinden, Andre; Yang, Xueli; Zhang, Weihua; Zhao, Wei; de Andrade, Mariza; de Vries, Paul S; van Zuydam, Natalie R; Anand, Sonia S; Bertram, Lars; Beutner, Frank; Dedoussis, George; Frossard, Philippe; Gauguier, Dominique; Goodall, Alison H; Gottesman, Omri; Haber, Marc; Han, Bok-Ghee; Huang, Jianfeng; Jalilzadeh, Shapour; Kessler, Thorsten; König, Inke R; Lannfelt, Lars; Lieb, Wolfgang; Lind, Lars; Lindgren, Cecilia M; Lokki, Marja-Liisa; Magnusson, Patrik K; Mallick, Nadeem H; Mehra, Narinder; Meitinger, Thomas; Memon, Fazal-ur-Rehman; Morris, Andrew P; Nieminen, Markku S; Pedersen, Nancy L; Peters, Annette; Rallidis, Loukianos S; Rasheed, Asif; Samuel, Maria; Shah, Svati H; Sinisalo, Juha; Stirrups, Kathleen E; Trompet, Stella; Wang, Laiyuan; Zaman, Khan S; Ardissino, Diego; Boerwinkle, Eric; Borecki, Ingrid B; Bottinger, Erwin P; Buring, Julie E; Chambers, John C; Collins, Rory; Cupples, L Adrienne; Danesh, John; Demuth, Ilja; Elosua, Roberto; Epstein, Stephen E; Esko, Tõnu; Feitosa, Mary F; Franco, Oscar H; Franzosi, Maria Grazia; Granger, Christopher B; Gu, Dongfeng; Gudnason, Vilmundur; Hall, Alistair S; Hamsten, Anders; Harris, Tamara B; Hazen, Stanley L; Hengstenberg, Christian; Hofman, Albert; Ingelsson, Erik; Iribarren, Carlos; Jukema, J Wouter; Karhunen, Pekka J; Kim, Bong-Jo; Kooner, Jaspal S; Kullo, Iftikhar J; Lehtimäki, Terho; Loos, Ruth J F; Melander, Olle; Metspalu, Andres; März, Winfried; Palmer, Colin N; Perola, Markus; Quertermous, Thomas; Rader, Daniel J; Ridker, Paul M; Ripatti, Samuli; Roberts, Robert; Salomaa, Veikko; Sanghera, Dharambir K; Schwartz, Stephen M; Seedorf, Udo; Stewart, Alexandre F; Stott, David J; Thiery, Joachim; Zalloua, Pierre A; O'Donnell, Christopher J; Reilly, Muredach P; Assimes, Themistocles L; Thompson, John R; Erdmann, Jeanette; Clarke, Robert; Watkins, Hugh; Kathiresan, Sekar; McPherson, Ruth; Deloukas, Panos; Schunkert, Heribert; Samani, Nilesh J; Farrall, Martin

    2015-10-01

    Existing knowledge of genetic variants affecting risk of coronary artery disease (CAD) is largely based on genome-wide association study (GWAS) analysis of common SNPs. Leveraging phased haplotypes from the 1000 Genomes Project, we report a GWAS meta-analysis of ∼185,000 CAD cases and controls, interrogating 6.7 million common (minor allele frequency (MAF) > 0.05) and 2.7 million low-frequency (0.005 < MAF < 0.05) variants. In addition to confirming most known CAD-associated loci, we identified ten new loci (eight additive and two recessive) that contain candidate causal genes newly implicating biological processes in vessel walls. We observed intralocus allelic heterogeneity but little evidence of low-frequency variants with larger effects and no evidence of synthetic association. Our analysis provides a comprehensive survey of the fine genetic architecture of CAD, showing that genetic susceptibility to this common disease is largely determined by common SNPs of small effect size. PMID:26343387
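
    The two frequency classes quoted in this abstract can be written as a small binning function. This is only an illustrative sketch: the function name and the "excluded" label are assumptions, and the abstract does not say how a variant with MAF exactly equal to 0.05 was handled, so that boundary is treated here as unassigned.

```python
def classify_maf(maf: float) -> str:
    """Bin a variant by minor allele frequency (MAF) using the abstract's thresholds."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    if maf > 0.05:
        return "common"          # MAF > 0.05
    if 0.005 < maf < 0.05:
        return "low-frequency"   # 0.005 < MAF < 0.05
    # MAF <= 0.005, or exactly 0.05 (boundary not specified in the abstract)
    return "excluded"


if __name__ == "__main__":
    for value in (0.20, 0.01, 0.001):
        print(value, classify_maf(value))
```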

  17. A Genome-Wide Association Study of Autism Incorporating Autism Diagnostic Interview-Revised, Autism Diagnostic Observation Schedule, and Social Responsiveness Scale

    Science.gov (United States)

    Connolly, John J.; Glessner, Joseph T.; Hakonarson, Hakon

    2013-01-01

    Efforts to understand the causes of autism spectrum disorders (ASDs) have been hampered by genetic complexity and heterogeneity among individuals. One strategy for reducing complexity is to target endophenotypes, simpler biologically based measures that may involve fewer genes and constitute a more homogenous sample. A genome-wide association…

  18. Geographic isolates of Lymantria dispar multiple nucleopolyhedrovirus: Genome sequence analysis and pathogenicity against European and Asian gypsy moth strains

    Science.gov (United States)

    To evaluate the genetic diversity of Lymantria dispar nucleopolyhedrovirus (LdMNPV) at the genomic level, the genomes of three isolates of...

  19. Genome analysis of the Anaerobic Thermohalophilic bacterium Halothermothrix orenii

    Energy Technology Data Exchange (ETDEWEB)

    Mavromatis, Konstantinos; Ivanova, Natalia; Anderson, Iain; Lykidis, Athanasios; Hooper, Sean D.; Sun, Hui; Kunin, Victor; Lapidus, Alla; Hugenholtz, Philip; Patel, Bharat; Kyrpides, Nikos C.

    2008-11-03

    Halothermothrix orenii is a strictly anaerobic thermohalophilic bacterium isolated from sediment of a Tunisian salt lake. It belongs to the order Halanaerobiales in the phylum Firmicutes. The complete sequence revealed that the genome consists of one circular chromosome of 2578146 bps encoding 2451 predicted genes. This is the first genome sequence of an organism belonging to the Halanaerobiales. Features of both Gram positive and Gram negative bacteria were identified, the most prominent being the presence of both a sporulating mechanism typical of Firmicutes and a characteristic Gram negative lipopolysaccharide. Protein sequence analyses and metabolic reconstruction reveal a unique combination of strategies for thermophilic and halophilic adaptation. H. orenii can serve as a model organism for the study of the evolution of the Gram negative phenotype as well as of adaptation to thermohalophilic conditions and the development of biotechnological applications under conditions that require high temperatures and high salt concentrations.

  20. In silico analysis of SSRs in mitochondrial genomes of fishes.

    Science.gov (United States)

    Nagpure, Naresh Sahebrao; Rashid, Iliyas; Pathak, Ajey Kumar; Singh, Mahender; Singh, Shri Prakash; Sarkar, Uttam Kumar

    2015-04-01

    The availability of fish mitochondrial (mt) genomes provides an opportunity to explore simple sequence repeats (SSRs). In the present study, mt genomes of 85 fish species reported from the Indian subcontinent were downloaded from NCBI and computationally analysed to determine SSR types, frequency of occurrence, mutation and evolutionary adaptation across species. A total of 92 microsatellites in different nucleotide combinations were detected in 59 species. 26 interspersed SSRs, mostly poly (AT)n, were found in the D-loop regions in the species of Cyprinidae. Fifty-six SSRs of 12 bp fixed length were observed in eight genes only. Further, identical repeat motifs were found at the same location in ATP6 and ND4 genes, which were biased towards a particular habitat. The comparison of ATP6 and ND4 gene sets to other homologous sequences showed point mutations. This study explores SSR discovery and the utility of SSRs as markers for species and population identification. PMID:24660911
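
    As a rough illustration of what an in silico SSR search involves, the sketch below scans a sequence for perfect tandem repeats. It is not the tool used in the study, which the abstract does not name; the function name, the 1 to 6 bp motif range and the 12 bp default minimum length (chosen to echo the 12 bp SSRs mentioned above) are assumptions.

```python
def find_ssrs(seq, min_len=12, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    A repeat is reported when a motif of 1..max_motif bases occurs at least
    twice in a row and the whole run spans at least min_len bases. When
    several motif sizes describe the same run, the shortest motif is kept.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    i = 0
    while i < n:
        best = None  # (start, motif, run_length)
        for k in range(1, max_motif + 1):
            motif = seq[i:i + k]
            # Stop at the sequence end or at ambiguous bases such as N.
            if len(motif) < k or set(motif) - set("ACGT"):
                break
            j = i + k
            while seq[j:j + k] == motif:
                j += k
            length = j - i
            if length >= min_len and length >= 2 * k:
                if best is None or length > best[2]:
                    best = (i, motif, length)
        if best:
            hits.append((best[0], best[1], best[2] // len(best[1])))
            i += best[2]  # continue after the reported repeat
        else:
            i += 1
    return hits


if __name__ == "__main__":
    print(find_ssrs("GGC" + "AT" * 6 + "CCG"))
```

    Scanning every start position, rather than relying on non-overlapping pattern matches, avoids missing a repeat that begins one base after a short unrelated run.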

  1. Microsporidian genome analysis reveals evolutionary strategies for obligate intracellular growth

    OpenAIRE

    Cuomo, Christina A.; Desjardins, Christopher A.; Bakowski, Malina A.; Goldberg, Jonathan; Ma, Amy T.; Becnel, James J.; Didier, Elizabeth S.; Fan, Lin; Heiman, David I.; Levin, Joshua Z.; Young, Sarah; Zeng, Qiandong; Troemel, Emily R.

    2012-01-01

    Microsporidia comprise a large phylum of obligate intracellular eukaryotes that are fungal-related parasites responsible for widespread disease, and here we address questions about microsporidia biology and evolution. We sequenced three microsporidian genomes from two species, Nematocida parisii and Nematocida sp1, which are natural pathogens of Caenorhabditis nematodes and provide model systems for studying microsporidian pathogenesis. We performed deep sequencing of transcripts from a time ...

  2. STINGRAY: system for integrated genomic resources and analysis

    OpenAIRE

    Wagner, Glauber; Jardim, Rodrigo; Tschoeke, Diogo A.; Loureiro, Daniel R; Ocaña, Kary ACS; Ribeiro, Antonio CB; Emmel, Vanessa E; Probst, Christian M.; Pitaluga, André N.; Grisard, Edmundo C.; Cavalcanti, Maria C; Campos, Maria LM; Mattoso, Marta; Dávila, Alberto MR

    2014-01-01

    Background The STINGRAY system has been conceived to ease the tasks of integrating, analyzing, annotating and presenting genomic and expression data from Sanger and Next Generation Sequencing (NGS) platforms. Findings STINGRAY includes: (a) a complete and integrated workflow (more than 20 bioinformatics tools) ranging from functional annotation to phylogeny; (b) a MySQL database schema, suitable for data integration and user access control; and (c) a user-friendly graphical web-based interfac...

  3. Investigating hookworm genomes by comparative analysis of two Ancylostoma species

    OpenAIRE

    Kapulkin Wadim; Stajich Jason E; Xu Jian; Wylie Todd; Dante Mike; Martin John; Hawdon John; Arasu Prema; McCarter James P; Mitreva Makedonka; Clifton Sandra W; Waterston Robert H; Wilson Richard K

    2005-01-01

    Abstract Background Hookworms, infecting over one billion people, are the most closely related major human parasites to the model nematode Caenorhabditis elegans. Applying genomics techniques to these species, we analyzed 3,840 and 3,149 genes from Ancylostoma caninum and A. ceylanicum. Results Transcripts originated from libraries representing infective L3 larva, stimulated L3, arrested L3, and adults. Most genes are represented in single stages including abundant transcripts like hsp-20 i...

  4. Online Genome Analysis Resources for Educators, a Comparative Review

    Directory of Open Access Journals (Sweden)

    Sarah Grace Prescott

    2012-08-01

    Full Text Available A comparative review of several companies that offer similar kits or services that allow students to isolate DNA (human and others), amplify it by PCR, and in some cases sequence the resulting sample. The companies include: Carolina® Biological Supply Company, Bio-Rad®, Edvotek® Inc., Hiram Genomics Store, and 23andMe.

  5. Cancer Genome Atlas Pan-cancer Analysis Project

    OpenAIRE

    Zhang, Kun; Wang, Hong

    2015-01-01

    Cancer can take different forms depending on the site of origin, the cell type, and the kinds of genetic mutations involved, all of which also affect the efficacy of cancer therapy. Although changes in many genes have been shown to directly alter phenotype, the complex molecular mechanisms underlying many cancer lineages are still not fully elucidated. Therefore, The Cancer Genome Atlas (TCGA) Research Network analyzed a large number of human tumors in order to find the molecular changes in DNA, RNA, p...

  6. Genomic analysis of the symbiotic marine crenarchaeon, Cenarchaeum symbiosum

    Energy Technology Data Exchange (ETDEWEB)

    Hallam, Steven J.; Konstantinidis, Konstantinos T.; Brochier, Celine; Putnam, Nik; Schleper, Christa; Watanabe, Yoh-ichi; Sugahara, Junichi; Preston, Christina; de la Torre, Jose; Richardson, Paul M.; DeLong, Edward F.

    2006-06-24

    Crenarchaea are ubiquitous and abundant microbial constituents of soils, sediments, lakes and ocean waters, yet relatively little is known about their fundamental evolutionary, ecological, and physiological properties. To better describe the ubiquitous nonthermophilic Crenarchaea, we analyzed the genome sequence of one representative, the uncultivated sponge symbiont, Cenarchaeum symbiosum. C. symbiosum genotypes coinhabiting the same host partitioned into two dominant populations, corresponding to previously described a- and b-type ribosomal RNA variants. Although syntenic, overlapping a- and b-type ribotypes harbored significant genetic variability. A single tiling path comprising the dominant a-type genotype was assembled, and used to explore the biological properties of C. symbiosum and its planktonic relatives. Out of a total of 2,066 predicted open reading frames, 36% were more highly conserved with other Archaea. The remainder partitioned between bacteria (18%), eukaryotes (1.5%) and viruses (0.1%). A total of 525 open reading frames were more highly conserved with sequences derived from marine environmental genomic surveys, most probably representing orthologous genes found in free-living planktonic Crenarchaea. The remaining genes partitioned between functional RNAs (2.4%), and hypotheticals (42%) with limited homology to known functional genes. The latter category likely contains genes specifically involved in mediating archaeal-sponge symbiosis. Phylogenetic analyses placed C. symbiosum as a basal crenarchaeon, sharing specific genomic features in common with either Crenarchaea, Euryarchaea, or both. The genome sequence of C. symbiosum reflects a unique and unusual evolutionary, physiological, and ecological history, one remarkably distinct from that of any other previously known microbial lineage.

  7. Transmissible Gastroenteritis Coronavirus Genome Packaging Signal Is Located at the 5′ End of the Genome and Promotes Viral RNA Incorporation into Virions in a Replication-Independent Process

    OpenAIRE

    Morales, L.; Mateos-Gomez, P. A.; Capiscol, C.; del Palacio, L.; Enjuanes, L; Sola, I.

    2013-01-01

    Preferential RNA packaging in coronaviruses involves the recognition of viral genomic RNA, a crucial process for viral particle morphogenesis mediated by RNA-specific sequences, known as packaging signals. An essential packaging signal component of transmissible gastroenteritis coronavirus (TGEV) has been further delimited to the first 598 nucleotides (nt) from the 5′ end of its RNA genome, by using recombinant viruses transcribing subgenomic mRNA that included potential packaging signals. Th...

  8. Genome sequencing and analysis of a granulovirus isolated from the Asiatic rice leafroller, Cnaphalocrocis medinalis.

    Science.gov (United States)

    Zhang, Shan; Zhu, Zheng; Sun, Shifeng; Chen, Qijin; Deng, Fei; Yang, Kai

    2015-12-01

    The complete genome of Cnaphalocrocis medinalis granulovirus (CnmeGV) from a serious migratory rice pest, Cnaphalocrocis medinalis (Lepidoptera: Pyralidae), was sequenced using the Roche 454 Genome Sequencer FLX system (GS FLX) with a shotgun strategy and assembled by Roche GS De Novo assembler software. Its circular double-stranded genome is 111,246 bp in size with a high A+T content of 64.8% and codes for 118 putative open reading frames (ORFs). It contains 37 conserved baculovirus core ORFs, 13 unique ORFs, 26 ORFs that were found in all Lepidoptera baculoviruses and 42 common ORFs. The analysis of nucleotide sequence repeats revealed that the CnmeGV genome differs from the other sequenced GVs by 23 kb and 17 kb gene block inversions, and does not contain any typical homologous region (hr) except for a region of non-hr-like sequence. Chitinase and cathepsin genes, which are reported to have major roles in the liquefaction of the hosts, were not found in the CnmeGV genome, which explains why CnmeGV-infected insects do not show the typical liquefaction phenotype. Phylogenetic analysis, based on the 37 core baculovirus genes, indicates that CnmeGV is closely related to Adoxophyes orana granulovirus. The genome analysis would contribute to functional research on CnmeGV, and would benefit the utilization of CnmeGV as a pest control agent for rice production. PMID:26712716

  9. Multifractal detrended cross-correlation analysis of genome sequences using chaos-game representation

    Science.gov (United States)

    Pal, Mayukha; Kiran, V. Satya; Rao, P. Madhusudana; Manimaran, P.

    2016-08-01

    We characterized the multifractal nature and power law cross-correlation between any pair of genome sequences through an integrative approach combining 2D multifractal detrended cross-correlation analysis and chaos game representation. In this paper, we have analyzed genomes of some prokaryotes and calculated fractal spectra h(q) and f(α). From our analysis, we observed the existence of multifractal nature and power law cross-correlation behavior between any pair of genome sequences. Cluster analysis was performed on the calculated scaling exponents to identify the class affiliation, and the result is represented as a dendrogram. We suggest this approach may find applications in next generation sequence analysis, big data analytics etc.
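
    For readers unfamiliar with chaos game representation (CGR), the sketch below shows the basic iteration that maps a DNA sequence to points in the unit square. It is a minimal illustration, not the authors' code: the corner assignment (A, C, G, T at the four corners as listed) is one common convention and is assumed here, and the multifractal detrended cross-correlation step described in the abstract is not implemented.

```python
# Assumed corner layout of the unit square; other layouts are also in use.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(seq):
    """Return the CGR trajectory of a DNA sequence as a list of (x, y) points.

    Starting from the centre of the unit square, each base moves the current
    point halfway towards the corner assigned to that base. Characters other
    than A, C, G and T are skipped.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_points("AGCT"):
        print(point)
```

    In practice the points are usually accumulated on a 2D grid to form an image, and it is a two-dimensional array of that kind that an analysis such as the one described above would take as input.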

  10. Analysis of Aspergillus nidulans metabolism at the genome-scale

    Directory of Open Access Journals (Sweden)

    Nielsen Jens

    2008-04-01

    Full Text Available Abstract Background Aspergillus nidulans is a member of a diverse group of filamentous fungi, sharing many of the properties of its close relatives with significance in the fields of medicine, agriculture and industry. Furthermore, A. nidulans has been a classical model organism for studies of developmental biology and gene regulation, and thus it has become one of the best-characterized filamentous fungi. It was the first Aspergillus species to have its genome sequenced, and automated gene prediction tools predicted 9,451 open reading frames (ORFs) in the genome, of which less than 10% were assigned a function. Results In this work, we have manually assigned functions to 472 orphan genes in the metabolism of A. nidulans, by using a pathway-driven approach and by employing comparative genomics tools based on sequence similarity. The central metabolism of A. nidulans, as well as biosynthetic pathways of relevant secondary metabolites, was reconstructed based on detailed metabolic reconstructions available for A. niger and Saccharomyces cerevisiae, and information on the genetics, biochemistry and physiology of A. nidulans. Thereby, it was possible to identify metabolic functions without a gene associated, and to look for candidate ORFs in the genome of A. nidulans by comparing its sequence to sequences of well-characterized genes in other species encoding the function of interest. A classification system, based on defined criteria, was developed for evaluating and selecting the ORFs among the candidates, in an objective and systematic manner. The functional assignments served as a basis to develop a mathematical model, linking 666 genes (both previously and newly annotated) to metabolic roles. The model was used to simulate metabolic behavior and additionally to integrate, analyze and interpret large-scale gene expression data concerning a study on glucose repression, thereby providing a means of upgrading the information content of experimental data

  11. Comparative analysis of A, B, C and D genomes in the genus Oryza with C0t-1 DNA of C genome

    Institute of Scientific and Technical Information of China (English)

    LAN Weizhen; QIN Rui; LI Gang; HE Guangcun

    2006-01-01

    Fluorescence in situ hybridization (FISH) was applied to somatic chromosome preparations of Oryza officinalis Wall. (CC), O. sativa L. (AA) × O. officinalis F1 hybrid (AC), backcross progenies BC1 (AAC and ACC), O. latifolia Desv. (CCDD), O. alta Swallen (CCDD) and O. punctata Kotschy (BBCC) with a labelled probe of C0t-1 DNA from O. officinalis. In O. officinalis, the homologous chromosomes showed similar signal bands probed by C0t-1 DNA, and karyotype analysis was conducted based on the band patterns. Using no blocking DNA, the probe identified the chromosomes of the C genome clearly, but detected few signals on chromosomes of the A genome in the F1 hybrid and two backcross progenies of BC1. It is obvious that the highly and moderately repetitive DNA sequences were considerably different between the C and A genomes. The chromosomes of the C genome were also discriminated from the chromosomes of the D and B genomes in the tetraploid species O. latifolia, O. alta and O. punctata by C0t-1 DNA-FISH. Comparison of the fluorescence intensity on the chromosomes of the B, C and D genomes in O. latifolia, O. alta, and O. punctata indicated that the differentiation between the C and D genomes is less than that between the C and B genomes. The relationship between the C and D genomes in O. alta is closer than that of the C and D genomes in O. latifolia. This would be one of the reasons why both genomes are of the same karyotype (CCDD) but belong to different species. The above results showed that the C0t-1 DNA had a high genome and species specificity. In this paper, the origin of allotetraploids in the genus Oryza is also discussed.

  12. Deoxyhexanucleotide containing a vinyl chloride induced DNA lesion, 1,N6-ethenoadenine: synthesis, physical characterization, and incorporation into a duplex bacteriophage M13 genome as part of an amber codon

    International Nuclear Information System (INIS)

    Organic synthesis and recombinant DNA techniques have been used to situate a single 1,N6-ethenoadenine (epsilon Ade) DNA adduct at an amber codon in the genome of an M13mp19 phage derivative. The deoxyhexanucleotide d[GCT(epsilon A)GC] was chemically synthesized by the phosphotriester method. Physical studies involving fluorescence, circular dichroism, and 1H NMR indicated epsilon Ade to be very efficiently stacked in the hexamer, especially with the 5'-thymine. Melting profile and circular dichroism studies provided evidence of the loss of base-pairing capabilities attendant with formation of the etheno ring. The modified hexanucleotide was incorporated into a six-base gap formed in the genome of an M13mp19 insertion mutant. Phage of the insertion mutant, M13mp19-NheI, produced light blue plaques on SupE strains because of the introduced amber codon. Formation of a hybrid between the single-strand DNA (plus strand) of M13mp19-NheI with SmaI-linearized M13mp19 replicative form produced a heteroduplex with a six-base gap in the minus strand. The modified hexamer [5'-32P]d-[GCT(epsilon A)GC], after 5'-phosphorylation, was ligated into this gap by using bacteriophage T4 DNA ligase to generate a singly adducted genome with epsilon Ade at minus strand position 6274. Introduction of the radiolabel provided a useful marker for characterization of the singly adducted genome, and indeed the label appeared in the anticipated fragments when digested by several restriction endonucleases. Evidence that ligation occurred on both 5' and 3' sides of the oligonucleotide also was obtained. The M13mp19-NheI genome containing epsilon Ade will be used as a probe for studying mutagenesis and repair of this DNA adduct in Escherichia coli.

  13. arrayCGHbase: an analysis platform for comparative genomic hybridization microarrays

    Directory of Open Access Journals (Sweden)

    Moreau Yves

    2005-05-01

    Full Text Available Abstract Background The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH). One of the challenges inherent to this new technology is the management and analysis of large numbers of data points generated in each individual experiment. Results We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment) supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Following its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser. Conclusion ArrayCGHbase is a web based and platform independent arrayCGH data analysis tool that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at http://medgen.ugent.be/arrayCGHbase/.
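
    The BED export mentioned in this abstract can be pictured with the short sketch below, which turns copy number segments into four-column BED lines (chromosome, 0-based start, end, name). It does not reproduce arrayCGHbase's actual export: the function name, the input tuple layout with 1-based inclusive coordinates, and the 0.3 log2-ratio threshold for calling gains and losses are all hypothetical.

```python
def segments_to_bed(segments, threshold=0.3):
    """Format copy number segments as tab-separated BED lines.

    Each segment is (chrom, start, end, log2_ratio) with 1-based inclusive
    coordinates. BED uses a 0-based start and an exclusive end, so only the
    start is shifted by one. The threshold is an illustrative value.
    """
    lines = []
    for chrom, start, end, log2_ratio in segments:
        if log2_ratio >= threshold:
            call = "gain"
        elif log2_ratio <= -threshold:
            call = "loss"
        else:
            call = "normal"
        lines.append(f"{chrom}\t{start - 1}\t{end}\t{call}")
    return lines


if __name__ == "__main__":
    example = [("chr8", 1001, 2000, 0.8), ("chr8", 2001, 3000, -0.5)]
    print("\n".join(segments_to_bed(example)))
```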

  14. Phylogenetic analysis of the genomes of two strains of human adenovirus type 3

    Institute of Scientific and Technical Information of China (English)

    RONG ZHOU; XIAO BO SU; QI WEI ZHANG; QI YI ZENG; BING ZHU; CHU YU ZHANG; HOU BO WU; ZAO HE WU; SI TANG GONG

    2007-01-01

    Human adenovirus type 3 (HAdV-3) is widely prevalent all over the world, especially in Asia. The objective of this study is to carry out complete genomic DNA sequencing and phylogenetic analysis for two strains (Guangzhou01 and Guangzhou02) of HAdV-3 wild virus isolated from South China. Nasopharyngeal secretion aspirate specimens of sick children were inoculated into HEp-2 and HeLa culture tubes, and the cultures were identified by neutralization assay with type-specific reference rabbit antiserum. Type-specific primers were also utilized to confirm the serotype. The restriction fragments of HAdV genome DNA were cloned into pBlueScript SK (+) vectors and sequenced, and the 5' and 3' ends of the linear HAdV-3 genome were directly sequenced with double purified genomic DNA as templates. General features of the HAdV-3 genome sequences were explored by using several bio-software packages. Phylogenetic analysis was done with MEGA 3.0 software. The genomic sequences of Guangzhou01 and Guangzhou02 possess the same 4 early regions and 5 late regions and have 39 coding sequences and two RNA coding sequences. Other non-coding regions are conserved. Inverted repeats and palindromes were identified in the genome sequences. The genomes of group B human adenovirus as well as HAdV-3 have a close phylogenetic relationship with that of chimpanzee adenovirus type 21. The genomic lengths of these two isolated strains are 35 273 bp and 35 269 bp, respectively. The phylogenetic analysis showed that HAdV-B species has some relationship with certain types of chimpanzee adenovirus.

  15. Isolation and genomic analysis of circulating tumor cells from castration resistant metastatic prostate cancer

    International Nuclear Information System (INIS)

    The number of circulating tumor cells (CTCs) in metastatic prostate cancer patients provides prognostic and predictive information. However, it is the molecular characterization of CTCs that offers insight into the biology of these tumor cells in the context of personalized treatment. We developed a novel approach to isolate CTCs away from hematopoietic cells with high purity, enabling genomic analysis of these cells. The isolation protocol involves immunomagnetic enrichment followed by fluorescence activated cell sorting (IE/FACS). To evaluate the feasibility of isolation of CTCs by IE/FACS and downstream genomic profiling, we conducted a pilot study in patients with metastatic castration resistant prostate cancer (CRPC). Twenty (20) sequential CRPC patients were assayed using CellSearch™. Twelve (12) patients positive for CTCs were subjected to immunomagnetic enrichment and fluorescence activated cell sorting (IE/FACS) to isolate CTCs. Genomic DNA of CTCs was subjected to whole genome amplification (WGA) followed by gene copy number analysis via array comparative genomic hybridization (aCGH). CTCs from nine (9) patients successfully profiled were observed to have multiple copy number aberrations including those previously reported in primary prostate tumors such as gains in 8q and losses in 8p. High-level copy number gains at the androgen receptor (AR) locus were observed in 7 (78%) cases. Comparison of genomic profiles between CTCs and archival primary tumors from the same patients revealed common lineage. However, high-level copy number gains in the AR locus were observed in CTCs, but not in the matched archival primary tumors. We developed a new approach to isolate prostate CTCs without significant leukocyte admixture, and to subject them to genome-wide copy number analysis. Our assay may be utilized to explore genomic events involved in cancer progression, e.g. development of castration resistance and to monitor therapeutic efficacy of targeted therapies in

  16. Survey and analysis of simple sequence repeats (SSRs) in three genomes of Candida species.

    Science.gov (United States)

    Jia, Dongmei

    2016-06-15

    Simple sequence repeats (SSRs), or microsatellites, which are composed of tandemly repeated short units of 1-6 bp, have received continuous attention. Here, the distribution, composition and polymorphism of microsatellites and compound microsatellites were analyzed in three available genomes of Candida species (Candida dubliniensis, Candida glabrata and Candida orthopsilosis). The results show that there were 118,047, 66,259 and 61,119 microsatellites in the genomes of C. dubliniensis, C. glabrata and C. orthopsilosis, respectively. The SSRs covered more than 1/3 of the genome length in the three species. The microsatellites that consist only of the bases A and/or T, such as (A)n, (T)n, (AT)n, (TA)n, (AAT)n, (TAA)n, (TTA)n, (ATA)n, (ATT)n and (TAT)n, were predominant in the three genomes. Microsatellite lengths were concentrated at 6 bp and 9 bp, both in the three genomes and in their coding sequences. Moreover, the relative abundance (19.89/kbp) and relative density (167.87 bp/kbp) of SSRs in the mitochondrial sequence of C. glabrata were significantly greater than those in any of the genomes or chromosomes of the three species. In addition, the distance between any two adjacent microsatellites was an important factor influencing the formation of compound microsatellites. The analysis may be helpful for further study of the roles of microsatellites in the origin, organization and evolution of the genomes of Candida species. PMID:26883055
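
    The two summary measures quoted above, relative abundance (SSRs per kbp) and relative density (SSR bp per kbp), can be illustrated with a minimal Python sketch. The detection thresholds (`min_repeats`, `min_length`) and the function names are assumptions made for illustration; the abstract does not state the criteria or the software used in the study.

```python
import re

def find_ssrs(seq, max_unit=6, min_repeats=3, min_length=6):
    """Return (start, end, unit, repeats) for perfect tandem repeats in seq.

    The thresholds are illustrative assumptions, not the study's criteria.
    """
    seq = seq.upper()
    # The lazy {1,max_unit}? makes the regex try the shortest unit first;
    # \1{k,} then requires at least k further copies of that unit.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq):
        length = m.end() - m.start()
        if length >= min_length:
            unit = m.group(1)
            hits.append((m.start(), m.end(), unit, length // len(unit)))
    return hits

def ssr_stats(seq, hits):
    """Relative abundance (SSRs per kbp) and relative density (SSR bp per kbp)."""
    ssr_bp = sum(end - start for start, end, _, _ in hits)
    abundance = 1000.0 * len(hits) / len(seq)
    density = 1000.0 * ssr_bp / len(seq)
    return abundance, density

# Toy sequence: one (AT)5 repeat and one (A)8 repeat separated by spacers.
toy = "GC" + "AT" * 5 + "GC" + "A" * 8 + "GC"
toy_hits = find_ssrs(toy)
toy_abundance, toy_density = ssr_stats(toy, toy_hits)
```

    This sketch only finds perfect repeats; compound microsatellites, which the study also analyzes, would need an extra step that merges hits lying within a chosen distance of each other.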

  17. Microsatellite analysis in the genome of Acanthaceae: An in silico approach

    Directory of Open Access Journals (Sweden)

    Priyadharsini Kaliswamy

    2015-01-01

    Full Text Available Background: Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. The microsatellites existing in the complete genome sequences would help to attain a direct role in the genome organization, recombination, gene regulation, quantitative genetic variation, and evolution of genes. Objective: The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. Materials and Methods: The whole nucleotide sequences of Acanthaceae species were obtained from the National Center for Biotechnology Information database and screened for the presence of SSRs. The SSR Locator tool was used to predict the microsatellites and its inbuilt Primer3 module was used for primer design. Results: In total, 110 repeats from 108 sequences of Acanthaceae family plant genomes were identified, and dinucleotide repeats were found to be abundant in the genome sequences. The essential amino acid isoleucine was found to be rich in all the sequences. We also designed SSR-based primers/markers for 59 sequences of this family that contain microsatellite repeats in their genome. Conclusion: The identified microsatellites and primers might be useful for breeding and genetic studies of plants that belong to the Acanthaceae family in the future.

  18. Genome-Wide Comparative Analysis Reveals Similar Types of NBS Genes in Hybrid Citrus sinensis Genome and Original Citrus clementine Genome and Provides New Insights into Non-TIR NBS Genes

    Science.gov (United States)

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approxima...

  19. Comparative genomic hybridization analysis of benign and invasive male breast neoplasms

    DEFF Research Database (Denmark)

    Ojopi, Elida Paula Benquique; Cavalli, Luciane Regina; Cavalieri, Luciane Mara Bogline;

    2002-01-01

    Comparative genomic hybridization (CGH) analysis was performed for the identification of chromosomal imbalances in two benign gynecomastias and one malignant breast carcinoma derived from patients with male breast disease and compared with cytogenetic analysis in two of the three cases. CGH analy...

  20. The complexity of Rhipicephalus (Boophilus) microplus genome characterised through detailed analysis of two BAC clones

    Directory of Open Access Journals (Sweden)

    Valle Manuel

    2011-07-01

    Full Text Available Abstract Background Rhipicephalus (Boophilus) microplus (Rmi), a major cattle ectoparasite and tick borne disease vector, impacts on animal welfare and industry productivity. In arthropod research there is an absence of a complete Chelicerate genome, which includes ticks, mites, spiders, scorpions and crustaceans. Model arthropod genomes such as Drosophila and Anopheles are too taxonomically distant for a reference in tick genomic sequence analysis. This study focuses on the de-novo assembly of two R. microplus BAC sequences from the understudied R. microplus genome. Based on available R. microplus sequenced resources and comparative analysis, tick genomic structure and functional predictions identify complex gene structures and genomic targets expressed during tick-cattle interaction. Results In our BAC analyses we have assembled, using the correct positioning of BAC end sequences and transcript sequences, two challenging genomic regions. Cot DNA fractions compared to the BAC sequences confirmed a highly repetitive BAC sequence BM-012-E08 and a low repetitive BAC sequence BM-005-G14 which was gene rich and contained short interspersed elements (SINEs). Based directly on the BAC and Cot data comparisons, the genome wide frequency of the SINE Ruka element was estimated. Using a conservative approach to the assembly of the highly repetitive BM-012-E08, the sequence was de-convoluted into three repeat units, each unit containing an 18S, 5.8S and 28S ribosomal RNA (rRNA) encoding gene sequence (rDNA), related internal transcribed spacer and complex intergenic region. In the low repetitive BM-005-G14, a novel gene complex was found between two genes on the same strand. Nested in the second intron of a large 9 Kb papilin gene was a helicase gene. This helicase overlapped in two exonic regions with the papilin. Both genes were shown to be expressed in different tick life stages important in ectoparasite interaction with the host. Tick specific sequence

  1. Evaluation of a Two-Stage Approach in Trans-Ethnic Meta-Analysis in Genome-Wide Association Studies.

    Science.gov (United States)

    Hong, Jaeyoung; Lunetta, Kathryn L; Cupples, L Adrienne; Dupuis, Josée; Liu, Ching-Ti

    2016-05-01

    Meta-analysis of genome-wide association studies (GWAS) has achieved great success in detecting loci underlying human diseases. Incorporating GWAS results from diverse ethnic populations for meta-analysis, however, remains challenging because of the possible heterogeneity across studies. Conventional fixed-effects (FE) or random-effects (RE) methods may not be most suitable to aggregate multiethnic GWAS results because of violation of the homogeneous effect assumption across studies (FE) or low power to detect signals (RE). Three recently proposed methods, modified RE (RE-HE) model, binary-effects (BE) model and a Bayesian approach (Meta-analysis of Transethnic Association [MANTRA]), show increased power over FE and RE methods while incorporating heterogeneity of effects when meta-analyzing trans-ethnic GWAS results. We propose a two-stage approach to account for heterogeneity in trans-ethnic meta-analysis in which we clustered studies with cohort-specific ancestry information prior to meta-analysis. We compare this to a no-prior-clustering (crude) approach, evaluating type I error and power of these two strategies, in an extensive simulation study to investigate whether the two-stage approach offers any improvements over the crude approach. We find that the two-stage approach and the crude approach for all five methods (FE, RE, RE-HE, BE, MANTRA) provide well-controlled type I error. However, the two-stage approach shows increased power for BE and RE-HE, and similar power for MANTRA and FE compared to their corresponding crude approach, especially when there is heterogeneity across the multiethnic GWAS results. These results suggest that prior clustering in the two-stage approach can be an effective and efficient intermediate step in meta-analysis to account for the multiethnic heterogeneity. PMID:27061095
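
    For orientation, the conventional FE and RE estimators that the abstract takes as its baseline can be sketched in a few lines of Python. This is a minimal illustration using inverse-variance weighting and the DerSimonian-Laird between-study variance, which is one common RE estimator and is assumed here; it is not the authors' two-stage procedure, and the function names and toy inputs are hypothetical.

```python
import math

def fixed_effects(betas, ses):
    """Inverse-variance weighted (FE) pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)

def random_effects(betas, ses):
    """RE pooled effect with the DerSimonian-Laird estimate of tau^2."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta_fe, _ = fixed_effects(betas, ses)
    # Cochran's Q measures heterogeneity of study effects around the FE mean.
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    c = total - sum(w ** 2 for w in weights) / total
    tau2 = max(0.0, (q - (len(betas) - 1)) / c) if c > 0 else 0.0
    # Adding tau^2 to each study variance widens the pooled standard error.
    star = [1.0 / (se ** 2 + tau2) for se in ses]
    beta = sum(w * b for w, b in zip(star, betas)) / sum(star)
    return beta, math.sqrt(1.0 / sum(star)), tau2

# Toy per-study effect sizes and standard errors (made-up numbers).
toy_betas = [0.1, 0.3]
toy_ses = [0.1, 0.1]
fe_beta, fe_se = fixed_effects(toy_betas, toy_ses)
re_beta, re_se, re_tau2 = random_effects(toy_betas, toy_ses)
```

    With heterogeneous inputs the RE standard error is larger than the FE one, which is the loss of power the abstract attributes to RE methods.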

  2. mGenomeSubtractor: a web-based tool for parallel in silico subtractive hybridization analysis of multiple bacterial genomes.

    Science.gov (United States)

    Shao, Yucheng; He, Xinyi; Harrison, Ewan M; Tai, Cui; Ou, Hong-Yu; Rajakumar, Kumar; Deng, Zixin

    2010-07-01

    mGenomeSubtractor performs an mpiBLAST-based comparison of reference bacterial genomes against multiple user-selected genomes for investigation of strain variable accessory regions. With parallel computing architecture, mGenomeSubtractor is able to run rapid BLAST searches of the segmented reference genome against multiple subject genomes at the DNA or amino acid level within a minute. In addition to comparison of protein coding sequences, the highly flexible sliding window-based genome fragmentation approach offered can be used to identify short unique sequences within or between genes. mGenomeSubtractor provides powerful schematic outputs for exploration of identified core and accessory regions, including searches against databases of mobile genetic elements, virulence factors or bacterial essential genes, examination of G+C content and binucleotide distribution bias, and integrated primer design tools. mGenomeSubtractor also allows for the ready definition of species-specific gene pools based on available genomes. Pan-genomic arrays can be easily developed using the efficient oligonucleotide design tool. This simple high-throughput in silico 'subtractive hybridization' analytical tool will support the rapidly escalating number of comparative bacterial genomics studies aimed at defining genomic biomarkers of evolutionary lineage, phenotype, pathotype, environmental adaptation and/or disease-association of diverse bacterial species. mGenomeSubtractor is freely available to all users without any login requirement at: http://bioinfo-mml.sjtu.edu.cn/mGS/. PMID:20435682
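
    The sliding window-based genome fragmentation mentioned above can be pictured with a short Python sketch. It only illustrates the general idea of cutting a reference sequence into overlapping query fragments; the window and step sizes, the handling of the final fragment, and the function name are assumptions and do not describe mGenomeSubtractor's actual implementation.

```python
def sliding_fragments(seq, window, step):
    """Cut seq into (start, fragment) pairs of length `window`, moving by `step`.

    A final fragment anchored at the sequence end is added when the regular
    windows would otherwise leave the tail uncovered.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = max(len(seq) - window, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] + window < len(seq):
        starts.append(last_start)
    return [(s, seq[s:s + window]) for s in starts]

# Overlapping 4-base windows every 3 bases over two short toy sequences.
frags_even = sliding_fragments("ACGTACGTAC", window=4, step=3)
frags_tail = sliding_fragments("ACGTACGTACG", window=4, step=3)
```

    Each fragment keeps its start coordinate so that a hit (or the absence of one) in a subject genome can be mapped back to the reference position.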

  3. Comparative mitochondrial genome analysis reveals the evolutionary rearrangement mechanism in Brassica.

    Science.gov (United States)

    Yang, J; Liu, G; Zhao, N; Chen, S; Liu, D; Ma, W; Hu, Z; Zhang, M

    2016-05-01

    The genus Brassica has many species that are important for oil, vegetable and other food products. Three mitochondrial genome types (mitotype) originated from its common ancestor. In this paper, a B. nigra mitochondrial main circle genome with 232,407 bp was generated through de novo assembly. Synteny analysis showed that the mitochondrial genomes of B. rapa and B. oleracea had a better syntenic relationship than B. nigra. Principal components analysis and development of a phylogenetic tree indicated maternal ancestors of three allotetraploid species in U's triangle of Brassica. Diversified mitotypes were found in allotetraploid B. napus, in which napus-type B. napus was derived from B. oleracea, while polima-type B. napus was inherited from B. rapa. In addition, the mitochondrial genome of napus-type B. napus was closer to botrytis-type than capitata-type B. oleracea. The sub-stoichiometric shifting of several mitochondrial genes suggested that mitochondrial genome rearrangement underwent evolutionary selection during domestication and/or plant breeding. Our findings clarify the role of diploid species in the maternal origin of allotetraploid species in Brassica and suggest the possibility of breeding selection of the mitochondrial genome. PMID:27079962

  4. Functional analysis of gapped microbial genomes: amino acid metabolism of Thiobacillus ferrooxidans.

    Science.gov (United States)

    Selkov, E; Overbeek, R; Kogan, Y; Chu, L; Vonstein, V; Holmes, D; Silver, S; Haselkorn, R; Fonstein, M

    2000-03-28

    A gapped genome sequence of the biomining bacterium Thiobacillus ferrooxidans strain ATCC23270 was assembled from sheared DNA fragments (3.2-times coverage) into 1,912 contigs. A total of 2,712 potential genes (ORFs) were identified in 2.6 Mbp (megabase pairs) of Thiobacillus genomic sequence. Of these genes, 2,159 could be assigned functions by using the WIT-Pro/EMP genome analysis system, most with a high degree of certainty. Nine hundred of the genes have been assigned roles in metabolic pathways, producing an overview of cellular biosynthesis, bioenergetics, and catabolism. Sequence similarities, relative gene positions on the chromosome, and metabolic reconstruction (placement of gene products in metabolic pathways) were all used to aid gene assignments and for development of a functional overview. Amino acid biosynthesis was chosen to demonstrate the analytical capabilities of this approach. Only 10 expected enzymatic activities, of the nearly 150 involved in the biosynthesis of all 20 amino acids, are currently unassigned in the Thiobacillus genome. This result compares favorably with 10 missing genes for amino acid biosynthesis in the complete Escherichia coli genome. Gapped genome analysis can therefore give a decent picture of the central metabolism of a microorganism, equivalent to that of a complete sequence, at significantly lower cost. PMID:10737802

  5. Genomic analysis in the clinic: benefits and challenges for health care professionals and patients in Brazil.

    Science.gov (United States)

    Ashton-Prolla, Patrícia; Goldim, José Roberto; Vairo, Filippo Pinto E; da Silveira Matte, Ursula; Sequeiros, Jorge

    2015-07-01

    Despite significant advances in the diagnosis and treatment of genetic diseases in the last two decades, there is still a significant proportion where a causative mutation cannot be identified and a definitive genetic diagnosis remains elusive. New genome-wide or high-throughput multiple gene tests have brought new hope to the field, since they can offer fast, cost-effective and comprehensive analysis of genetic variation. This is particularly interesting in disorders with high genetic heterogeneity. There are, however, limitations and concerns regarding the implementation of genomic analysis in everyday clinical practice, including some particular to emerging and developing economies, such as Brazil. They include the limited number of actionable genetic variants known to date, difficulties in determining the clinical validity and utility of novel variants, growth of direct-to-consumer genetic testing using a genomic approach and lack of proper training of health care professionals to adequately request, interpret and use genetic information. Despite all these concerns and limitations, the availability of genomic tests has grown at an extremely rapid pace and commercially available services include initiatives in almost all areas of clinical genetics, including newborn and carrier screening. We discuss the benefits and limitations of genomic testing, as well as the ethical implications and the challenges of genetic education and of having enough available and qualified health care professionals to ensure an adequate process of informed consent, meaningful interpretation and use of genomic data and definition of a clear regulatory framework in the particular context of Brazil. PMID:26040235

  6. Comprehensive genomic analysis of a BRCA2 deficient human pancreatic cancer.

    Directory of Open Access Journals (Sweden)

    Louise J Barber

    Full Text Available Capan-1 is a well-characterised BRCA2-deficient human cell line isolated from a liver metastasis of a pancreatic adenocarcinoma. Here we report a genome-wide assessment of structural variations and high-depth exome characterization of single nucleotide variants and small insertion/deletions in Capan-1. To identify potential somatic and tumour-associated variations in the absence of a matched-normal cell line, we devised a novel method based on the analysis of HapMap samples. We demonstrate that Capan-1 has one of the most rearranged genomes sequenced to date. Furthermore, small insertions and deletions are detected more frequently in the context of short sequence repeats than in other genomes. We also identify a number of novel mutations that may represent genetic changes that have contributed to tumour progression. These data provide insight into the genomic effects of loss of BRCA2 function.

  7. Comprehensive Genomic Analysis of a BRCA2 Deficient Human Pancreatic Cancer

    Science.gov (United States)

    Kozarewa, Iwanka; Fenwick, Kerry; Assiotis, Ioannis; Mitsopoulos, Costas; Sims, David; Hakas, Jarle; Zvelebil, Marketa; Lord, Christopher J.; Ashworth, Alan

    2011-01-01

    Capan-1 is a well-characterised BRCA2-deficient human cell line isolated from a liver metastasis of a pancreatic adenocarcinoma. Here we report a genome-wide assessment of structural variations and high-depth exome characterization of single nucleotide variants and small insertion/deletions in Capan-1. To identify potential somatic and tumour-associated variations in the absence of a matched-normal cell line, we devised a novel method based on the analysis of HapMap samples. We demonstrate that Capan-1 has one of the most rearranged genomes sequenced to date. Furthermore, small insertions and deletions are detected more frequently in the context of short sequence repeats than in other genomes. We also identify a number of novel mutations that may represent genetic changes that have contributed to tumour progression. These data provide insight into the genomic effects of loss of BRCA2 function. PMID:21750719

  8. Genome sequencing and analysis of the Tasmanian devil and its transmissible cancer.

    Science.gov (United States)

    Murchison, Elizabeth P; Schulz-Trieglaff, Ole B; Ning, Zemin; Alexandrov, Ludmil B; Bauer, Markus J; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber; Becq, Jennifer; Bignell, Graham R; Cheetham, R Keira; Cheng, William; Connor, Thomas R; Cox, Anthony J; Feng, Zhi-Ping; Gu, Yong; Grocock, Russell J; Harris, Simon R; Khrebtukova, Irina; Kingsbury, Zoya; Kowarsky, Mark; Kreiss, Alexandre; Luo, Shujun; Marshall, John; McBride, David J; Murray, Lisa; Pearse, Anne-Maree; Raine, Keiran; Rasolonjatovo, Isabelle; Shaw, Richard; Tedder, Philip; Tregidgo, Carolyn; Vilella, Albert J; Wedge, David C; Woods, Gregory M; Gormley, Niall; Humphray, Sean; Schroth, Gary; Smith, Geoffrey; Hall, Kevin; Searle, Stephen M J; Carter, Nigel P; Papenfuss, Anthony T; Futreal, P Andrew; Campbell, Peter J; Yang, Fengtang; Bentley, David R; Evers, Dirk J; Stratton, Michael R

    2012-02-17

    The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations. PMID:22341448

  9. DivStat: a user-friendly tool for single nucleotide polymorphism analysis of genomic diversity.

    Directory of Open Access Journals (Sweden)

    Inês Soares

    Full Text Available Recent developments have led to an enormous increase of publicly available large genomic data, including complete genomes. The 1000 Genomes Project was a major contributor, releasing the results of sequencing a large number of individual genomes, and allowing for a myriad of large scale studies on human genetic variation. However, the tools currently available are insufficient when the goal concerns some analyses of data sets encompassing more than hundreds of base pairs and when considering haplotype sequences of single nucleotide polymorphisms (SNPs. Here, we present a new and potent tool to deal with large data sets allowing the computation of a variety of summary statistics of population genetic data, increasing the speed of data analysis.
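
    As a concrete picture of what "summary statistics of population genetic data" can mean for SNP haplotypes, the Python sketch below computes two standard quantities: the number of segregating sites and nucleotide diversity (mean pairwise differences per site). The abstract does not list which statistics DivStat implements, so the choice of statistics, the function names and the toy haplotypes are assumptions for illustration only.

```python
from itertools import combinations

def segregating_sites(haplotypes):
    """Count alignment columns where more than one allele is observed."""
    return sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)

def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences per site (pi)."""
    n = len(haplotypes)
    length = len(haplotypes[0])
    if n < 2 or any(len(h) != length for h in haplotypes):
        raise ValueError("need at least two haplotypes of equal length")
    differences = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    )
    pairs = n * (n - 1) // 2
    return differences / pairs / length

# Three made-up aligned haplotypes of four sites each.
toy_haplotypes = ["ACGT", "ACGA", "TCGA"]
toy_s = segregating_sites(toy_haplotypes)
toy_pi = nucleotide_diversity(toy_haplotypes)
```

    The pairwise loop is quadratic in the number of haplotypes, which is the kind of cost that motivates dedicated tools for large data sets.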

  10. Genome-wide linkage analysis for human longevity

    DEFF Research Database (Denmark)

    Beekman, Marian; Blanché, Hélène; Perola, Markus;

    2013-01-01

    Clear evidence exists for heritability of human longevity, and much interest is focused on identifying genes associated with longer lives. To identify such longevity alleles, we performed the largest genome-wide linkage scan thus far reported. Linkage analyses included 2118 nonagenarian Caucasian...... sibling pairs that have been enrolled in 15 study centers of 11 European countries as part of the Genetics of Healthy Aging (GEHA) project. In the joint linkage analyses, we observed four regions that show linkage with longevity; chromosome 14q11.2 (LOD = 3.47), chromosome 17q12-q22 (LOD = 2...

  11. Genomic analysis of the major bovine milk protein genes.

    OpenAIRE

    Threadgill, D.W.; Womack, J E

    1990-01-01

    The genomic arrangement of the major bovine milk protein genes has been determined using a combination of physical mapping techniques. The major milk proteins consist of the four caseins, alpha s1 (CASAS1), alpha s2 (CASAS2), beta (CASB), and kappa (CASK), as well as the two major whey proteins, alpha-lactalbumin (LALBA) and beta-lactoglobulin (LGB). A panel of bovine X hamster hybrid somatic cells analyzed for the presence or absence of bovine specific restriction fragments revealed the gene...

  12. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    KAUST Repository

    Doan, Ryan

    2012-02-17

    BACKGROUND: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. RESULTS: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. CONCLUSIONS: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.
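
    The relationship between the two sequencing figures quoted above (59.6 Gb of sequence, 24.7X average coverage) is simple arithmetic: mean coverage is total sequenced bases divided by genome size. The Python sketch below works backwards to the genome size these two numbers imply, assuming that all 59.6 Gb count toward the coverage figure; the abstract does not say whether unmapped or filtered reads were excluded, so the result is only indicative.

```python
def mean_coverage(total_bases, genome_size):
    """Average depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size

def implied_genome_size(total_bases, coverage):
    """Genome size implied by a yield and a reported mean coverage."""
    return total_bases / coverage

# Figures taken from the abstract: 59.6 Gb of sequence at 24.7X coverage.
implied_bp = implied_genome_size(59.6e9, 24.7)   # roughly 2.4 Gb
# Illustrative check with an assumed round genome size of 2.4 Gb.
check_depth = mean_coverage(59.6e9, 2.4e9)
```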

  13. Optical properties and Surface analysis of lithium incorporated Y2O3:Eu3+ ceramic phosphors

    Directory of Open Access Journals (Sweden)

    Jong-Seong Bae

    2010-09-01

    Full Text Available The influence of lithium doping concentration on the crystallization, the surface morphology, and the luminescent properties of Y1.92O3:Eu0.08 ceramic phosphors were investigated. The crystallinity, the surface morphology, and the photoluminescence (PL) of ceramics depended highly on the Li-doping concentrations. The relationship between the crystalline and morphological structures and the luminescent properties was studied, and Li-doping was found to affect not only the enhanced crystallinity but also the luminescent brightness of Y1.92O3:Eu0.08 ceramics. In particular, the incorporation of Li+ ion into the Y2O3 lattice could induce remarkable increase in the PL intensity. The strongest emission intensity was observed with Y1.68Li0.24O3-δ:Eu0.08 ceramics whose brightness was increased by a factor of 6.5 in comparison with that of Y1.92O3:Eu0.08 ceramics.

  14. Analysis on the Pathogenesis of Symptomatic Pulmonary Embolism with Human Genomics

    Directory of Open Access Journals (Sweden)

    Hao Wang, Qianglin Duan, Lemin Wang, Zhu Gong, Aibin Liang, Qiang Wang, Haoming Song, Fan Yang, Yanli Song

    2012-01-01

    Full Text Available BACKGROUND: In the present study, the whole human genome oligo microarray was employed to investigate the gene expression profile in symptomatic pulmonary embolism (PE). METHODS: Twenty patients with PE and 20 age and gender matched patients without PE as controls were enrolled into the present study in the same period. The diagnosis of PE was based on the clinical manifestations and findings on imaging examinations. Acute arterial and/or venous thrombosis was excluded in controls. The whole human genome oligo microarray was employed for detection. Statistical analysis was performed with t test following analysis of very small samples of repeated measurements and Gene Ontology (GO) analysis. RESULTS: Genomic data showed no damage to vascular endothelial cells in PE patients. Genomic data only found increased mRNA expression of a small amount of coagulation factors in PE patients. In the PE group, anticoagulant proteins, fibrinolytic system and proteins related to platelet functions only played partial roles in the pathogenesis of PE. In addition, the mRNA expressions of a fraction of adhesion molecules were markedly up-regulated. Gene Ontology analysis showed the genes with down-regulated expressions mainly explain the compromised T cell immunity. Symptomatic VTE patients have compromised T cell immunity. CONCLUSION: The damage to vascular endothelial cells is not necessary in the pathogenesis of VTE, and only a fraction of factors involved in the shared coagulation cascade are activated. Genomic results may provide a new clue for clinical diagnosis, treatment and prevention of VTE.

  15. solQTL: a tool for QTL analysis, visualization and linking to genomes at SGN database

    Directory of Open Access Journals (Sweden)

    van der Knaap Esther

    2010-10-01

    Full Text Available Abstract Background A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs requires several generations of backcrosses and analysis of large populations, which is a time-consuming and costly effort. Furthermore, as entire genomes are being sequenced and an increasing amount of genetic and expression data are being generated, a challenge remains: linking phenotypic variation to the underlying genomic variation. To identify candidate genes and understand the molecular basis underlying the phenotypic variation of traits, bioinformatic approaches are needed to exploit information such as genetic map, expression and whole genome sequence data of organisms in biological databases. Description The Sol Genomics Network (SGN, http://solgenomics.net) is a primary repository for phenotypic, genetic, genomic, expression and metabolic data for the Solanaceae family and other related Asterids species and houses a variety of bioinformatics tools. SGN has implemented a new approach to QTL data organization, storage, analysis, and cross-links with other relevant data in internal and external databases. The new QTL module, solQTL, http://solgenomics.net/qtl/, employs a user-friendly web interface for uploading raw phenotype and genotype data to the database, R/QTL mapping software for on-the-fly QTL analysis and algorithms for online visualization and cross-referencing of QTLs to relevant datasets and tools such as the SGN Comparative Map Viewer and Genome Browser. Here, we describe the development of the solQTL module and demonstrate its application. Conclusions solQTL allows Solanaceae researchers to upload raw genotype and phenotype data to SGN, perform QTL analysis and dynamically cross-link to relevant genetic, expression and genome annotations. Exploration and synthesis of the relevant data is expected to help facilitate identification of candidate genes

  16. The Revolution in Viral Genomics as Exemplified by the Bioinformatic Analysis of Human Adenoviruses

    Directory of Open Access Journals (Sweden)

    Sarah Torres

    2010-06-01

    Full Text Available Over the past 30 years, genomic and bioinformatic analysis of human adenoviruses has been achieved using a variety of DNA sequencing methods; initially with the use of restriction enzymes and more currently with the use of the GS FLX pyrosequencing technology. Following the conception of DNA sequencing in the 1970s, analysis of adenoviruses has evolved from 100 base pair mRNA fragments to entire genomes. Comparative genomics of adenoviruses made its debut in 1984 when nucleotides and amino acids of coding sequences within the hexon genes of two human adenoviruses (HAdV), HAdV-C2 and HAdV-C5, were compared and analyzed. It was determined that there were three different zones (1-393, 394-1410, 1411-2910) within the hexon gene, of which HAdV-C2 and HAdV-C5 shared zones 1 and 3 with 95% and 89.5% nucleotide identity, respectively. In 1992, HAdV-C5 became the first adenovirus genome to be fully sequenced using the Sanger method. Over the next seven years, whole genome analysis and characterization was completed using bioinformatic tools such as blastn, tblastx, ClustalV and FASTA, in order to determine key proteins in species HAdV-A through HAdV-F. The bioinformatic revolution was initiated with the introduction of a novel species, HAdV-G, that was typed and named by the use of whole genome sequencing and phylogenetics as opposed to traditional serology. HAdV bioinformatics will continue to advance as the latest sequencing technology enables scientists to add to and expand the resource databases. As a result of these advancements, how novel HAdVs are typed has changed. Bioinformatic analysis has become the revolutionary tool that has significantly accelerated the in-depth study of HAdV microevolution through comparative genomics.

  17. Genome-wide analysis of the synonymous codon usage patterns in apple

    Institute of Scientific and Technical Information of China (English)

    LI Ning; SUN Mei-hong; JIANG Ze-sheng; SHU Huai-rui; ZHANG Shi-zhong

    2016-01-01

    Apple (Malus×domestica) is an important woody plant and a major cultivated fruit tree in temperate regions. Apple whole genome sequencing has been completed, which provides an excellent opportunity for genome-wide analysis of synonymous codon usage patterns. In this study, a multivariate bioinformatics analysis was performed to reveal the characteristics of synonymous codon usage and the main factors affecting codon bias in apple. The neutrality, correspondence, and correlation analyses were performed with the CodonW and SPSS (Statistical Product and Service Solutions) programs, indicating that the apple genome codon usage patterns were affected by mutational pressure and selective constraint. Meanwhile, coding sequence length and the hydrophobicity of proteins could also influence the codon usage patterns. In short, codon usage pattern analysis and determination of optimal codons have laid an important theoretical basis for genetic engineering, gene prediction and molecular evolution studies in apple.
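
    A neutrality analysis of the kind mentioned above is commonly drawn by plotting the mean GC content at codon positions 1 and 2 (GC12) against the GC content at position 3 (GC3) for each gene. The following is a minimal sketch of that per-gene calculation, assuming an in-frame coding sequence; it is not the CodonW implementation, and the function name `gc_by_codon_position` is hypothetical.

```python
def gc_by_codon_position(cds: str) -> dict:
    """Return the GC fraction at codon positions 1, 2 and 3, plus GC12.

    Assumes `cds` is an in-frame coding sequence containing only A, C, G, T.
    """
    seq = cds.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    n_codons = len(seq) // 3

    fractions = []
    for pos in range(3):
        # Every third base starting at `pos` is one codon position.
        bases = seq[pos::3]
        fractions.append(sum(b in "GC" for b in bases) / n_codons)

    gc1, gc2, gc3 = fractions
    return {"GC1": gc1, "GC2": gc2, "GC3": gc3, "GC12": (gc1 + gc2) / 2}


# Toy sequence with four codons: ATG GCC GGC TAA
example = gc_by_codon_position("ATGGCCGGCTAA")
print(example)
```

    In a neutrality plot, a regression slope of GC12 on GC3 close to 1 is usually read as evidence for mutational pressure, and a slope close to 0 as evidence for selective constraint.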

  18. Growth characteristics and complete genomic sequence analysis of a novel pseudorabies virus in China.

    Science.gov (United States)

    Yu, Teng; Chen, Fangzhou; Ku, Xugang; Fan, Jie; Zhu, Yinxing; Ma, Hailong; Li, Subei; Wu, Bin; He, Qigai

    2016-08-01

    Swine pseudorabies (PR) re-emerged in Bartha-vaccinated pig herds and has caused the death of millions of piglets in China since the later part of 2011. We isolated a novel pseudorabies virus (PRV), named HNX strain, from the brain of aborted fetuses to diagnose the disease. To reveal the genomic organization and characterize the HNX strain, the complete genomes of HNX and Fa strain, an isolate from the 1960s, were sequenced and analyzed. The genomic sizes of the HNX and Fa strains were 142,294 and 141,930 nt, respectively, with corresponding G + C contents of 73.56 and 73.70 %. Both strains possessed 70 open reading frames. In addition, comparative genomic analysis between HNX and Bartha strains was performed to understand the possible reason for immune failure. The major virulence-associated genes of HNX strain had slight changes, whereas glycoprotein B and glycoprotein C genes of HNX strain had 73 mutations; the homology at the whole genomic level between HNX and Bartha strains was 90.6 %. Genome-wide comparison between HNX and Fa strains indicated that the strains shared about 96.4 % homology and clustered in a separate Chinese isolate group; the two strains are also distant from the isolates from other countries. Similarity plot and bootscanning analysis of complete genome sequences of nine PRV strains, including HNX and Fa, four new Chinese strains, and three traditional reference strains, revealed that no recombination events occurred in the HNX strain. The PRV HNX strain with genomic variations might have contributed to the PR outbreak in China since the later part of 2011. PMID:27012685

  19. Connecting Anxiety and Genomic Copy Number Variation: A Genome-Wide Analysis in CD-1 Mice.

    Directory of Open Access Journals (Sweden)

    Julia Brenndörfer

    Genomic copy number variants (CNVs) have been implicated in multiple psychiatric disorders, but not much is known about their influence on anxiety disorders specifically. Using next-generation sequencing (NGS) and two additional array-based genotyping approaches, we detected CNVs in a mouse model consisting of two inbred mouse lines showing high (HAB) and low (LAB) anxiety-related behavior, respectively. An influence of CNVs on gene expression in the central (CeA) and basolateral (BLA) amygdala, paraventricular nucleus (PVN), and cingulate cortex (Cg) was shown by a two-proportion Z-test (p = 1.6 x 10^-31), with a positive correlation in the CeA (p = 0.0062), PVN (p = 0.0046) and Cg (p = 0.0114), indicating a contribution of CNVs to the genetic predisposition to trait anxiety in the specific context of HAB/LAB mice. In order to confirm anxiety-relevant CNVs and corresponding genes in a second mouse model, we further examined CD-1 outbred mice. We revealed the distribution of CNVs by genotyping 64 CD-1 individuals using a high-density genotyping array (Jackson Laboratory). 78 genes within those CNVs were identified to show nominally significant association (48 genes), or a statistical trend in their association (30 genes), with the time animals spent on the open arms of the elevated plus-maze (EPM). Fifteen of them were considered promising candidate genes of anxiety-related behavior as we could show a significant overlap (permutation test, p = 0.0051) with genes within HAB/LAB CNVs. Thus, here we provide what is to our knowledge the first extensive catalogue of CNVs in CD-1 mice and potential corresponding candidate genes linked to anxiety-related behavior in mice.

  20. A simple and inexpensive method for genomic restriction mapping analysis

    International Nuclear Information System (INIS)

    The Southern blotting procedure for the transfer of DNA fragments from agarose gels to nitrocellulose membranes has revolutionized nucleic acid detection methods, and it forms the cornerstone of research in molecular biology. Basically, the method involves the denaturation of DNA fragments that have been separated on an agarose gel, the immobilization of the fragments by transfer to a nitrocellulose membrane, and the identification of the fragments of interest through hybridization to ³²P-labeled probes and autoradiography. While the method is sensitive and applicable to both genomic and cloned DNA, it suffers from the disadvantages of being time consuming and expensive, and fragments of greater than 15 kb are difficult to transfer. Moreover, although theoretically the nitrocellulose membrane can be washed and hybridized repeatedly using different probes, in practice, the membrane becomes brittle and difficult to handle after a few cycles. A direct hybridization method for pure DNA clones was developed in 1975 but has not been widely exploited. The authors report here a modification of their procedure as applied to genomic DNA. The method is simple, rapid, and inexpensive, and it does not involve transfer to nitrocellulose membranes.

  1. Analysis of individual differences in radiosensitivity using genome editing.

    Science.gov (United States)

    Matsuura, S; Royba, E; Akutsu, S N; Yanagihara, H; Ochiai, H; Kudo, Y; Tashiro, S; Miyamoto, T

    2016-06-01

    Current standards for radiological protection of the public have been uniformly established. However, individual differences in radiosensitivity are suggested to exist in human populations, which could be caused by nucleotide variants of DNA repair genes. In order to verify if such genetic variants are responsible for individual differences in radiosensitivity, they could be introduced into cultured human cells for evaluation. This strategy would make it possible to analyse the effect of candidate nucleotide variants on individual radiosensitivity, independent of the diverse genetic background. However, efficient gene targeting in cultured human cells is difficult due to the low frequency of homologous recombination (HR) repair. The development of artificial nucleases has enabled efficient HR-mediated genome editing to be performed in cultured human cells. A novel genome editing strategy, 'transcription activator-like effector nuclease (TALEN)-mediated two-step single base pair editing', has been developed, and this was used to introduce a nucleotide variant associated with a chromosomal instability syndrome bi-allelically into cultured human cells to demonstrate that it is the causative mutation. It is proposed that this editing technique will be useful to investigate individual radiosensitivity. PMID:27012844

  2. Significance of genomic instability in breast cancer in atomic bomb survivors: analysis of microarray-comparative genomic hybridization

    Directory of Open Access Journals (Sweden)

    Oikawa Masahiro

    2011-12-01

    Background: It has been postulated that ionizing radiation induces breast cancers among atomic bomb (A-bomb) survivors. We have reported a higher incidence of HER2 and C-MYC oncogene amplification in breast cancers from A-bomb survivors. The purpose of this study was to clarify the effect of A-bomb radiation exposure on genomic instability (GIN), which is an important hallmark of carcinogenesis, in archival formalin-fixed paraffin-embedded (FFPE) tissues of breast cancer by using microarray-comparative genomic hybridization (aCGH). Methods: Tumor DNA was extracted from FFPE tissues of invasive ductal cancers from 15 survivors who were exposed at 1.5 km or less from the hypocenter and 13 calendar year-matched non-exposed patients, followed by aCGH analysis using a high-density oligonucleotide microarray. The total length of copy number aberrations (CNA) was used as an indicator of GIN, and correlations with clinicopathological factors were statistically tested. Results: The mean of the derivative log ratio spread (DLRSpread), which estimates the noise by calculating the spread of log ratio differences between consecutive probes for all chromosomes, was 0.54 (range, 0.26 to 1.05). The concordance of results between aCGH and fluorescence in situ hybridization (FISH) for HER2 gene amplification was 88%. The incidence of HER2 amplification and the histological grade were both significantly higher in the A-bomb survivors than in the control group (P = 0.04 for each). The total length of CNA tended to be larger in the A-bomb survivors (P = 0.15). Correlation analysis of CNA and clinicopathological factors revealed that DLRSpread was significantly negatively correlated with the total length of CNA (P = 0.034, r = -0.40). Multivariate analysis with covariance revealed that the exposure to A-bomb was a significant (P = 0.005) independent factor which was associated with larger total length of CNA of breast cancers. Conclusions: Thus, archival FFPE tissues from A-bomb survivors are useful for

  3. Significance of genomic instability in breast cancer in atomic bomb survivors: analysis of microarray-comparative genomic hybridization

    International Nuclear Information System (INIS)

    It has been postulated that ionizing radiation induces breast cancers among atomic bomb (A-bomb) survivors. We have reported a higher incidence of HER2 and C-MYC oncogene amplification in breast cancers from A-bomb survivors. The purpose of this study was to clarify the effect of A-bomb radiation exposure on genomic instability (GIN), which is an important hallmark of carcinogenesis, in archival formalin-fixed paraffin-embedded (FFPE) tissues of breast cancer by using microarray-comparative genomic hybridization (aCGH). Tumor DNA was extracted from FFPE tissues of invasive ductal cancers from 15 survivors who were exposed at 1.5 km or less from the hypocenter and 13 calendar year-matched non-exposed patients, followed by aCGH analysis using a high-density oligonucleotide microarray. The total length of copy number aberrations (CNA) was used as an indicator of GIN, and correlations with clinicopathological factors were statistically tested. The mean of the derivative log ratio spread (DLRSpread), which estimates the noise by calculating the spread of log ratio differences between consecutive probes for all chromosomes, was 0.54 (range, 0.26 to 1.05). The concordance of results between aCGH and fluorescence in situ hybridization (FISH) for HER2 gene amplification was 88%. The incidence of HER2 amplification and the histological grade were both significantly higher in the A-bomb survivors than in the control group (P = 0.04 for each). The total length of CNA tended to be larger in the A-bomb survivors (P = 0.15). Correlation analysis of CNA and clinicopathological factors revealed that DLRSpread was significantly negatively correlated with the total length of CNA (P = 0.034, r = -0.40). Multivariate analysis with covariance revealed that the exposure to A-bomb was a significant (P = 0.005) independent factor which was associated with larger total length of CNA of breast cancers. Thus, archival FFPE tissues from A-bomb survivors are useful for genome-wide aCGH analysis. Our results suggested that A
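
    The abstract describes DLRSpread only as the spread of log ratio differences between consecutive probes. A minimal sketch of one robust way to compute such a quantity follows. The specific estimator (interquartile range of the differences, rescaled so that pure Gaussian probe noise with standard deviation s gives a value near s) is an assumption of this sketch, not something the abstract states, and the function name `dlr_spread` is hypothetical.

```python
import math
import random
import statistics


def dlr_spread(log_ratios):
    """Robust noise estimate from differences between consecutive probes.

    Assumed estimator: IQR of the consecutive differences, divided by
    1.349 (IQR of a unit normal) and by sqrt(2) (a difference of two
    independent values has sqrt(2) times the single-value spread).
    """
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    q1, _, q3 = statistics.quantiles(diffs, n=4, method="inclusive")
    return (q3 - q1) / (1.349 * math.sqrt(2))


# Simulated probe log ratios: pure noise with standard deviation 0.3.
random.seed(0)
simulated = [random.gauss(0.0, 0.3) for _ in range(20000)]
estimate = dlr_spread(simulated)
print(round(estimate, 2))
```

    Because the estimate is built from differences between neighbouring probes, a genuine copy number change shifts only the single difference at its breakpoint, so the measure mainly reflects probe-to-probe noise.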

  4. Forward-looking activities: incorporating citizens' visions: A critical analysis of the CIVISTI method.

    Science.gov (United States)

    Gudowsky, Niklas; Peissl, Walter; Sotoudeh, Mahshid; Bechtold, Ulrike

    2012-11-01

    Looking back on the many prophets who tried to predict the future as if it were predetermined, at first sight any forward-looking activity is reminiscent of making predictions with a crystal ball. In contrast to fortune tellers, today's exercises do not predict, but try to show different paths that an open future could take. A key motivation to undertake forward-looking activities is broadening the information basis for decision-makers to help them actively shape the future in a desired way. Experts, laypeople, or stakeholders may have different sets of values and priorities with regard to pending decisions on any issue related to the future. Therefore, considering and incorporating their views can, in the best case scenario, lead to more robust decisions and strategies. However, transferring this plurality into a form that decision-makers can consider is a challenge in terms of both design and facilitation of participatory processes. In this paper, we will introduce and critically assess a new qualitative method for forward-looking activities, namely CIVISTI (Citizen Visions on Science, Technology and Innovation; www.civisti.org), which was developed during an EU project of the same name. Focussing strongly on participation, with clear roles for citizens and experts, the method combines expert, stakeholder and lay knowledge to elaborate recommendations for decision-making in issues related to today's and tomorrow's science, technology and innovation. Consisting of three steps, the process starts with citizens' visions of a future 30-40 years from now. Experts then translate these visions into practical recommendations which the same citizens then validate and prioritise to produce a final product. The following paper will highlight the added value as well as limits of the CIVISTI method and will illustrate potential for the improvement of future processes. PMID:23204998

  5. A Fully Automated and Robust Method to Incorporate Stamping Data in Crash, NVH and Durability Analysis

    Science.gov (United States)

    Palaniswamy, Hariharasudhan; Kanthadai, Narayan; Roy, Subir; Beauchesne, Erwan

    2011-08-01

    Crash, NVH (Noise, Vibration, Harshness), and durability analysis are commonly deployed in structural CAE analysis for mechanical design of components especially in the automotive industry. Components manufactured by stamping constitute a major portion of the automotive structure. In CAE analysis they are modeled at a nominal state with uniform thickness and no residual stresses and strains. However, in reality the stamped components have non-uniformly distributed thickness and residual stresses and strains resulting from stamping. It is essential to consider the stamping information in CAE analysis to accurately model the behavior of the sheet metal structures under different loading conditions. Especially with the current emphasis on weight reduction by replacing conventional steels with aluminum and advanced high strength steels it is imperative to avoid over design. Considering this growing need in industry, a highly automated and robust method has been integrated within Altair Hyperworks® to initialize sheet metal components in CAE models with stamping data. This paper demonstrates this new feature and the influence of stamping data for a full car frontal crash analysis.

  6. Exhalation analysis as a method of monitoring the incorporation of thorium

    International Nuclear Information System (INIS)

    The measurement techniques of whole body counting and analysis of excretion are poorly suited to the routine monitoring of thorium body activity. The suitability of exhalation analysis was examined using published data. With the most sensitive thoron monitors it is possible to determine approximately 1 mBq/l Rn-220 in human breath. The relationship between exhaled Rn-220 and the Th-232 activity deposited in the liver and spleen of Thorotrast patients was about 2-3 mBq/l Rn-220 per Bq Th-232. According to these data, exhalation analysis makes it practicable to estimate a Th-232 body burden of 0.5 to 1 Bq (equivalent to 30% of the annual limit for intake of Th-232). (orig.)

  7. Genomic analysis of the necrotrophic fungal pathogens Sclerotinia sclerotiorum and Botrytis cinerea.

    Science.gov (United States)

    Amselem, Joelle; Cuomo, Christina A; van Kan, Jan A L; Viaud, Muriel; Benito, Ernesto P; Couloux, Arnaud; Coutinho, Pedro M; de Vries, Ronald P; Dyer, Paul S; Fillinger, Sabine; Fournier, Elisabeth; Gout, Lilian; Hahn, Matthias; Kohn, Linda; Lapalu, Nicolas; Plummer, Kim M; Pradier, Jean-Marc; Quévillon, Emmanuel; Sharon, Amir; Simon, Adeline; ten Have, Arjen; Tudzynski, Bettina; Tudzynski, Paul; Wincker, Patrick; Andrew, Marion; Anthouard, Véronique; Beever, Ross E; Beffa, Rolland; Benoit, Isabelle; Bouzid, Ourdia; Brault, Baptiste; Chen, Zehua; Choquer, Mathias; Collémare, Jérome; Cotton, Pascale; Danchin, Etienne G; Da Silva, Corinne; Gautier, Angélique; Giraud, Corinne; Giraud, Tatiana; Gonzalez, Celedonio; Grossetete, Sandrine; Güldener, Ulrich; Henrissat, Bernard; Howlett, Barbara J; Kodira, Chinnappa; Kretschmer, Matthias; Lappartient, Anne; Leroch, Michaela; Levis, Caroline; Mauceli, Evan; Neuvéglise, Cécile; Oeser, Birgitt; Pearson, Matthew; Poulain, Julie; Poussereau, Nathalie; Quesneville, Hadi; Rascle, Christine; Schumacher, Julia; Ségurens, Béatrice; Sexton, Adrienne; Silva, Evelyn; Sirven, Catherine; Soanes, Darren M; Talbot, Nicholas J; Templeton, Matt; Yandava, Chandri; Yarden, Oded; Zeng, Qiandong; Rollins, Jeffrey A; Lebrun, Marc-Henri; Dickman, Marty

    2011-08-01

    Sclerotinia sclerotiorum and Botrytis cinerea are closely related necrotrophic plant pathogenic fungi notable for their wide host ranges and environmental persistence. These attributes have made these species models for understanding the complexity of necrotrophic, broad host-range pathogenicity. Despite their similarities, the two species differ in mating behaviour and the ability to produce asexual spores. We have sequenced the genomes of one strain of S. sclerotiorum and two strains of B. cinerea. The comparative analysis of these genomes relative to one another and to other sequenced fungal genomes is provided here. Their 38-39 Mb genomes include 11,860-14,270 predicted genes, which share 83% amino acid identity on average between the two species. We have mapped the S. sclerotiorum assembly to 16 chromosomes and found large-scale co-linearity with the B. cinerea genomes. Seven percent of the S. sclerotiorum genome comprises transposable elements compared to <1% of B. cinerea. The arsenal of genes associated with necrotrophic processes is similar between the species, including genes involved in plant cell wall degradation and oxalic acid production. Analysis of secondary metabolism gene clusters revealed an expansion in number and diversity of B. cinerea-specific secondary metabolites relative to S. sclerotiorum. The potential diversity in secondary metabolism might be involved in adaptation to specific ecological niches. Comparative genome analysis revealed the basis of differing sexual mating compatibility systems between S. sclerotiorum and B. cinerea. The organization of the mating-type loci differs, and their structures provide evidence for the evolution of heterothallism from homothallism. These data shed light on the evolutionary and mechanistic bases of the genetically complex traits of necrotrophic pathogenicity and sexual mating. This resource should facilitate the functional studies designed to better understand what makes these fungi such successful

  8. A genome-wide 20 K citrus microarray for gene expression analysis

    Directory of Open Access Journals (Sweden)

    Gadea Jose

    2008-07-01

    Background: Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with a high genome coverage. In recent years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results: We have designed and constructed a publicly available genome-wide cDNA microarray that includes 21,081 putative unigenes of citrus. As a functional companion to the microarray, a web-browsable database [1] was created and populated with information about the unigenes represented in the microarray, including cDNA libraries, isolated clones, raw and processed nucleotide and protein sequences, and results of all the structural and functional annotation of the unigenes, such as general description, BLAST hits, putative Arabidopsis orthologs, microsatellites, putative SNPs, GO classification and PFAM domains. We have performed a Gene Ontology comparison with the full set of Arabidopsis proteins to estimate the genome coverage of the microarray. We have also performed microarray hybridizations to check its usability. Conclusion: This new cDNA microarray replaces the first 7K microarray generated two years ago and allows gene expression analysis at a more global scale. We have followed a rational design to minimize cross-hybridization while maintaining its utility for different citrus species. Furthermore, we also provide access to a website with full structural and functional annotation of the unigenes represented in the microarray, along with the ability to use this site to directly perform gene expression analysis using standard tools at different publicly available servers. Furthermore, we show how this microarray offers a good representation of the citrus genome and present the usefulness of this genomic tool for global

  9. Genome inventory and analysis of nuclear hormone receptors in Tetraodon nigroviridis

    Indian Academy of Sciences (India)

    Raghu Prasad Rao Metpally; Ramakrishnan Vigneshwar; Ramanathan Sowdhamini

    2007-01-01

    Nuclear hormone receptors (NRs) form a large superfamily of ligand-activated transcription factors, which regulate genes underlying a wide range of (patho)physiological phenomena. Availability of the full genome sequence of Tetraodon nigroviridis facilitated a genome-wide analysis of the NRs in a fish genome. Seventy-one NRs were found in Tetraodon and were compared with mammalian and fish NR family members. In general, there is a higher representation of NRs in fish genomes compared to mammalian ones. They showed high diversity across classes as observed by phylogenetic analysis. Nucleotide substitution rates show strong negative selection among fish NRs except for pregnane X receptor (PXR), estrogen receptor (ER) and liver X receptor (LXR). This may be attributed to the crucial role they play in metabolism and detoxification of xenobiotic and endobiotic compounds, which might have resulted in slight positive selection. Chromosomal mapping and pairwise comparisons of NR distribution in Tetraodon and humans led to the identification of nine syntenic NR regions, of which three are common among fully sequenced vertebrate genomes. Gene structure analysis shows strong conservation of exon structures among orthologues, whereas paralogous members show different splicing patterns. Intron gain or loss and addition or substitution of exons played a major role in the evolution of the NR superfamily.

  10. Teaching for Art Criticism: Incorporating Feldman's Critical Analysis Learning Model in Students' Studio Practice

    Science.gov (United States)

    Subramaniam, Maithreyi; Hanafi, Jaffri; Putih, Abu Talib

    2016-01-01

    This study adopted 30 first year graphic design students' artwork, with critical analysis using Feldman's model of art criticism. Data were analyzed quantitatively; descriptive statistical techniques were employed. The scores were viewed in the form of mean score and frequencies to determine students' performances in their critical ability.…

  11. Effect of Adjacent Structures on Foundation Response of Tower Building from SSI Analysis Incorporating Wave Incoherence

    International Nuclear Information System (INIS)

    The seismic response at the foundation of a large building caused by strong ground motion tends to be less intense than the corresponding free-field motion, especially in the high frequency range. To explain this phenomenon and to apply it to practical soil-structure interaction (SSI) analysis, the concept of wave incoherence (or spatial variation) was introduced. The spatial variation of ground motion can be quantified by a coherency function, and several coherency functions have been developed for engineering purposes. However, there has been little investigation of their application to SSI analysis and design for buildings influenced by adjacent structures. This paper focuses on the seismic response of a building whose foundation lies between those of nearby structures. Specifically, a tower building consisting of steel and concrete is modeled, and the building is assumed to be located on rock media. Analyses are categorized into four cases according to the type of foundation and the existence of adjacent structures. For each case, the results from incoherent SSI analysis are compared with those from coherent analysis to investigate the effect on the seismic response of the building.

  12. Analysis of Product Sampling for New Product Diffusion Incorporating Multiple-Unit Ownership

    Directory of Open Access Journals (Sweden)

    Zhineng Hu

    2014-01-01

    Multiple-unit ownership of nondurable products is an important component of sales in many product categories. Based on the Bass model, this paper develops a new model that treats multiple-unit adoptions as a diffusion process under the influence of product sampling. The analysis aims to determine the optimal dynamic sampling effort for a firm. The results demonstrate that experience sampling can accelerate the diffusion process and that the best time to send free samples is just before the product is launched. Multiple-unit purchasing behavior can increase sales and profit for a firm, but it requires more samples to make the product better known. The local sensitivity analysis shows that an increase in either the external or the internal coefficients has a negative influence on the sampling level, but the internal influence on subsequent multiple-unit adoptions has little influence on the sampling. Using logistic regression along with linear regression, the global sensitivity analysis examines the interaction of all factors. It shows that the external influence and the multiple-unit purchase rate are the two most important factors influencing the sampling level and the net present value of the new product, and it presents a two-stage method to determine the sampling level.
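
    For orientation, the standard Bass model on which the paper builds can be simulated in a few lines. The sketch below is the basic discrete-time Bass model only, with an external (innovation) coefficient p, an internal (imitation) coefficient q and a market potential m. It does not include the sampling or multiple-unit extensions described in the abstract, and the parameter values are illustrative assumptions rather than values from the paper.

```python
def bass_adoptions(p: float, q: float, m: float, periods: int) -> list:
    """Simulate new adopters per period under the basic Bass model.

    n(t) = (p + q * N(t-1) / m) * (m - N(t-1)), where N is the
    cumulative number of adopters at the end of the previous period.
    """
    cumulative = 0.0
    new_adopters = []
    for _ in range(periods):
        n_t = (p + q * cumulative / m) * (m - cumulative)
        new_adopters.append(n_t)
        cumulative += n_t
    return new_adopters


# Illustrative parameters (assumed, not taken from the paper).
sales = bass_adoptions(p=0.03, q=0.38, m=1000.0, periods=10)
for period, n_t in enumerate(sales, start=1):
    print(period, round(n_t, 1))
```

    In this basic form, giving away free samples before launch can be thought of as starting the simulation with a non-zero cumulative adopter count, which raises the imitation term from the first period onward.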

  13. Classification Based on Hierarchical Linear Models: The Need for Incorporation of Social Contexts in Classification Analysis

    Science.gov (United States)

    Vaughn, Brandon K.; Wang, Qui

    2009-01-01

    Many areas in educational and psychological research involve the use of classification statistical analysis. For example, school districts might be interested in attaining variables that provide optimal prediction of school dropouts. In psychology, a researcher might be interested in the classification of a subject into a particular psychological…

  14. Protein Analysis Using Real-Time PCR Instrumentation: Incorporation in an Integrated, Inquiry-Based Project

    Science.gov (United States)

    Southard, Jonathan N.

    2014-01-01

    Instrumentation for real-time PCR is used primarily for amplification and quantitation of nucleic acids. The capability to measure fluorescence while controlling temperature in multiple samples can also be applied to the analysis of proteins. Conformational stability and changes in stability due to ligand binding are easily assessed. Protein…

  15. Incorporating Asymmetric Dependency Patterns in the Evaluation of IS/IT projects Using Real Option Analysis

    Science.gov (United States)

    Burke, John C.

    2012-01-01

    The objective of my dissertation is to create a general approach to evaluating IS/IT projects using Real Option Analysis (ROA). This is an important problem because an IT Project Portfolio (ITPP) can represent hundreds of projects, millions of dollars of investment and hundreds of thousands of employee hours. Therefore, any advance in the…

  16. Comparative genomic analysis of Acidithiobacillus ferrooxidans strains using the A. ferrooxidans ATCC 23270 whole-genome oligonucleotide microarray.

    Science.gov (United States)

    Luo, Hailang; Shen, Li; Yin, Huaqun; Li, Qian; Chen, Qijiong; Luo, Yanjie; Liao, Liqin; Qiu, Guanzhou; Liu, Xueduan

    2009-05-01

    Acidithiobacillus ferrooxidans is an important microorganism used in biomining operations for metal recovery. Whole-genomic diversity analysis based on the oligonucleotide microarray was used to analyze the gene content of 12 strains of A. ferrooxidans purified from various mining areas in China. Among the 3100 open reading frames (ORFs) on the slides, 1235 ORFs were absent in at least 1 strain of bacteria and 1385 ORFs were conserved in all strains. The hybridization results showed that these strains were highly diverse from a genomic perspective. The hybridization results of 4 major functional gene categories, namely electron transport, carbon metabolism, extracellular polysaccharides, and detoxification, were analyzed. Based on the hybridization signals obtained, a phylogenetic tree was built to analyze the evolution of the 12 tested strains, which indicated that geographic distribution was the main factor influencing the diversity of these strains. Based on the hybridization signals of genes associated with bioleaching, another phylogenetic tree showed an evolutionary relationship from which the correlation between the clustering of specific genes and geochemistry could be observed. The results revealed that the main factor was geochemistry, among which the following 6 factors were the most important: pH, Mg, Cu, S, Fe, and Al. PMID:19483787

  17. Genome-wide comparative analysis reveals possible common ancestors of nucleotide-binding sites domain containing genes in hybrid Citrus sinensis genome and original Citrus clementina genome

    Science.gov (United States)

    We identified and re-annotated candidate disease resistance (R) genes with nucleotide-binding sites (NBS) domain from a Citrus clementina genome and two complete Citrus sinensis genome sequences (one from the USA and one from China). We found similar numbers of NBS genes from three citrus genomes, r...

  18. Analysis of the Legionella longbeachae genome and transcriptome uncovers unique strategies to cause Legionnaires' disease.

    Directory of Open Access Journals (Sweden)

    Christel Cazalet

    2010-02-01

    Legionella pneumophila and L. longbeachae are two species of a large genus of bacteria that are ubiquitous in nature. L. pneumophila is mainly found in natural and artificial water circuits while L. longbeachae is mainly present in soil. Under the appropriate conditions both species are human pathogens, capable of causing a severe form of pneumonia termed Legionnaires' disease. Here we report the sequencing and analysis of four L. longbeachae genomes: one complete genome sequence of L. longbeachae strain NSW150 serogroup (Sg) 1, and three draft genome sequences, one more belonging to Sg1 and two to Sg2. The genome organization and gene content of the four L. longbeachae genomes are highly conserved, indicating strong pressure for niche adaptation. Analysis and comparison of L. longbeachae strain NSW150 with L. pneumophila revealed common but also unexpected features specific to this pathogen. The interaction with host cells shows distinct features from L. pneumophila, as L. longbeachae possesses a unique repertoire of putative Dot/Icm type IV secretion system substrates, eukaryotic-like and eukaryotic domain proteins, and encodes additional secretion systems. However, analysis of the ability of a dotA mutant of L. longbeachae NSW150 to replicate in Acanthamoeba castellanii and in a mouse lung infection model showed that the Dot/Icm type IV secretion system is also essential for the virulence of L. longbeachae. In contrast to L. pneumophila, L. longbeachae does not encode flagella, thereby providing a possible explanation for differences in mouse susceptibility to infection between the two pathogens. Furthermore, transcriptome analysis revealed that L. longbeachae has a less pronounced biphasic life cycle as compared to L. pneumophila, and genome analysis and electron microscopy suggested that L. longbeachae is encapsulated. These species-specific differences may account for the different environmental niches and disease epidemiology of these

  19. A genomic background based method for association analysis in related individuals.

    Directory of Open Access Journals (Sweden)

    Najaf Amin

    Full Text Available BACKGROUND: Feasibility of genotyping of hundreds and thousands of single nucleotide polymorphisms (SNPs) in thousands of study subjects has triggered the need for fast, powerful, and reliable methods for genome-wide association analysis. Here we consider a situation when study participants are genetically related (e.g. due to systematic sampling of families or because a study was performed in a genetically isolated population). Of the available methods that account for relatedness, the Measured Genotype (MG) approach is considered the 'gold standard'. However, MG is not efficient with respect to time taken for the analysis of genome-wide data. In this context we proposed a fast two-step method called Genome-wide Association using Mixed Model and Regression (GRAMMAR) for the analysis of pedigree-based quantitative traits. This method certainly overcomes the drawback of time limitation of the MG approach, but pays in power. One of the major drawbacks of both MG and GRAMMAR is that they crucially depend on the availability of complete and correct pedigree data, which is rarely available. METHODOLOGY: In this study we first explore type 1 error and relative power of MG, GRAMMAR, and Genomic Control (GC) approaches for genetic association analysis. Secondly, we propose an extension to GRAMMAR, i.e. GRAMMAR-GC. Finally, we propose application of GRAMMAR-GC using the kinship matrix estimated through genomic marker data, instead of (possibly missing and/or incorrect) genealogy. CONCLUSION: Through simulations we show that the MG approach maintains high power across a range of heritabilities and possible pedigree structures, and always outperforms other contemporary methods. We also show that the power of our proposed GRAMMAR-GC approaches that of the 'gold standard' MG for all models and pedigrees studied. We show that this method is both feasible and powerful and has correct type 1 error in the context of genome-wide association analysis
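
    The Genomic Control (GC) step named in this abstract can be illustrated with a minimal sketch. It assumes the standard GC formulation, which the abstract itself does not spell out: the inflation factor lambda is the median of the observed 1-df chi-square statistics divided by the expected null median (about 0.4549), and every statistic is divided by lambda. How the authors combine this with GRAMMAR is not described here; the function names and the sample values below are hypothetical.

```python
from statistics import median

# Approximate median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549

def inflation_factor(chi2_stats):
    """Estimate lambda as observed median / expected null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN

def genomic_control(chi2_stats):
    """Deflate test statistics by lambda (assumption: lambda below 1 is set to 1)."""
    lam = max(1.0, inflation_factor(chi2_stats))
    return [s / lam for s in chi2_stats], lam

# Hypothetical 1-df association statistics for five SNPs.
stats = [0.2, 0.5, 0.91, 1.3, 6.0]
corrected, lam = genomic_control(stats)
```

    After correction the median statistic equals the null median, which is the property GC is designed to restore when relatedness inflates the test statistics.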

  20. Genome-wide analysis of Polycomb targets in Drosophila

    Energy Technology Data Exchange (ETDEWEB)

    Schwartz, Yuri B.; Kahn, Tatyana G.; Nix, David A.; Li, Xiao-Yong; Bourgon, Richard; Biggin, Mark; Pirrotta, Vincenzo

    2006-04-01

    Polycomb Group (PcG) complexes are multiprotein assemblages that bind to chromatin and establish chromatin states leading to epigenetic silencing. PcG proteins regulate homeotic genes in flies and vertebrates but little is known about other PcG targets and the role of the PcG in development, differentiation and disease. We have determined the distribution of the PcG proteins PC, E(Z) and PSC and of histone H3K27 trimethylation in the Drosophila genome. At more than 200 PcG target genes, binding sites for the three PcG proteins colocalize to presumptive Polycomb Response Elements (PREs). In contrast, H3 me3K27 forms broad domains including the entire transcription unit and regulatory regions. PcG targets are highly enriched in genes encoding transcription factors but receptors, signaling proteins, morphogens and regulators representing all major developmental pathways are also included.

  1. Comparative genomic analysis of human fungal pathogens causing paracoccidioidomycosis.

    Directory of Open Access Journals (Sweden)

    Christopher A Desjardins

    2011-10-01

    Full Text Available Paracoccidioides is a fungal pathogen and the cause of paracoccidioidomycosis, a health-threatening human systemic mycosis endemic to Latin America. Infection by Paracoccidioides, a dimorphic fungus in the order Onygenales, is coupled with a thermally regulated transition from a soil-dwelling filamentous form to a yeast-like pathogenic form. To better understand the genetic basis of growth and pathogenicity in Paracoccidioides, we sequenced the genomes of two strains of Paracoccidioides brasiliensis (Pb03 and Pb18) and one strain of Paracoccidioides lutzii (Pb01). These genomes range in size from 29.1 Mb to 32.9 Mb and encode 7,610 to 8,130 genes. To enable genetic studies, we mapped 94% of the P. brasiliensis Pb18 assembly onto five chromosomes. We characterized gene family content across Onygenales and related fungi, and within Paracoccidioides we found expansions of the fungal-specific kinase family FunK1. Additionally, the Onygenales have lost many genes involved in carbohydrate metabolism and fewer genes involved in protein metabolism, resulting in a higher ratio of proteases to carbohydrate active enzymes in the Onygenales than their relatives. To determine if gene content correlated with growth on different substrates, we screened the non-pathogenic onygenale Uncinocarpus reesii, which has orthologs for 91% of Paracoccidioides metabolic genes, for growth on 190 carbon sources. U. reesii showed growth on a limited range of carbohydrates, primarily basic plant sugars and cell wall components; this suggests that Onygenales, including dimorphic fungi, can degrade cellulosic plant material in the soil. In addition, U. reesii grew on gelatin and a wide range of dipeptides and amino acids, indicating a preference for proteinaceous growth substrates over carbohydrates, which may enable these fungi to also degrade animal biomass. These capabilities for degrading plant and animal substrates suggest a duality in lifestyle that could enable pathogenic

  2. Digital gene expression analysis of the zebra finch genome

    Directory of Open Access Journals (Sweden)

    Burke Terry

    2010-04-01

    Full Text Available Background: In order to understand patterns of adaptation and molecular evolution it is important to quantify both variation in gene expression and nucleotide sequence divergence. Gene expression profiling in non-model organisms has recently been facilitated by the advent of massively parallel sequencing technology. Here we investigate tissue specific gene expression patterns in the zebra finch (Taeniopygia guttata) with special emphasis on the genes of the major histocompatibility complex (MHC). Results: Almost 2 million 454-sequencing reads from cDNA of six different tissues were assembled and analysed. A total of 11,793 zebra finch transcripts were represented in this EST data, indicating a transcriptome coverage of about 65%. There was a positive correlation between the tissue specificity of gene expression and non-synonymous to synonymous nucleotide substitution ratio of genes, suggesting that genes with a specialised function are evolving at a higher rate (or with less constraint) than genes with a more general function. In line with this, there was also a negative correlation between overall expression levels and expression specificity of contigs. We found evidence for expression of 10 different genes related to the MHC. MHC genes showed relatively tissue specific expression levels and were in general primarily expressed in spleen. Several MHC genes, including MHC class I, also showed expression in brain. Furthermore, for all genes with highest levels of expression in spleen there was an overrepresentation of several gene ontology terms related to immune function. Conclusions: Our study highlights the usefulness of next-generation sequence data for quantifying gene expression in the genome as a whole as well as in specific candidate genes. Overall, the data show predicted patterns of gene expression profiles and molecular evolution in the zebra finch genome. Expression of MHC genes in particular, corresponds well with expression

  3. Psittacid Herpesvirus 1 and Infectious Laryngotracheitis Virus: Comparative Genome Sequence Analysis of Two Avian Alphaherpesviruses

    Science.gov (United States)

    Thureen, Dean R.; Keeler, Calvin L.

    2006-01-01

    Psittacid herpesvirus 1 (PsHV-1) is the causative agent of Pacheco's disease, an acute, highly contagious, and potentially lethal respiratory herpesvirus infection in psittacine birds, while infectious laryngotracheitis virus (ILTV) is a highly contagious and economically significant avian herpesvirus which is responsible for an acute respiratory disease limited to galliform birds. The complete genome sequence of PsHV-1 has been determined and compared to the ILTV sequence, assembled from published data. The PsHV-1 and ILTV genomes exhibit similar structural characteristics and are 163,025 bp and 148,665 bp in length, respectively. The PsHV-1 genome contains 73 predicted open reading frames (ORFs), while the ILTV genome contains 77 predicted ORFs. Both genomes contain an inversion in the unique long region similar to that observed in pseudorabies virus. PsHV-1 is closely related to ILTV, and it is proposed that it be assigned to the Iltovirus genus. These two avian herpesviruses represent a phylogenetically unique clade of alphaherpesviruses that are distinct from the Marek's disease-like viruses (Mardivirus). The determination of the complete genomic nucleotide sequences of PsHV-1 and ILTV provides a tool for further comparative and functional analysis of this unique class of avian alphaherpesviruses. PMID:16873243

  4. Comparative genomic analysis of multiple strains of two unusual plant pathogens: Pseudomonas corrugata and Pseudomonas mediterranea

    Directory of Open Access Journals (Sweden)

    Emmanouil A Trantas

    2015-08-01

    Full Text Available The non-fluorescent pseudomonads, Pseudomonas corrugata (Pcor) and P. mediterranea (Pmed), are closely related species that cause pith necrosis, a disease of tomato that causes severe crop losses. However, they also show strong antagonistic effects against economically important pathogens, demonstrating their potential for utilization as biological control agents. In addition, their metabolic versatility makes them attractive for the production of commercial biomolecules and bioremediation. An extensive comparative genomics study is required to dissect the mechanisms that Pcor and Pmed employ to cause disease, prevent disease caused by other pathogens, and to mine their genomes for commercially significant chemical pathways. Here, we present the draft genomes of nine Pcor and Pmed strains from different geographical locations. This analysis covered significant genetic heterogeneity and allowed in-depth genomic comparison. All examined strains were able to trigger symptoms in tomato plants but not all induced a hypersensitive-like response in Nicotiana benthamiana. Genome-mining revealed the absence of a type III secretion system and of known type III effectors from all examined Pcor and Pmed strains. The lack of a type III secretion system appears to be unique among the plant pathogenic pseudomonads. Several gene clusters coding for type VI secretion system were detected in all genomes.

  5. Analysis of transposable elements in the genome of Asparagus officinalis from high coverage sequence data.

    Directory of Open Access Journals (Sweden)

    Shu-Fen Li

    Full Text Available Asparagus officinalis is an economically and nutritionally important vegetable crop that is widely cultivated and is used as a model dioecious species to study plant sex determination and sex chromosome evolution. To improve our understanding of its genome composition, especially with respect to transposable elements (TEs), which make up the majority of the genome, we performed Illumina HiSeq2000 sequencing of both male and female asparagus genomes followed by bioinformatics analysis. We generated 17 Gb of sequence (12×coverage) and assembled them into 163,406 scaffolds with a total cumulated length of 400 Mbp, which represent about 30% of asparagus genome. Overall, TEs masked about 53% of the A. officinalis assembly. Majority of the identified TEs belonged to LTR retrotransposons, which constitute about 28% of genomic DNA, with Ty1/copia elements being more diverse and accumulated to higher copy numbers than Ty3/gypsy. Compared with LTR retrotransposons, non-LTR retrotransposons and DNA transposons were relatively rare. In addition, comparison of the abundance of the TE groups between male and female genomes showed that the overall TE composition was highly similar, with only slight differences in the abundance of several TE groups, which is consistent with the relatively recent origin of asparagus sex chromosomes. This study greatly improves our knowledge of the repetitive sequence construction of asparagus, which facilitates the identification of TEs responsible for the early evolution of plant sex chromosomes and is helpful for further studies on this dioecious plant.

  6. Analysis of transposable elements in the genome of Asparagus officinalis from high coverage sequence data.

    Science.gov (United States)

    Li, Shu-Fen; Gao, Wu-Jun; Zhao, Xin-Peng; Dong, Tian-Yu; Deng, Chuan-Liang; Lu, Long-Dou

    2014-01-01

    Asparagus officinalis is an economically and nutritionally important vegetable crop that is widely cultivated and is used as a model dioecious species to study plant sex determination and sex chromosome evolution. To improve our understanding of its genome composition, especially with respect to transposable elements (TEs), which make up the majority of the genome, we performed Illumina HiSeq2000 sequencing of both male and female asparagus genomes followed by bioinformatics analysis. We generated 17 Gb of sequence (12×coverage) and assembled them into 163,406 scaffolds with a total cumulated length of 400 Mbp, which represent about 30% of asparagus genome. Overall, TEs masked about 53% of the A. officinalis assembly. Majority of the identified TEs belonged to LTR retrotransposons, which constitute about 28% of genomic DNA, with Ty1/copia elements being more diverse and accumulated to higher copy numbers than Ty3/gypsy. Compared with LTR retrotransposons, non-LTR retrotransposons and DNA transposons were relatively rare. In addition, comparison of the abundance of the TE groups between male and female genomes showed that the overall TE composition was highly similar, with only slight differences in the abundance of several TE groups, which is consistent with the relatively recent origin of asparagus sex chromosomes. This study greatly improves our knowledge of the repetitive sequence construction of asparagus, which facilitates the identification of TEs responsible for the early evolution of plant sex chromosomes and is helpful for further studies on this dioecious plant. PMID:24810432
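
    The sequencing figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below derives the genome size implied by the coverage figure and by the assembled fraction; both derived values are back-of-the-envelope estimates computed here, not numbers reported by the authors, and the variable names are illustrative.

```python
# Figures taken from the abstract.
sequence_gb = 17.0        # total sequence generated, in Gb
coverage = 12.0           # reported fold coverage
assembly_mbp = 400.0      # cumulated scaffold length, in Mbp
assembly_fraction = 0.30  # share of the genome the assembly represents

# Genome size implied by each pair of figures (derived, not reported).
size_from_coverage_gb = sequence_gb / coverage
size_from_assembly_gb = assembly_mbp / assembly_fraction / 1000.0

# Relative disagreement between the two estimates.
relative_gap = abs(size_from_coverage_gb - size_from_assembly_gb) / size_from_assembly_gb
```

    Both routes give a genome size somewhat above 1.3 Gb, so the coverage and assembly figures are mutually consistent to within a few percent.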

  7. Genomic analysis of natural selection and phenotypic variation in high-altitude mongolians.

    Directory of Open Access Journals (Sweden)

    Jinchuan Xing

    Full Text Available Deedu (DU) Mongolians, who migrated from the Mongolian steppes to the Qinghai-Tibetan Plateau approximately 500 years ago, are challenged by environmental conditions similar to native Tibetan highlanders. Identification of adaptive genetic factors in this population could provide insight into coordinated physiological responses to this environment. Here we examine genomic and phenotypic variation in this unique population and present the first complete analysis of a Mongolian whole-genome sequence. High-density SNP array data demonstrate that DU Mongolians share genetic ancestry with other Mongolian as well as Tibetan populations, specifically in genomic regions related to adaptation to high altitude. Several selection candidate genes identified in DU Mongolians are shared with other Asian groups (e.g., EDAR), neighboring Tibetan populations (including high-altitude candidates EPAS1, PKLR, and CYP2E1), as well as genes previously hypothesized to be associated with metabolic adaptation (e.g., PPARG). Hemoglobin concentration, a trait associated with high-altitude adaptation in Tibetans, is at an intermediate level in DU Mongolians compared to Tibetans and Han Chinese at comparable altitude. Whole-genome sequence from a DU Mongolian (Tianjiao1) shows that about 2% of the genomic variants, including more than 300 protein-coding changes, are specific to this individual. Our analyses of DU Mongolians and the first Mongolian genome provide valuable insight into genetic adaptation to extreme environments.

  8. Genome-wide association analysis of thirty one production, health, reproduction and body conformation traits in contemporary U.S. Holstein cows

    Science.gov (United States)

    Background Genome-wide association analysis is a powerful tool for annotating phenotypic effects on the genome and knowledge of genes and chromosomal regions associated with dairy phenotypes is useful for genome and gene-based selection. Here, we report results of a genome-wide analysis of predicted...

  9. Forensic analysis of Windows' virtual memory incorporating the system's page-file

    OpenAIRE

    Stimson, Jared M.

    2008-01-01

    Computer Forensics is concerned with the use of computer investigation and analysis techniques in order to collect evidence suitable for presentation in court. The examination of volatile memory is a relatively new but important area in computer forensics. More recently criminals are becoming more forensically aware and are now able to compromise computers without accessing the hard disk of the target computer. This means that traditional incident response practice of pulling the plug wil...

  10. Analysis of Product Sampling for New Product Diffusion Incorporating Multiple-Unit Ownership

    OpenAIRE

    Zhineng Hu; Yurong Pei; Ruikun Xie

    2014-01-01

    Multiple-unit ownership of nondurable products is an important component of sales in many product categories. Based on the Bass model, this paper develops a new model considering the multiple-unit adoptions as a diffusion process under the influence of product sampling. Though the analysis aims to determine the optimal dynamic sampling effort for a firm and the results demonstrate that experience sampling can accelerate the diffusion process, the best time to send free samples is just before ...

  11. Acoustic Analysis and Speech Intelligibility in Patients Wearing Conventional Dentures and Rugae Incorporated Dentures

    OpenAIRE

    Adaki, Raghavendra; Meshram, Suresh; Adaki, Shridevi

    2013-01-01

    Phonetics is an important function of oral cavity. It has been overlooked quite frequently while fabricating the complete dentures. In this study modification of anterior palatal surface of denture is done and assessed for its impact on phonetics. Purpose is to assess acoustic and speech intelligibility analysis in edentulous patients and also to evaluate the influence of conventional dentures, arbitrary rugae and customized rugae dentures on speech in complete denture wearers. Ten healthy ed...

  12. Direct voltammetric analysis of DNA modified with enzymatically incorporated 7-deazapurines

    Czech Academy of Sciences Publication Activity Database

    Pivoňková, Hana; Horáková Brázdilová, Petra; Fojtová, Miloslava; Fojta, Miroslav

    2010-01-01

    Vol. 82, No. 16 (2010), pp. 6807-6813. ISSN 0003-2700 R&D Projects: GA AV ČR(CZ) IAA400040901; GA MŠk(CZ) LC06035 Institutional research plan: CEZ:AV0Z50040507; CEZ:AV0Z50040702 Keywords: modified DNA * 7-deazapurines * voltammetric analysis Subject RIV: BO - Biophysics Impact factor: 5.874, year: 2010

  13. Pan-Genome Analysis of Human Gastric Pathogen H. pylori: Comparative Genomics and Pathogenomics Approaches to Identify Regions Associated with Pathogenicity and Prediction of Potential Core Therapeutic Targets

    DEFF Research Database (Denmark)

    Ali, Amjad; Naz, Anam; Soares, Siomar C.;

    2015-01-01

    -genome approach; the predicted conserved gene families (1,193) constitute ~77% of the average H. pylori genome and 45% of the global gene repertoire of the species. Reverse vaccinology strategies have been adopted to identify and narrow down the potential core-immunogenic candidates. A total of 28 nonhost...... homolog proteins were characterized as universal therapeutic targets against H. pylori based on their functional annotation and protein-protein interaction. Finally, pathogenomics and genome plasticity analysis revealed 3 highly conserved and 2 highly variable putative pathogenicity islands in all...

  14. Practical methods for incorporating summary time-to-event data into meta-analysis

    Directory of Open Access Journals (Sweden)

    Burdett Sarah

    2007-06-01

    Full Text Available Background: In systematic reviews and meta-analyses, time-to-event outcomes are most appropriately analysed using hazard ratios (HRs). In the absence of individual patient data (IPD), methods are available to obtain HRs and/or associated statistics by carefully manipulating published or other summary data. Awareness and adoption of these methods is somewhat limited, perhaps because they are published in the statistical literature using statistical notation. Methods: This paper aims to 'translate' the methods for estimating a HR and associated statistics from published time-to-event-analyses into less statistical and more practical guidance and provide a corresponding, easy-to-use calculations spreadsheet, to facilitate the computational aspects. Results: A wider audience should be able to understand published time-to-event data in individual trial reports and use it more appropriately in meta-analysis. When faced with particular circumstances, readers can refer to the relevant sections of the paper. The spreadsheet can be used to assist them in carrying out the calculations. Conclusion: The methods cannot circumvent the potential biases associated with relying on published data for systematic reviews and meta-analysis. However, this practical guide should improve the quality of the analysis and subsequent interpretation of systematic reviews and meta-analyses that include time-to-event outcomes.
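
    The abstract does not list the individual calculations, so the following is only a sketch of one widely used calculation of this kind, assumed here for illustration: recovering ln(HR) and its standard error from a reported HR with a 95% confidence interval, then pooling trials by inverse-variance weighting. The function names and the trial values are hypothetical and are not taken from the paper or its spreadsheet.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval

def log_hr_and_se(hr, ci_lower, ci_upper, z=Z_95):
    """Return (ln HR, SE of ln HR) from a reported HR and its CI limits."""
    log_hr = math.log(hr)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return log_hr, se

def pooled_fixed_effect(estimates):
    """Inverse-variance (fixed-effect) pooled HR from (ln HR, SE) pairs."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled_log = sum(w * lhr for w, (lhr, _) in zip(weights, estimates)) / sum(weights)
    return math.exp(pooled_log)

# Hypothetical trial reports: (HR, lower 95% limit, upper 95% limit).
trials = [(0.80, 0.64, 1.00), (0.90, 0.60, 1.35)]
estimates = [log_hr_and_se(*t) for t in trials]
pooled_hr = pooled_fixed_effect(estimates)
```

    Working on the log scale keeps the confidence interval symmetric around the estimate, which is why the standard error is taken from the log limits rather than from the raw HR limits.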

  15. Identification and Characterization of Microsatellite Markers Derived from the Whole Genome Analysis of Taenia solium.

    Directory of Open Access Journals (Sweden)

    Mónica J Pajuelo

    2015-12-01

    Full Text Available Infections with Taenia solium are the most common cause of adult acquired seizures worldwide, and are the leading cause of epilepsy in developing countries. A better understanding of the genetic diversity of T. solium will improve parasite diagnostics and transmission pathways in endemic areas thereby facilitating the design of future control measures and interventions. Microsatellite markers are useful genome features, which enable strain typing and identification in complex pathogen genomes. Here we describe microsatellite identification and characterization in T. solium, providing information that will assist in global efforts to control this important pathogen. For genome sequencing, T. solium cysts and proglottids were collected from Huancayo and Puno in Peru, respectively. Using next generation sequencing (NGS) and de novo assembly, we assembled two draft genomes and one hybrid genome. Microsatellite sequences were identified and 36 of them were selected for further analysis. Twenty T. solium isolates were collected from Tumbes in the northern region, and twenty from Puno in the southern region of Peru. The size-polymorphism of the selected microsatellites was determined with multi-capillary electrophoresis. We analyzed the association between microsatellite polymorphism and the geographic origin of the samples. The predicted size of the hybrid (proglottid genome combined with cyst genome) T. solium genome was 111 MB with a GC content of 42.54%. A total of 7,979 contigs (>1,000 nt) were obtained. We identified 9,129 microsatellites in the Puno-proglottid genome and 9,936 in the Huancayo-cyst genome, with 5 or more repeats, ranging from mono- to hexa-nucleotide. Seven microsatellites were polymorphic and 29 were monomorphic within the analyzed isolates. T. solium tapeworms were classified into two genetic groups that correlated with the North/South geographic origin of the parasites. The availability of draft genomes for T. solium represents a
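
    The search criterion stated in this abstract (mono- to hexa-nucleotide motifs with 5 or more repeats) can be sketched with a regular expression. The abstract does not name the tool the authors used, so this is an illustrative stand-in rather than their pipeline; the function name and the toy sequence are hypothetical.

```python
import re

# A motif of 1-6 bases (lazy, so the shortest repeating unit is preferred)
# followed by at least 4 more copies, i.e. 5 or more repeats in total.
SSR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1{4,}")

def find_microsatellites(sequence):
    """Return (start, motif, repeat_count) for each non-overlapping repeat tract."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits

# Hypothetical toy sequence, not taken from the T. solium assemblies.
toy = "GGC" + "AT" * 5 + "CCG" + "T" * 7 + "ACG" * 5 + "TT"
hits = find_microsatellites(toy)
```

    Counting the lengths of such tracts across isolates is what makes a locus polymorphic or monomorphic in the sense used above; real pipelines usually add per-motif-length thresholds and handle ambiguous bases, which this sketch omits.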

  16. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes.

    Directory of Open Access Journals (Sweden)

    Yunsheng Wang

    Full Text Available In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes was shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.

  17. Comparative genome analysis of the high pathogenicity Salmonella Typhimurium strain UK-1.

    Directory of Open Access Journals (Sweden)

    Yingqin Luo

    Full Text Available Salmonella enterica serovar Typhimurium, a gram-negative facultative rod-shaped bacterium causing salmonellosis and foodborne disease, is one of the most common isolated Salmonella serovars in both developed and developing nations. Several S. Typhimurium genomes have been completed and many more genome-sequencing projects are underway. Comparative genome analysis of the multiple strains leads to a better understanding of the evolution of S. Typhimurium and its pathogenesis. S. Typhimurium strain UK-1 (belongs to phage type 1) is highly virulent when orally administered to mice and chickens and efficiently colonizes lymphoid tissues of these species. These characteristics make this strain a good choice for use in vaccine development. In fact, UK-1 has been used as the parent strain for a number of nonrecombinant and recombinant vaccine strains, including several commercial vaccines for poultry. In this study, we conducted a thorough comparative genome analysis of the UK-1 strain with other S. Typhimurium strains and examined the phenotypic impact of several genomic differences. Whole genomic comparison highlights an extremely close relationship between the UK-1 strain and other S. Typhimurium strains; however, many interesting genetic and genomic variations specific to UK-1 were explored. In particular, the deletion of a UK-1-specific gene that is highly similar to the gene encoding the T3SS effector protein NleC exhibited a significant decrease in oral virulence in BALB/c mice. The complete genetic complements in UK-1, especially those elements that contribute to virulence or aid in determining the diversity within bacterial species, provide key information in evaluating the functional characterization of important genetic determinants and for development of vaccines.

  18. [Incorporation of the Hazard Analysis and Critical Control Point system (HACCP) in food legislation].

    Science.gov (United States)

    Castellanos Rey, Liliana C; Villamil Jiménez, Luis C; Romero Prada, Jaime R

    2004-01-01

    The Hazard Analysis and Critical Control Point system (HACCP), recommended by different international organizations such as the Codex Alimentarius Commission, the World Trade Organization (WTO), the International Office of Epizootics (OIE) and the International Convention for Vegetables Protection (ICPV) amongst others, contributes to ensuring the innocuity of food along the agro-alimentary chain and requires Good Manufacturing Practices (GMP) for its implementation; GMPs are legislated in most countries. Since 1997, Colombia has set rules and legislation for application of the HACCP system in agreement with international standards. This paper discusses the potential and difficulties of enforcing the legislation and suggests some policy implications for food safety. PMID:15656068

  19. Thermoeconomic analysis incorporating the concept of ecological efficiency; Analise termoeconomica incorporando o conceito de eficiencia ecologica

    Energy Technology Data Exchange (ETDEWEB)

    Villela, I.A.C. [University of Sao Paulo (EEL/USP), Lorena, SP (Brazil). Coll. of Engineering. Dept. of Environment Science ], Email: iraides@debas.eel.usp.br; Silveira, J.L. [Universidade Estadual Paulista (UNESP), Guaratingueta, SP (Brazil). Dept. of Energy], Email: joseluz@feg.unesp.br

    2009-07-01

    A comparative analysis is presented of the pollution resulting from natural gas combustion for a thermoelectric power plant (230 MW) utilizing the combined cycle (CC) and recovering kettle, with no burning and with fuel complementary burning. Initially the CO2, SO2, NOx and Particulate Matter emission levels are determined. Later, the thermoelectric power plant environmental impact is evaluated through the utilization of a methodology based on the ecological efficiency (ε), a parameter that integrates in a single coefficient the aspects that define the environmental impact intensity, based on the fuel utilized, combustion technology, pollution index and power plant thermodynamic efficiency. The objective is to apply the concept of ecological efficiency in a thermoeconomic analysis method which utilizes a function diagram and allows the estimation of the electricity production cost. It is concluded that the use of a system with no complementary burning is better than the one with complementary burning, both from the ecological and the economical points of view. (author)

  20. Nonlinear Dynamic Analysis of Multi-component Mooring Lines Incorporating Line-seabed Interaction

    Directory of Open Access Journals (Sweden)

    V.J. Kurian

    2013-07-01

    Full Text Available In this study, a deterministic approach for the dynamic analysis of a multi-component mooring line was formulated. The floater motion responses were considered as the mooring line upper boundary conditions while the anchored point was considered as pinned. Lumped parameter approach was adopted for the mooring line modelling. The forces considered were the submerged weights of mooring/attachment, physical/added inertia, line tension, fluid/line relative drag forces and line/seabed reactive forces. The latter interactions were modelled assuming that the mooring line rested on an elastic dissipative foundation. An iterative procedure for the dynamic analysis was developed and results for various mooring lines partially lying on different soils were obtained and validated by conducting a comparative study against published results. Good agreement between numerical and published experimental results was achieved. The contribution of the soil characteristics of the seabed to the dynamic behaviour of mooring line was investigated for different types of soil and reported.

  1. Incorporation of Passive Microwave Brightness Temperatures in the ECMWF Soil Moisture Analysis

    Directory of Open Access Journals (Sweden)

    Joaquín Muñoz-Sabater

    2015-05-01

    For more than a decade, the European Centre for Medium-Range Weather Forecasts (ECMWF) has used in-situ observations of 2 m temperature and 2 m relative humidity to operationally constrain the temporal evolution of model soil moisture. These observations are not available everywhere and are only indirectly linked to the state of the surface, so under various circumstances, such as weak radiative forcing or strong advection, they cannot be used as a proxy for soil moisture reinitialization in numerical weather prediction. Recently, the ECMWF soil moisture analysis has been updated to account for the information provided by microwave brightness temperatures from the Soil Moisture and Ocean Salinity (SMOS) mission of the European Space Agency (ESA). This is the first time that ECMWF has used direct information on soil emission from passive microwave data to globally adjust the soil moisture estimated by a land-surface model. This paper presents a novel version of the ECMWF Extended Kalman Filter soil moisture analysis that accounts for remotely sensed passive microwave data. It also discusses the advantages of assimilating direct satellite radiances compared to current soil moisture products, with a view to an operational implementation. A simple assimilation case study at global scale highlights the potential benefits and obstacles of using this new type of information in a global coupled land-atmosphere model.
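
    As a rough illustration of the kind of analysis step an Extended Kalman Filter performs, the sketch below updates a single soil moisture value from one brightness temperature observation. It is not the ECMWF implementation: the observation operator `toy_observation_operator`, the finite-difference Jacobian, and every number (background value, error variances, observed temperature) are hypothetical placeholders chosen only to make the example run.

```python
import numpy as np

def toy_observation_operator(soil_moisture):
    """Hypothetical stand-in for a radiative transfer model:
    wetter soil gives a lower brightness temperature (K)."""
    return 280.0 - 150.0 * soil_moisture + 60.0 * soil_moisture ** 2

def ekf_update(x_b, B, y, R, h, eps=1e-4):
    """One EKF analysis step: x_a = x_b + K (y - h(x_b))."""
    x_b = np.atleast_1d(np.asarray(x_b, dtype=float))
    y = np.atleast_1d(np.asarray(y, dtype=float))
    hx = np.atleast_1d(h(x_b))
    # Linearise the observation operator with one-sided finite differences.
    H = np.zeros((y.size, x_b.size))
    for j in range(x_b.size):
        dx = np.zeros_like(x_b)
        dx[j] = eps
        H[:, j] = (np.atleast_1d(h(x_b + dx)) - hx) / eps
    S = H @ B @ H.T + R                 # innovation covariance
    K = B @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_a = x_b + K @ (y - hx)            # analysis state
    A = (np.eye(x_b.size) - K @ H) @ B  # analysis error covariance
    return x_a, A

# Assumed values: background soil moisture 0.30 m3/m3 with 0.04 std dev,
# observed brightness temperature 236 K with 3 K std dev.
B = np.array([[0.04 ** 2]])
R = np.array([[3.0 ** 2]])
x_a, A = ekf_update([0.30], B, [236.0], R, toy_observation_operator)
```

    In this toy setting the observation is colder than the background equivalent, so the analysis moves towards wetter soil and the analysis error variance shrinks relative to the background.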

  2. Statistical analysis of surrogate signals to incorporate respiratory motion variability into radiotherapy treatment planning

    Science.gov (United States)

    Wilms, Matthias; Ehrhardt, Jan; Werner, René; Marx, Mirko; Handels, Heinz

    2014-03-01

    Respiratory motion and its variability lead to location uncertainties in radiation therapy (RT) of thoracic and abdominal tumors. Current approaches for motion compensation in RT are usually driven by respiratory surrogate signals, e.g., spirometry. In this contribution, we present an approach for statistical analysis, modeling and subsequent simulation of surrogate signals on a cycle-by-cycle basis. The simulated signals represent typical patient-specific variations of, e.g., breathing amplitude and cycle period. For the underlying statistical analysis, all breathing cycles of an observed signal are consistently parameterized using approximating B-spline curves. Statistics on breathing cycles are then performed by using the parameters of the B-spline approximations. Assuming that these parameters follow a multivariate Gaussian distribution, realistic time-continuous surrogate signals of arbitrary length can be generated and used to simulate the internal motion of tumors and organs based on a patient-specific diffeomorphic correspondence model. As an example, we show how this approach can be employed in RT treatment planning to calculate tumor appearance probabilities and to statistically assess the impact of respiratory motion and its variability on planned dose distributions.
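
    The statistical step described above (per-cycle parameters, a multivariate Gaussian fitted to them, then sampling of new cycles) can be sketched as follows. This is a simplified illustration, not the authors' method: the B-spline parameterization is replaced by two assumed per-cycle parameters (amplitude and period), each simulated breath is drawn as a raised cosine, and the `observed` values are invented.

```python
import numpy as np

def fit_gaussian(params):
    """Estimate the mean vector and covariance matrix of per-cycle parameters."""
    params = np.asarray(params, dtype=float)
    return params.mean(axis=0), np.cov(params, rowvar=False)

def sample_cycles(mean, cov, n_cycles, seed=0):
    """Draw new per-cycle parameter vectors from the fitted Gaussian."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n_cycles)

def synthesize_signal(cycles, samples_per_cycle=50):
    """Concatenate one raised-cosine breath per (amplitude, period) row."""
    times, values = [], []
    t0 = 0.0
    for amplitude, period in cycles:
        phase = np.linspace(0.0, 1.0, samples_per_cycle, endpoint=False)
        values.append(amplitude * 0.5 * (1.0 - np.cos(2.0 * np.pi * phase)))
        times.append(t0 + phase * period)
        t0 += period
    return np.concatenate(times), np.concatenate(values)

# Hypothetical observed cycles: amplitude (mm) and period (s).
observed = np.array([[10.2, 3.9], [11.0, 4.2], [9.5, 3.7], [10.8, 4.1], [9.9, 4.0]])
mean, cov = fit_gaussian(observed)
simulated = sample_cycles(mean, cov, n_cycles=20)
t, signal = synthesize_signal(simulated)
```

    Because the cycles are sampled independently, a signal of any length can be produced by increasing `n_cycles`.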

  3. Systems analysis of common cause events and the incorporation of experience data

    International Nuclear Information System (INIS)

    The objectives of this project are to effectively manage risk and to enhance system reliability performance with respect to dependent events. In support of these objectives, it is necessary to be able to quantify the reliability performance of current systems. There are two reasons for this. The first is that before the authors can decide whether it is necessary to do anything about many dependent events, they need to know what their impact is on system performance. The second is that priorities and resources to be allocated for risk management of dependent events are more objectively established when the risk benefit can be quantified, and this requires the quantification of system reliability characteristics. Therefore, a very important use of this research on common cause failure analysis is to support the treatment of dependent events in applied risk and reliability evaluation. The purpose of these notes is to illustrate how information can be extracted from a dependent events data base in support of system reliability analysis of the type performed in risk assessment studies

  4. Genome analysis of rice-blast fungus Magnaporthe oryzae field isolates from southern India

    Directory of Open Access Journals (Sweden)

    Malali Gowda

    2015-09-01

    The Indian subcontinent is the center of origin and diversity for rice (Oryza sativa L.). O. sativa ssp. indica is a major food crop grown in India, which occupies the first and second position in area and production, respectively. Blast disease caused by Magnaporthe oryzae is a major constraint to rice production. Here, we report the analysis of genome architecture and sequence variation of two field isolates, B157 and MG01, of the blast fungus from southern India. The 40 Mb genome of B157 and 43 Mb genome of MG01 contained 11,344 and 11,733 predicted genes, respectively. Genomic comparisons unveiled a large set of SNPs and several isolate-specific genes in the Indian blast isolates. Avr genes were analyzed in several sequenced Magnaporthe strains; this analysis revealed the presence of Avr-Pizt and Avr-Ace1 genes in all the sequenced isolates. Availability of whole genomes of field isolates from India will contribute to global efforts to understand the genetic diversity of the M. oryzae population and to track the emergence of virulent pathotypes.

  5. Integrated analysis of genome-wide genetic and epigenetic association data for identification of disease mechanisms.

    Science.gov (United States)

    Ke, Xiayi; Cortina-Borja, Mario; Silva, Bruno Cesar; Lowe, Robert; Rakyan, Vardhman; Balding, David

    2013-11-01

    Many human diseases are multifactorial, involving multiple genetic and environmental factors impacting on one or more biological pathways. Much of the environmental effect is believed to be mediated through epigenetic changes. Although many genome-wide genetic and epigenetic association studies have been conducted for different diseases and traits, it is still far from clear to what extent the genomic loci and biological pathways identified in the genetic and epigenetic studies are shared. There is also a lack of statistical tools to assess these important aspects of disease mechanisms. In the present study, we describe a protocol for the integrated analysis of genome-wide genetic and epigenetic data based on permutation of a sum statistic for the combined effects in a locus or pathway. The method was then applied to published type 1 diabetes (T1D) genome-wide- and epigenome-wide-association studies data to identify genomic loci and biological pathways that are associated with T1D genetically and epigenetically. Through combined analysis, novel loci and pathways were also identified, which could add to our understanding of disease mechanisms of T1D as well as complex diseases in general. PMID:24071862
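
    A permutation test on a sum statistic, as named in the abstract above, can be sketched as below. The per-marker statistic (squared Pearson correlation with case/control status), the simulated data and all function names are assumptions made for illustration; the published protocol may define the combined statistic differently.

```python
import numpy as np

def sum_statistic(features, labels):
    """Sum over markers of the squared correlation with case/control status."""
    x = features - features.mean(axis=0)
    y = labels - labels.mean()
    denom = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum())
    r = (x * y[:, None]).sum(axis=0) / denom
    return float((r ** 2).sum())

def permutation_p_value(features, labels, n_perm=1000, seed=0):
    """Empirical p-value from shuffling the labels across individuals."""
    rng = np.random.default_rng(seed)
    observed = sum_statistic(features, labels)
    hits = 0
    for _ in range(n_perm):
        if sum_statistic(features, rng.permutation(labels)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Simulated locus: 100 individuals, 5 markers (for example 3 genetic and
# 2 epigenetic measurements); only the first marker differs in cases.
rng = np.random.default_rng(42)
labels = np.repeat([0.0, 1.0], 50)
features = rng.normal(size=(100, 5))
features[labels == 1.0, 0] += 1.5
observed, p_value = permutation_p_value(features, labels)
```

    Shuffling the labels, rather than the markers, preserves the correlation between markers within the locus, which is one reason permutation is commonly used for combined statistics.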

  6. Genome-wide analysis of the WRKY transcription factors in Aegilops tauschii.

    Science.gov (United States)

    Ma, Jianhui; Zhang, Daijing; Shao, Yun; Liu, Pei; Jiang, Lina; Li, Chunxi

    2014-01-01

    The WRKY transcription factors (TFs) play important roles in responding to abiotic and biotic stress in plants. However, because the wheat genome has not been fully sequenced, relatively few WRKY TFs with full-length coding sequences (CDSs) have been identified in wheat. Instead, the Aegilops tauschii genome, which is the D-genome progenitor of the hexaploid wheat genome, provides important resources for the discovery of new genes. In this study, we performed a bioinformatics analysis to identify WRKY TFs with full-length CDSs from the A. tauschii genome. A detailed evolutionary analysis for all these TFs was conducted, and quantitative real-time PCR was carried out to investigate the expression patterns of the abiotic stress-related WRKY TFs under different abiotic stress conditions in A. tauschii seedlings. A total of 93 WRKY TFs were identified from A. tauschii, and 79 of them were found to be newly discovered genes compared with wheat. Gene phylogeny, gene structure and chromosome location of the 93 WRKY TFs were fully analyzed. These studies provide a global view of the WRKY TFs from A. tauschii and a firm foundation for further investigations in both A. tauschii and wheat. PMID:25592959

  7. Genomic Analysis of Pathogenicity Determinants in Mycobacterium kansasii Type I

    KAUST Repository

    Guan, Qingtian

    2016-05-01

    Mycobacteria, a genus within the phylum Actinobacteria, are well known for two pathogens that cause human diseases: leprosy and tuberculosis. Other than the obligate human mycobacteria, there is a group of bacteria that are present in the environment and occasionally cause diseases in immunocompromised persons: the non-tuberculous mycobacteria (NTM). Mycobacterium kansasii, which was first discovered in the state of Kansas, is the main etiologic agent responsible for lung infections caused by NTM and has drawn attention because of its co-infection with human immunodeficiency virus (HIV). Five subspecies of M. kansasii (Type I-V) have been described, and only M. kansasii Type I is pathogenic to humans. M. kansasii is a Gram-positive bacterium that has a unique cell wall and secretion system, which is essential for its pathogenicity. We undertook a comparative genomics and transcriptomic approach to identify components of M. kansasii Type I pathogenicity. Our previous study showed that the espA (ESX-1 essential protein) operon, a major component of the secretion system, is exclusively present in M. kansasii Type I. The purpose of this study was to test the functional role of the espA operon in pathogenicity and to identify other components that may also be involved in pathogenicity. This study provides a new molecular diagnostic method for M. kansasii Type I infection that uses the polymerase chain reaction (PCR) technique to target the espA operon. With detailed manual curation of the comparative genomics datasets, we found several genes exclusively present in M. kansasii Type I, including ppsA/ppsC and whiB6, that we believe are involved in, or have an effect on, the ESX-mediated secretion system. We have also highlighted the differences in genetic components coding for the cell membrane composition between the five subspecies of M. kansasii. These results shed light on genetic components that are responsible for pathogenicity determinants in Type I M. kansasii and may help to design better

  8. Multi-scale finite element analysis of chloride diffusion in concrete incorporating paste/aggregate ITZs

    Science.gov (United States)

    Guo, Li; Guo, XiaoMing; Mi, ChangWen

    2012-09-01

    In this paper, we propose a concurrent multi-scale finite element (FE) model that couples the equations for the degrees of freedom of the meso-scale model of ITZs and the macroscopic model of bulk pastes. The multi-scale model is subsequently implemented and integrated into ABAQUS, which allows easy application to complex concrete structures. A few benchmark numerical examples are performed to test both the accuracy and the efficiency of the developed model in analyzing chloride diffusion in concrete. These examples clearly demonstrate that the high diffusivity of ITZs, primarily a result of their porous microstructure, tends to accelerate chloride penetration along the concentration gradient. The proposed model provides new guidelines for the durability analysis of concrete structures under adverse operating conditions.

  9. Incorporating risk analysis in the economic evaluation of a typical western Canadian horizontal well project

    International Nuclear Information System (INIS)

    The determination of profitability indicators for a horizontal well drilled in a developed field depends on several parameters in which there are uncertainties. A risk analysis approach is presented for handling these uncertainties, considering a typical light-medium carbonate pool. Parameters with the greatest uncertainties are drainage area, net pay, saturations, porosity, horizontal and vertical permeabilities, reservoir pressure, in-situ viscosity, effective horizontal well length, capital and operating costs, and crude oil and product prices. Parameters are grouped into three general categories: oil-in-place calculation, horizontal well productivity, and economic parameters. A stochastic approach is used to establish a distribution for the profitability indicators. 8 refs., 15 figs., 2 tabs
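
    The stochastic approach described above can be illustrated with a small Monte Carlo sketch that follows the abstract's three parameter groups. Everything specific here is an assumption: the triangular distributions, their ranges, the fixed formation volume factor, the use of a recovery factor as a crude stand-in for horizontal well productivity, and the undiscounted net value used as the profitability indicator. Only the volumetric oil-in-place relation (area x net pay x porosity x oil saturation / formation volume factor) is standard.

```python
import numpy as np

def simulate_net_value(n_trials=10_000, seed=0):
    """Return one undiscounted net value (currency units) per random trial."""
    rng = np.random.default_rng(seed)
    # Group 1: oil-in-place inputs (placeholder ranges).
    area_m2 = rng.triangular(4.0e5, 6.0e5, 9.0e5, n_trials)
    net_pay_m = rng.triangular(3.0, 5.0, 8.0, n_trials)
    porosity = rng.triangular(0.08, 0.12, 0.16, n_trials)
    water_saturation = rng.triangular(0.20, 0.30, 0.45, n_trials)
    formation_volume_factor = 1.2  # assumed constant
    oil_in_place_m3 = (area_m2 * net_pay_m * porosity
                       * (1.0 - water_saturation) / formation_volume_factor)
    # Group 2: productivity, reduced here to a sampled recovery factor.
    recovery_factor = rng.triangular(0.05, 0.10, 0.18, n_trials)
    # Group 3: economic inputs (netback per m3 and capital cost).
    netback_per_m3 = rng.triangular(60.0, 100.0, 150.0, n_trials)
    capital_cost = rng.triangular(0.8e6, 1.0e6, 1.5e6, n_trials)
    return oil_in_place_m3 * recovery_factor * netback_per_m3 - capital_cost

values = simulate_net_value()
p10, p50, p90 = np.percentile(values, [10, 50, 90])
probability_of_loss = float((values < 0.0).mean())
```

    Reporting percentiles and the probability of a loss, rather than a single deterministic figure, is what turns the uncertain inputs into a distribution for the indicator.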

  10. Incorporating natural capital into economy-wide impact analysis: a case study from Alberta.

    Science.gov (United States)

    Patriquin, Mike N; Alavalapati, Janaki R R; Adamowicz, Wiktor L; White, William A

    2003-01-01

    Traditionally, decision-makers have relied on economic impact estimates derived from conventional economy-wide models. Conventional models lack the environmental linkages necessary for examining environmental stewardship and economic sustainability, and in particular the ability to assess the impact of policies on natural capital. This study investigates environmentally extended economic impact estimation on a regional scale using a case study region in the province of Alberta known as the Foothills Model Forest (FMF). Conventional economic impact models are environmentally extended in pursuit of enhancing policy analysis and local decision-making. It is found that the flexibility of the computable general equilibrium (CGE) modeling approach offers potential for environmental extension, with a solid grounding in economic theory. The CGE approach may be the tool of the future for more complete integrated environment and economic impact assessment. PMID:12859004

  11. Genome-wide meta-analysis identifies multiple novel associations and ethnic heterogeneity of psoriasis susceptibility

    NARCIS (Netherlands)

    Yin, Xianyong; Low, Hui Qi; Wang, Ling; Li, Yonghong; Ellinghaus, Eva; Han, Jiali; Estivill, Xavier; Sun, Liangdan; Zuo, Xianbo; Shen, Changbing; Zhu, Caihong; Zhang, Anping; Sanchez, Fabio; Padyukov, Leonid; Catanese, Joseph J; Krueger, Gerald G; Duffin, Kristina Callis; Mucha, Sören; Weichenthal, Michael; Weidinger, Stephan; Lieb, Wolfgang; Foo, Jia Nee; Li, Yi; Sim, Karseng; Liany, Herty; Irwan, Ishak; Teo, Yikying; Theng, Colin T S; Gupta, Rashmi; Bowcock, Anne; De Jager, Philip L; Qureshi, Abrar A; de Bakker, Paul I W; Seielstad, Mark; Liao, Wilson; Ståhle, Mona; Franke, Andre; Zhang, Xuejun; Liu, Jianjun

    2015-01-01

    Psoriasis is a common inflammatory skin disease with complex genetics and different degrees of prevalence across ethnic populations. Here we present the largest trans-ethnic genome-wide meta-analysis (GWMA) of psoriasis in 15,369 cases and 19,517 controls of Caucasian and Chinese ancestries. We iden

  12. Genome-wide association analysis in primary sclerosing cholangitis identifies two non-HLA susceptibility loci

    NARCIS (Netherlands)

    E. Melum; A. Franke; C. Schramm; T.J. Weismüller; D.N. Gotthardt; F.A. Offner; B.D. Juran; J.K. Laerdahl; V. Labi; E. Björnsson; R.K. Weersma; L. Henckaerts; A. Teufel; C. Rust; E. Ellinghaus; T. Balschun; K.M. Boberg; D. Ellinghaus; A. Bergquist; P. Sauer; E. Ryu; J.R. Hov; J. Wedemeyer; B. Lindkvist; M. Wittig; R.J. Porte; K. Holm; C. Gieger; H.E. Wichmann; P. Stokkers; C.Y. Ponsioen; H. Runz; A. Stiehl; C. Wijmenga; M. Sterneck; S. Vermeire; U. Beuers; A. Villunger; E. Schrumpf; K.N. Lazaridis; M.P. Manns; S. Schreiber; T.H. Karlsen

    2011-01-01

    Primary sclerosing cholangitis (PSC) is a chronic bile duct disease affecting 2.4-7.5% of individuals with inflammatory bowel disease. We performed a genome-wide association analysis of 2,466,182 SNPs in 715 individuals with PSC and 2,962 controls, followed by replication in 1,025 PSC cases and 2,17

  13. Genome-wide meta-analysis identifies six novel loci associated with habitual coffee consumption

    NARCIS (Netherlands)

    Cornelis, M. C.; Byrne, E. M.; Esko, T.; Nalls, M. A.; Ganna, A.; Paynter, N.; Monda, K. L.; Amin, N.; Fischer, K.; Renstrom, F.; Ngwa, J. S.; Huikari, V.; Cavadino, A.; Nolte, I. M.; Teumer, A.; Yu, K.; Marques-Vidal, P.; Rawal, R.; Manichaikul, A.; Wojczynski, M. K.; Vink, J. M.; Zhao, J. H.; Burlutsky, G.; Lahti, J.; Mikkila, V.; Lemaitre, R. N.; Eriksson, J.; Musani, S. K.; Tanaka, T.; Geller, F.; Luan, J.; Hui, J.; Maegi, R.; Dimitriou, M.; Garcia, M. E.; Ho, W-K; Wright, M. J.; Rose, L. M.; Magnusson, P. K. E.; Pedersen, N. L.; Couper, D.; Oostra, B. A.; Hofman, A.; Ikram, M. A.; Tiemeier, H. W.; Uitterlinden, A. G.; van Rooij, F. J. A.; Barroso, I.; Johansson, I.; Xue, L.; Kaakinen, M.; Milani, L.; Power, C.; Snieder, H.; Stolk, R. P.; Baumeister, S. E.; Biffar, R.; Gu, F.; Bastardot, F.; Kutalik, Z.; Jacobs, D. R.; Forouhi, N. G.; Mihailov, E.; Lind, L.; Lindgren, C.; Michaelsson, K.; Morris, A.; Jensen, M.; Khaw, K-T; Luben, R. N.; Wang, J. J.; Mannisto, S.; Perala, M-M; Kahonen, M.; Lehtimaki, T.; Viikari, J.; Mozaffarian, D.; Mukamal, K.; Psaty, B. M.; Doering, A.; Heath, A. C.; Montgomery, G. W.; Dahmen, N.; Carithers, T.; Tucker, K. L.; Ferrucci, L.; Boyd, H. A.; Melbye, M.; Treur, J. L.; Mellstrom, D.; Hottenga, J. J.; Prokopenko, I.; Toenjes, A.; Deloukas, P.; Kanoni, S.; Lorentzon, M.; Houston, D. K.; Liu, Y.; Danesh, J.; Rasheed, A.; Mason, M. A.; Zonderman, A. B.; Franke, L.; Kristal, B. S.; Karjalainen, J.; Reed, D. R.; Westra, H-J; Evans, M. K.; Saleheen, D.; Harris, T. B.; Dedoussis, G.; Curhan, G.; Stumvoll, M.; Beilby, J.; Pasquale, L. R.; Feenstra, B.; Bandinelli, S.; Ordovas, J. M.; Chan, A. T.; Peters, U.; Ohlsson, C.; Gieger, C.; Martin, N. G.; Waldenberger, M.; Siscovick, D. S.; Raitakari, O.; Eriksson, J. G.; Mitchell, P.; Hunter, D. J.; Kraft, P.; Rimm, E. B.; Boomsma, D. I.; Borecki, I. B.; Loos, R. J. F.; Wareham, N. J.; Vollenweider, P.; Caporaso, N.; Grabe, H. J.; Neuhouser, M. L.; Wolffenbuttel, B. H. R.; Hu, F. 
B.; Hyppoenen, E.; Jarvelin, M-R; Cupples, L. A.; Franks, P. W.; Ridker, P. M.; van Duijn, C. M.; Heiss, G.; Metspalu, A.; North, K. E.; Ingelsson, E.; Nettleton, J. A.; van Dam, R. M.; Chasman, D. I.

    2015-01-01

    Coffee, a major dietary source of caffeine, is among the most widely consumed beverages in the world and has received considerable attention regarding health risks and benefits. We conducted a genome-wide (GW) meta-analysis of predominately regular-type coffee consumption (cups per day) among up to

  14. Genome-wide meta-analysis identifies six novel loci associated with habitual coffee consumption

    NARCIS (Netherlands)

    M. Cornelis (Marilyn); E.M. Byrne; T. Esko (Tõnu); M.A. Nalls (Michael); A. Ganna (Andrea); N.P. Paynter (Nina); K.L. Monda (Keri); N. Amin; K. Fischer (Krista); F. Renström (Frida); J.S. Ngwa; V. Huikari (Ville); A. Cavadino (Alana); I.M. Nolte (Ilja M.); A. Teumer (Alexander); K. Yu; P. Marques-Vidal; R. Rawal; A. Manichaikul (Ani); M.K. Wojczynski (Mary ); J.M. Vink; J.H. Zhao; G. Burlutsky (George); J. Lahti (Jari); V. Mikkilä (Vera); R.N. Lemaitre (Rozenn ); J. Eriksson; S. Musani (Solomon); T. Tanaka; F. Geller (Frank); J. Luan; J. Hui; R. Mägi (Reedik); M. Dimitriou (Maria); M. Garcia (Melissa); W.-K. Ho; M.J. Wright (Margaret); L.M. Rose (Lynda M.); P.K.E. Magnusson (Patrik K. E.); N.L. Pedersen (Nancy L.); D.J. Couper (David); B.A. Oostra (Ben); A. Hofman (Albert); M.A. Ikram (Arfan); H.W. Tiemeier (Henning); A.G. Uitterlinden (André); F.J.A. van Rooij (Frank); I. Barroso; I. Johansson (Ingegerd); L. Xue (Luting); M. Kaakinen (Marika); L. Milani (Lili); C. Power (Christine); H. Snieder (Harold); R.P. Stolk; S.E. Baumeister (Sebastian); R. Biffar; F. Gu; F. Bastardot (Francois); Z. Kutalik; D.R. Jacobs (David); N.G. Forouhi (Nita G.); E. Mihailov (Evelin); L. Lind (Lars); C. Lindgren; K. Michaëlsson; A.P. Morris (Andrew); M.K. Jensen (Majken K.); K.T. Khaw; R.N. Luben (Robert); J.J. Wang; S. Männistö (Satu); M.-M. Perälä; M. Kähönen (Mika); T. Lehtimäki (Terho); J. Viikari (Jorma); D. Mozaffarian; K. Mukamal (Kenneth); B.M. Psaty (Bruce); A. Döring; A.C. Heath (Andrew C.); G.W. Montgomery (Grant W.); N. Dahmen (N.); T. Carithers; K.L. Tucker; L. Ferrucci (Luigi); H.A. Boyd; M. Melbye (Mads); J.L. Treur; D. Mellström (Dan); J.J. Hottenga (Jouke Jan); I. Prokopenko (Inga); A. Tönjes (Anke); P. Deloukas (Panagiotis); S. Kanoni (Stavroula); M. Lorentzon (Mattias); D.K. Houston; Y. Liu; J. Danesh (John); A. Rasheed; M.A. Mason; A.B. Zonderman; L. Franke (Lude); B.S. Kristal; J. Karjalainen (Juha); D.R. Reed; H.-J. Westra; M.K. Evans; D. Saleheen; T.B. 
Harris (Tamara B.); G.V. Dedoussis (George V.); G.C. Curhan (Gary); M. Stumvoll (Michael); J. Beilby (John); L.R. Pasquale; B. Feenstra; S. Bandinelli; J.M. Ordovas; A.T. Chan; U. Peters (Ulrike); C. Ohlsson (Claes); C. Gieger (Christian); N.G. Martin (Nicholas); M. Waldenberger (Melanie); D.S. Siscovick (David); O. Raitakari (Olli); J.G. Eriksson (Johan G.); P. Mitchell (Paul); D. Hunter (David); P. Kraft (Peter); E.B. Rimm (Eric B.); D.I. Boomsma (Dorret); I.B. Borecki (Ingrid); R.J.F. Loos (Ruth J.F.); N.J. Wareham (Nick); P.K. Vollenweider (Peter K.); N. Caporaso; H.J. Grabe (Hans Jörgen); M.L. Neuhouser (Marian L.); B.H.R. Wolffenbuttel (Bruce H. R.); F.B. Hu (Frank); E. Hypponen (Elina); M.-R. Jarvelin (Marjo-Riitta); L.A. Cupples (Adrienne); P.W. Franks; P.M. Ridker (Paul); C.M. Van Duijn (Cornelia M.); G. Heiss (Gerardo); A. Metspalu (Andres); K.E. North (Kari); E. Ingelsson (Erik); J.A. Nettleton; R.M. van Dam (Rob); D.I. Chasman (Daniel)

    2015-01-01

    Coffee, a major dietary source of caffeine, is among the most widely consumed beverages in the world and has received considerable attention regarding health risks and benefits. We conducted a genome-wide (GW) meta-analysis of predominately regular-type coffee consumption (cups per day)

  15. MultiMetEval : Comparative and Multi-Objective Analysis of Genome-Scale Metabolic Models

    NARCIS (Netherlands)

    Zakrzewski, Piotr; Medema, Marnix H.; Gevorgyan, Albert; Kierzek, Andrzej M.; Breitling, Rainer; Takano, Eriko; Fong, Stephen S.

    2012-01-01

    Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the co

  16. Genome-wide association analysis identifies three new susceptibility loci for childhood body mass index

    DEFF Research Database (Denmark)

    Felix, Janine F; Bradfield, Jonathan P; Monnereau, Claire;

    2016-01-01

    A large number of genetic loci are associated with adult body mass index. However, the genetics of childhood body mass index are largely unknown. We performed a meta-analysis of genome-wide association studies of childhood body mass index, using sex- and age-adjusted standard deviation scores. We...

  17. Genome Sequence of Babesia bovis and Comparative Analysis of Apicomplexan Hemoprotozoa

    Science.gov (United States)

    Babesia bovis is an apicomplexan tick-transmitted pathogen of cattle imposing a global risk and severe constraints to livestock health and economic development. The complete genome sequence was undertaken to facilitate vaccine antigen discovery, and to allow for comparative analysis with the related...

  18. Application of Genome-Wide Expression Analysis To Identify Molecular Markers Useful in Monitoring Industrial Fermentations

    OpenAIRE

    Higgins, Vincent J.; Rogers, Peter J.; Dawes, Ian W.

    2003-01-01

    Genome-wide expression analysis of an industrial strain of Saccharomyces cerevisiae identified the YOR387c and YGL258w homologues as highly inducible in zinc-depleted conditions. Induction was specific for zinc deficiency and was dependent on Zap1p. The results indicate that these sequences may be valuable molecular markers for detecting zinc deficiency in industrial fermentations.

  19. Genome-wide association analysis identifies six new loci associated with forced vital capacity

    NARCIS (Netherlands)

    D.W. Loth (Daan); M.S. Artigas; S.A. Gharib (Sina); L.V. Wain (Louise); N. Franceschini (Nora); B. Koch (Beate); T.D. Pottinger (Tess); G.D. Smith; Q. Duan (Qing); C. Oldmeadow (Christopher); M.K. Lee (Mi Kyeong); D.P. Strachan (David); A.L. James (Alan); J.E. Huffman (Jennifer); V. Vitart (Veronique); A. Ramasamy (Adaikalavan); N.J. Wareham (Nick); J. Kaprio (Jaakko); X.-Q. Wang (Xin-Qun); H. Trochet (Holly); M. Kähönen (Mika); C. Flexeder (Claudia); E. Albrecht (Eva); L.M. Lopez (Lorna); B. Thyagarajan (Bharat); A.C. Alves (Alexessander Couto); S. Enroth (Stefan); E. Omenaas (Ernst); P.K. Joshi (Peter); M. Fall (Magnus); A. Viñuela (Ana); L.J. Launer (Lenore); L.R. Loehr (Laura); M. Fornage (Myriam); G. Li (Guo); J.B. Wilk (Jemma); W. Tang (Wenbo); A. Manichaikul (Ani); L. Lahousse (Lies); T.B. Harris (Tamara); K.E. North (Kari); A.R. Rudnicka (Alicja); J. Hui (Jennie); X. Gu (Xiangjun); T. Lumley (Thomas); A.F. Wright (Alan); N. Hastie (Nick); S. Campbell (Susan); R. Kumar (Rajesh); I. Pin (Isabelle); R.A. Scott (Robert); K.H. Pietilainen (Kirsi Hannele); I. Surakka (Ida); Y. Liu (Yongmei); E.G. Holliday (Elizabeth); H. Schulz (Holger); J. Heinrich (Joachim); G. Davies (Gail); J.M. Vonk (Judith); M.K. Wojczynski (Mary ); A. Pouta (Anneli); A. Johansson (Åsa); S.H. Wild (Sarah); E. Ingelsson (Erik); F. Rivadeneira Ramirez (Fernando); H. Völzke (Henry); P.G. Hysi (Pirro); G. Eiriksdottir (Gudny); A.C. Morrison (Alanna); J.I. Rotter (Jerome); W. Gao (Wei); D.S. Postma (Dirkje); W.B. White (Wendy); S.S. Rich (Stephen); A. Hofman (Albert); T. Aspelund (Thor); D. Couper (David); L.J. Smith (Lewis); B.M. Psaty (Bruce); K. Lohman (Kurt); E.G. Burchard (Esteban); A.G. Uitterlinden (André); M. Garcia (Melissa); B.R. Joubert (Bonnie); W.L. McArdle (Wendy); A.W. Musk (Arthur); C.R.W. Hansel (Christian); S.R. Heckbert (Susan); L. Zgaga (Lina); J.B.J. van Meurs (Joyce); P. Navarro (Pau); I. Rudan (Igor); Y.-M. Oh (Yeon-Mok); S. Redline (Susan); D.L. Jarvis (Deborah); J.H. 
Zhao (Jing); T. Rantanen (Taina); G.T. O'Connor (George); S. Ripatti (Samuli); R.J. Scott (Rodney); S. Karrasch (Stefan); H. Grallert (Harald); N.C. Gaddis (Nathan); J.M. Starr (John); C. Wijmenga (Cisca); R.L. Minster (Ryan); C.W. Lederer (Carsten); J. Pekkanen (Juha); U. Gyllensten (Ulf); H. Campbell (Harry); A.P. Morris (Andrew); S. Gläser (Sven); C.J. Hammond (Christopher); K.M. Burkart (Kristin); J.P. Beilby (John); S.B. Kritchevsky (Stephen); V. Gudnason (Vilmundur); D.B. Hancock (Dana); O.D. Williams (Dale); O. Polasek (Ozren); T. Zemunik (Tatijana); I. Kolcic (Ivana); M.F. Petrini (Marcy); K.T. de Jong (Kim); M. Wjst (Matthias); W.H. Kim (Woo); D.J. Porteous (David J.); G. Scotland (Generation); B.H. Smith (Blair); A. Viljanen (Anne); M. Heliovaara (Markku); J. Attia (John); I. Sayers (Ian); R. Hampel (Regina); C. Gieger (Christian); I.J. Deary (Ian); H.M. Boezen (H. Marike); A.B. Newman (Anne); M.-R. Jarvelin (Marjo-Riitta); J.F. Wilson (James); L. Lind (Lars); B.H.Ch. Stricker (Bruno); A. Teumer (Alexander); T.D. Spector (Timothy); E. Melén (Erik); M.J. Peters (Marjolein); L.A. Lange (Leslie); R.G. Barr (Graham); K.R. Bracke (Ken); F.M. Verhamme (Fien); J. Sung (Joohon); P.S. Hiemstra (Pieter); P.A. Cassano (Patricia); A. Sood (Akshay); C. Hayward (Caroline); J. Dupuis (Josée); I.P. Hall (Ian); G.G. Brusselle (Guy); M.D. Tobin (Martin); S.J. London (Stephanie)

    2014-01-01

    Forced vital capacity (FVC), a spirometric measure of pulmonary function, reflects lung volume and is used to diagnose and monitor lung diseases. We performed genome-wide association study meta-analysis of FVC in 52,253 individuals from 26 studies and followed up the top associations in

  20. Genome-wide association analysis identifies six new loci associated with forced vital capacity

    NARCIS (Netherlands)

    Loth, Daan W.; Artigas, Maria Soler; Gharib, Sina A.; Wain, Louise V.; Franceschini, Nora; Koch, Beate; Pottinger, Tess D.; Smith, Albert Vernon; Duan, Qing; Oldmeadow, Chris; Lee, Mi Kyeong; Strachan, David P.; James, Alan L.; Huffman, Jennifer E.; Vitart, Veronique; Ramasamy, Adaikalavan; Wareham, Nicholas J.; Kaprio, Jaakko; Wang, Xin-Qun; Trochet, Holly; Kaonen, Mika; Flexeder, Claudia; Albrecht, Eva; Lopez, Lorna M.; de Jong, Kim; Thyagarajan, Bharat; Alves, Alexessander Couto; Enroth, Stefan; Omenaas, Ernst; Joshi, Peter K.; Fall, Tove; Vinuela, Ana; Launer, Lenore J.; Loehr, Laura R.; Fornage, Myriam; Li, Guo; Wik, Jemma B.; Tang, Wenbo; Manichaikul, Ani; Lahousse, Lies; Harris, Tamara B.; North, Kari E.; Rudnicka, Alicja R.; Hui, Jennie; Gu, Xiangjun; Lumley, Thomas; Wright, Alan F.; Hastie, Nicholas D.; Campbell, Susan; Kumar, Rajesh; Pin, Isabelle; Scott, Robert A.; Pietilainen, Kirsi H.; Surakka, Ida; Liu, Yongmei; Holliday, Elizabeth G.; Schulz, Holger; Heinrich, Joachim; Davies, Gail; Vonk, Judith M.; Wojczynski, Mary; Pouta, Anneli; Johansson, Asa; Wild, Sarah H.; Ingelsson, Erik; Rivadeneira, Fernando; Voezke, Henry; Hysi, Pirro G.; Eiriksdottir, Gudny; Morrison, Alanna C.; Rotter, Jerome I.; Gao, Wei; Postma, Dirkje S.; White, Wendy B.; Rich, Stephen S.; Hofman, Albert; Aspelund, Thor; Couper, David; Smith, Lewis J.; Psaty, Bruce M.; Lohman, Kurt; Burchard, Esteban G.; Uitterlinden, Andre G.; Garcia, Melissa; Joubert, Bonnie R.; McArdle, Wendy L.; Musk, A. Bill; Hansel, Nadia; Heckbert, Susan R.; Zgaga, Lina; van Meurs, Joyce B. 
J.; Navarro, Pau; Rudan, Igor; Oh, Yeon-Mok; Redline, Susan; Jarvis, Deborah L.; Rantanen, Taina; O'Connor, George T.; Ripatti, Samuli; Scott, Rodney J.; Karrasch, Stefan; Grallert, Harald; Gaddis, Nathan C.; Starr, John M.; Wijmenga, Cisca; Minster, Ryan L.; Lederer, David J.; Pekkanen, Juha; Gyllensten, Ulf; Campbe, Harry; Morris, Andrew P.; Glaeser, Sven; Hammond, Christopher J.; Burkart, Kristin M.; Beilby, John; Kritchevsky, Stephen B.; Gucinason, Vilrnundur; Hancock, Dana B.; Williams, Dale; Polasek, Ozren; Zemunik, Tatijana; Kolcic, Ivana; Petrini, Marcy F.; Wjst, Matthias; Kim, Woo Jin; Porteous, David J.; Scotland, Generation; Smith, Blair H.; Villanen, Anne; Heliovaara, Markku; Attia, John R.; Sayers, Ian; Hampel, Regina; Gieger, Christian; Deary, Ian J.; Boezen, Hendrika; Newman, Anne; Jarvelin, Marjo-Riitta; Wilson, James F.; Lind, Lars; Stricker, Bruno H.; Teumer, Alexander; Spector, Timothy D.; Melen, Erik; Peters, Marjolein J.; Lange, Leslie A.; Barr, R. Graham; Bracke, Ken R.; Verhamme, Fien M.; Sung, Joohon; Hiemstra, Pieter S.; Cassano, Patricia A.; Sood, Akshay; Hayward, Caroline; Dupuis, Josee; Hall, Ian P.; Brusselle, Guy G.; Tobin, Martin D.; London, Stephanie J.

    2014-01-01

    Forced vital capacity (FVC), a spirometric measure of pulmonary function, reflects lung volume and is used to diagnose and monitor lung diseases. We performed genome-wide association study meta-analysis of FVC in 52,253 individuals from 26 studies and followed up the top associations in 32,917 addit

  1. Comparative genomic analysis of two-component regulatory proteins in Pseudomonas syringae

    DEFF Research Database (Denmark)

    Lavin, J.L.; Kiil, Kristoffer; Resano, O.;

    2007-01-01

    1448A were found to contain a large number of genes encoding TCS proteins, and a core of complete TCS proteins were shared between these genomes: 30 putative TCS clusters, 11 orphan HKs, 33 orphan RRs, and 16 hybrid HKs. A close analysis of the distribution of genes encoding TCS proteins revealed...

  2. Genome-wide association scan meta-analysis identifies three loci influencing adiposity and fat distribution

    NARCIS (Netherlands)

    C.M. Lindgren (Cecilia); I.M. Heid (Iris); J.C. Randall (Joshua); C. Lamina (Claudia); V. Steinthorsdottir (Valgerdur); L. Qi (Lu); E.K. Speliotes (Elizabeth); G. Thorleifsson (Gudmar); C.J. Willer (Cristen); B.M. Herrera (Blanca); A.U. Jackson (Anne); N. Lim (Noha); P. Scheet (Paul); N. Soranzo (Nicole); N. Amin (Najaf); Y.S. Aulchenko (Yurii); J.C. Chambers (John); A. Drong (Alexander); J. Luan; H.N. Lyon (Helen); F. Rivadeneira Ramirez (Fernando); S. Sanna (Serena); N. Timpson (Nicholas); M.C. Zillikens (Carola); H.Z. Jing; P. Almgren (Peter); S. Bandinelli (Stefania); A.J. Bennett (Amanda); R.N. Bergman (Richard); L.L. Bonnycastle (Lori); S. Bumpstead (Suzannah); S.J. Chanock (Stephen); L. Cherkas (Lynn); P.S. Chines (Peter); L. Coin (Lachlan); C. Cooper (Charles); G. Crawford (Gabe); A. Doering (Angela); A. Dominiczak (Anna); A.S.F. Doney (Alex); S. Ebrahim (Shanil); P. Elliott (Paul); M.R. Erdos (Michael); K. Estrada Gil (Karol); L. Ferrucci (Luigi); G. Fischer (Guido); N.G. Forouhi (Nita); C. Gieger (Christian); H. Grallert (Harald); C.J. Groves (Christopher); S.M. Grundy (Scott); C. Guiducci (Candace); D. Hadley (David); A. Hamsten (Anders); A.S. Havulinna (Aki); A. Hofman (Albert); R. Holle (Rolf); J.W. Holloway (John); T. Illig (Thomas); B. Isomaa (Bo); L.C. Jacobs (Leonie); K. Jameson (Karen); P. Jousilahti (Pekka); F. Karpe (Fredrik); J. Kuusisto (Johanna); J. Laitinen (Jaana); G.M. Lathrop (Mark); D.A. Lawlor (Debbie); M. Mangino (Massimo); W.L. McArdle (Wendy); T. Meitinger (Thomas); M.A. Morken (Mario); A.P. Morris (Andrew); P. Munroe (Patricia); N. Narisu (Narisu); A. Nordström (Anna); B.A. Oostra (Ben); C.N.A. Palmer (Colin); F. Payne (Felicity); J. Peden (John); I. Prokopenko (Inga); F. Renström (Frida); A. Ruokonen (Aimo); V. Salomaa (Veikko); M.S. Sandhu (Manjinder); L.J. Scott (Laura); A. Scuteri (Angelo); K. Silander (Kaisa); K. Song (Kijoung); X. Yuan (Xin); H.M. Stringham (Heather); A.J. Swift (Amy); T. Tuomi (Tiinamaija); M. 
Uda (Manuela); P. Vollenweider (Peter); G. Waeber (Gérard); C. Wallace (Chris); G.B. Walters (Bragi); M.N. Weedon (Michael); J.C.M. Witteman (Jacqueline); C. Zhang (Cuilin); M. Caulfield (Mark); F.S. Collins (Francis); G.D. Smith; I.N.M. Day (Ian); P.W. Franks (Paul); A.T. Hattersley (Andrew); F.B. Hu (Frank); M.R. Jarvelin; A. Kong (Augustine); J.S. Kooner (Jaspal); M. Laakso (Markku); E. Lakatta (Edward); V. Mooser (Vincent); L. Peltonen (Leena Johanna); N.J. Samani (Nilesh); T.D. Spector (Timothy); D.P. Strachan (David); T. Tanaka (Toshiko); J. Tuomilehto (Jaakko); A.G. Uitterlinden (André); P. Tikka-Kleemola (Päivi); N.J. Wareham (Nick); H. Watkins (Hugh); D. Waterworth (Dawn); M. Boehnke (Michael); P. Deloukas (Panagiotis); L. Groop (Leif); D.J. Hunter (David); U. Thorsteinsdottir (Unnur); D. Schlessinger (David); H.E. Wichmann (Erich); T.M. Frayling (Timothy); G.R. Abecasis (Gonçalo); J.N. Hirschhorn (Joel); R.J.F. Loos (Ruth); J-A. Zwart (John-Anker); K.L. Mohlke (Karen); I. Barroso (Inês); M.I. McCarthy (Mark)

    2009-01-01

    To identify genetic loci influencing central obesity and fat distribution, we performed a meta-analysis of 16 genome-wide association studies (GWAS, N = 38,580) informative for adult waist circumference (WC) and waist-hip ratio (WHR). We selected 26 SNPs for follow-up, for which the evid

  3. Comparative genome analysis and resistance gene mapping in grain legumes

    International Nuclear Information System (INIS)

    Using DNA markers and genome organization, several important disease resistance genes have been analyzed in mungbean (Vigna radiata), cowpea (Vigna unguiculata), common bean (Phaseolus vulgaris), and soybean (Glycine max). In the process, medium-density linkage maps consisting of restriction fragment length polymorphism (RFLP) markers were constructed for both mungbean and cowpea. Comparisons between these maps, as well as the maps of soybean and common bean, indicate that there is significant conservation of DNA marker order, though the conserved blocks in soybean are much shorter than in the others. DNA mapping results also indicate that a gene for seed weight may be conserved between mungbean and cowpea. Using the linkage maps, genes that control bruchid (genus Callosobruchus) and powdery mildew (Erysiphe polygoni) resistance in mungbean, aphid (Aphis craccivora) resistance in cowpea, and cyst nematode (Heterodera glycines) resistance in soybean have all been mapped and characterized. For some of these traits, resistance was found to be oligogenic, and DNA mapping uncovered multiple genes involved in the phenotype. (author)

  4. Metabolomic Functional Analysis of Bacterial Genomes: Final Report

    Energy Technology Data Exchange (ETDEWEB)

    Arp, Daniel J; Sayavedra-Soto, Luis A

    2008-01-01

    The availability of the complete DNA sequence of the bacterial genome of Nitrosomonas europaea offered the opportunity for unprecedented and detailed investigations of function. We studied the function of genes involved in carbohydrate and Fe metabolism. N. europaea has genes for the synthesis and degradation of glycogen and sucrose but cannot grow on substrates other than ammonia and CO2. Granules of glycogen were detected in whole cells by electron microscopy and quantified in cell-free extracts by enzymatic methods. The cellular glycogen and sucrose content varied depending on the composition of the growth medium and the cellular growth stage. N. europaea, which depends heavily on iron for metabolism of ammonia, is particularly interesting because it lacks genes for siderophore production and has genes with only low similarity to known iron reductases, yet grows relatively well in medium containing low Fe. By comparing the transcriptomes of cells grown in iron-replete medium versus iron-limited medium, 247 genes were identified as differentially expressed. Mutant strains deficient in genes for sucrose, glycogen and iron metabolism were created and are being used to further our understanding of ammonia-oxidizing bacteria.

  5. Investigating hookworm genomes by comparative analysis of two Ancylostoma species

    Directory of Open Access Journals (Sweden)

    Kapulkin Wadim

    2005-04-01

    Background: Hookworms, infecting over one billion people, are the most closely related major human parasites to the model nematode Caenorhabditis elegans. Applying genomics techniques to these species, we analyzed 3,840 and 3,149 genes from Ancylostoma caninum and A. ceylanicum. Results: Transcripts originated from libraries representing infective L3 larva, stimulated L3, arrested L3, and adults. Most genes are represented in single stages, including abundant transcripts like hsp-20 in infective L3 and vit-3 in adults. Over 80% of the genes have homologs in C. elegans, and nearly 30% of these have observable RNA interference phenotypes. Homologies were identified to nematode-specific and clade V specific gene families. To study the evolution of hookworm genes, 574 A. caninum / A. ceylanicum orthologs were identified, all of which were found to be under purifying selection with distribution ratios of nonsynonymous to synonymous amino acid substitutions similar to that reported for C. elegans / C. briggsae orthologs. The phylogenetic distance between A. caninum and A. ceylanicum is almost identical to that for C. elegans / C. briggsae. Conclusion: The genes discovered should substantially accelerate research toward better understanding of the parasites' basic biology as well as new therapies including vaccines and novel anthelmintics.

  6. Genomic analysis of gum disease and hypertrichosis in foxes.

    Science.gov (United States)

    Clark, J-A B J; Whalen, D; Marshall, H D

    2016-01-01

    Since the 1940s, a proliferative gingival disease called hereditary hyperplastic gingivitis (HHG) has been described in the farmed silver fox, Vulpes vulpes (Dyrendahl and Henricson 1960). HHG displays an autosomal recessive transmission and has a pleiotropic relationship with superior fur quality in terms of length and thickness of guard hairs. An analogous human disease, hereditary gingival fibromatosis (HGF), is characterized by a predominantly autosomal dominant transmission and a complex etiology, occurring either as an isolated condition or as a part of a syndrome. Similar to HHG, the symptom most commonly associated with syndromic HGF is hypertrichosis. Here we explore potential mechanisms involved in HHG by comparison to known genetic information about hypertrichosis co-occurring with HGF, using an Affymetrix canine genome microarray platform, quantitative PCR, and candidate gene sequencing. We conclude that the mitogen-activated protein kinase pathway is involved in HHG. However, despite involvement of the mitogen-activated protein kinase kinase 6 gene in congenital hypertrichosis with gingival fibromatosis in humans, this gene did not contain any fixed mutations in exons or exon-intron boundaries in HHG-affected foxes, suggesting that it is not causative of HHG in the farmed silver fox population. Differential up-regulation of the MAP2K6 gene in HHG-affected foxes does implicate this gene in the HHG phenotype. PMID:27323055

  7. Incorporation of Socio-Economic Features' Ranking in Multicriteria Analysis Based on Ecosystem Services for Marine Protected Area Planning.

    Directory of Open Access Journals (Sweden)

    Michelle E Portman

    Developed decades ago for spatial choice problems related to zoning in the urban planning field, multicriteria analysis (MCA) has more recently been applied to environmental conflicts and presented in several documented cases for the creation of protected area management plans. Its application is considered here for the development of zoning as part of a proposed marine protected area management plan. The case study incorporates spatially explicit conservation features while considering stakeholder preferences, expert opinion and characteristics of data quality. It involves the weighting of criteria using a modified analytical hierarchy process. Experts ranked physical attributes which include socio-economically valued physical features. The parameters used for the ranking of (physical) attributes important for socio-economic reasons are derived from the field of ecosystem services assessment. Inclusion of these feature values results in protection that emphasizes those areas closest to shore, most likely because of accessibility and familiarity parameters and because of data biases. Therefore, other spatial conservation prioritization methods should be considered to supplement the MCA and efforts should be made to improve data about ecosystem service values farther from shore. Otherwise, the MCA method allows incorporation of expert and stakeholder preferences and ecosystem services values while maintaining the advantages of simplicity and clarity.

  8. Incorporation of Socio-Economic Features' Ranking in Multicriteria Analysis Based on Ecosystem Services for Marine Protected Area Planning.

    Science.gov (United States)

    Portman, Michelle E; Shabtay-Yanai, Ateret; Zanzuri, Asaf

    2016-01-01

    Developed decades ago for spatial choice problems related to zoning in the urban planning field, multicriteria analysis (MCA) has more recently been applied to environmental conflicts and presented in several documented cases for the creation of protected area management plans. Its application is considered here for the development of zoning as part of a proposed marine protected area management plan. The case study incorporates spatially explicit conservation features while considering stakeholder preferences, expert opinion and characteristics of data quality. It involves the weighting of criteria using a modified analytical hierarchy process. Experts ranked physical attributes which include socio-economically valued physical features. The parameters used for the ranking of (physical) attributes important for socio-economic reasons are derived from the field of ecosystem services assessment. Inclusion of these feature values results in protection that emphasizes those areas closest to shore, most likely because of accessibility and familiarity parameters and because of data biases. Therefore, other spatial conservation prioritization methods should be considered to supplement the MCA and efforts should be made to improve data about ecosystem service values farther from shore. Otherwise, the MCA method allows incorporation of expert and stakeholder preferences and ecosystem services values while maintaining the advantages of simplicity and clarity. PMID:27183224
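
    The criteria-weighting step mentioned in the abstract above can be illustrated with a minimal sketch. It uses the standard row geometric-mean approximation of analytical hierarchy process (AHP) weights, not the modified procedure the authors applied, which the abstract does not specify. The function name `ahp_weights` and the pairwise comparison values are hypothetical and chosen only for illustration.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how strongly criterion i is preferred over
    criterion j; a reciprocal matrix has pairwise[j][i] == 1 / pairwise[i][j].
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    # Normalize so that the weights sum to 1.
    return [g / total for g in geo_means]

# Hypothetical comparisons among three illustrative criteria (not from the study).
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
print([round(w, 3) for w in weights])
```

    For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the principal-eigenvector weights; for inconsistent expert judgments the two methods can differ slightly.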

  9. Integrated analysis of gene expression and genomic aberration data in osteosarcoma (OS).

    Science.gov (United States)

    Xiong, Y; Wu, S; Du, Q; Wang, A; Wang, Z

    2015-11-01

    Cytogenetic analyses have revealed that complex karyotypes with numerous and highly variable genomic aberrations, including single-nucleotide polymorphisms (SNPs) and copy number variants (CNVs), are observed in most conventional osteosarcomas (OSs). Several genome-wide studies have reported that the dysregulated expression of many genes is correlated with genomic aberrations in OS. We first compared OS gene expression in Gene Expression Omnibus (GEO) data sets and genomic aberrations in the International Cancer Genome Consortium (ICGC) database to identify differentially expressed genes (DEGs) associated with SNPs or CNVs in OS. Then the functional annotation of SNP- or CNV-associated DEGs was performed in terms of gene ontology analysis, pathway analysis and protein-protein interactions (PPIs). Finally, the expression of genes correlated with both SNPs and CNVs was confirmed by quantitative reverse-transcription PCR. Eight publicly available GEO data sets were obtained, and a set of 979 DEGs was identified (472 upregulated and 507 downregulated DEGs). Moreover, we obtained 1039 SNPs mapped in 938 genes, and 583 CNV sites mapped in 2915 genes. Comparing genomic aberrations and DEGs, we found 41 SNP-associated DEGs and 124 CNV-associated DEGs, of which 7 DEGs were associated with both SNPs and CNVs, including WWP1, EXT1, LDHB, C8orf59, PLEKHA5, CCT3 and VWF. The functional annotation showed that ossification, bone development and skeletal system development were the significantly enriched biological process terms for DEGs. PPI network analysis showed that CCT3, COPS3 and WWP1 were the significant hub proteins. We conclude that these genes, including CCT3, COPS3 and WWP1, are candidate driver genes of importance in OS tumorigenesis. PMID:26427513

  10. Parametric study on single shot peening by dimensional analysis method incorporated with finite element method

    Institute of Scientific and Technical Information of China (English)

    Xian-Qian Wu; Xi Wang; Yan-Peng Wei; Hong-Wei Song; Chen-Guang Huang

    2012-01-01

    Shot peening is a widely used surface treatment method that generates compressive residual stress near the surface of metallic materials to increase fatigue life and resistance to corrosion fatigue, cracking, etc. Compressive residual stress and dent profile are important factors in evaluating the effectiveness of the shot peening process. In this paper, the influence of dimensionless parameters on the maximum compressive residual stress and the maximum depth of the dent was investigated. Firstly, dimensionless relations of the processing parameters that affect the maximum compressive residual stress and the maximum depth of the dent were deduced by the dimensional analysis method. Secondly, the influence of each dimensionless parameter on the dimensionless variables was investigated by the finite element method. Furthermore, related empirical formulas were given for each dimensionless parameter based on the simulation results. Finally, a comparison was made and good agreement was found between the simulation results and the empirical formulas, which shows that this paper provides a useful approach for analyzing the influence of each individual parameter.

  11. Wind characterization analysis incorporating genetic algorithm: A case study in Taiwan Strait

    International Nuclear Information System (INIS)

    In this paper, the genetic algorithm (GA) is applied for the first time to compute the Weibull parameters for wind characterization analysis, and an objective function, required by the GA to search for the optimal solution, is defined. The wind data analyzed were observed at a wind farm in the Taiwan Strait from 2006 to 2008. To accurately describe the wind speed distribution, three kinds of probability density functions are compared, i.e. the Weibull, logistic and lognormal functions. Statistical parameters including the max error in the Kolmogorov-Smirnov test, root mean square error, Chi-square error and relative error of wind power density are used as judgment criteria. The results show that the GA is a useful method, saving about 33% of the time compared with the conventional iteration method. The Weibull function best describes the wind distribution, regardless of time period. Accordingly, wind power density, availability factor and electrical energy output from an ideal turbine are assessed using the Weibull parameters; the utilization rate of wind energy for the currently used turbine is discussed. Furthermore, wind energy complements solar energy very well: when solar radiation is low in winter and spring, wind power is greater. Finally, energy ratios for each month are calculated. -- Highlights: → The genetic algorithm was applied for the first time to calculate the Weibull parameters for wind energy assessment. → Weibull probability function fits the observed wind speed distribution better than both logistic and lognormal functions. → Wind and solar energy potential in Taiwan show a great complementary relationship.
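
    The quantities the abstract above derives from the Weibull parameters can be sketched as follows. This is only an illustration of the two-parameter Weibull density and the standard closed-form mean wind power density, 0.5 * rho * c^3 * Gamma(1 + 3/k); it does not reproduce the paper's genetic-algorithm fitting step or its objective function. The shape and scale values and the air density of 1.225 kg/m^3 are assumptions, not values from the study.

```python
import math

def weibull_pdf(v, k, c):
    """Two-parameter Weibull density for wind speed v (shape k, scale c).

    Returns 0.0 for v <= 0, a simplification at the boundary.
    """
    if v <= 0:
        return 0.0
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))

def mean_power_density(k, c, rho=1.225):
    """Mean wind power density in W/m^2 for Weibull-distributed wind speed.

    Uses E[v^3] = c^3 * Gamma(1 + 3/k); rho is an assumed air density in kg/m^3.
    """
    return 0.5 * rho * c ** 3 * math.gamma(1.0 + 3.0 / k)

# Hypothetical parameters (not taken from the Taiwan Strait data).
k, c = 2.0, 8.0
print(round(mean_power_density(k, c), 1))
```

    A fitting procedure such as the GA described in the abstract would search over `k` and `c` to minimize an error measure between `weibull_pdf` and the observed wind speed histogram.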

  12. Analysis of Factors for Incorporating User Preferences in Air Traffic Management: A System Perspective

    Science.gov (United States)

    Sheth, Kapil S.; Gutierrez-Nolasco, Sebastian

    2010-01-01

    This paper presents an analysis of factors that impact user flight schedules during air traffic congestion. In pre-departure flight planning, users file one route per flight, which often leads to increased delays, inefficient airspace utilization, and exclusion of user flight preferences. In this paper, the idea of filing alternate routes and providing priorities on each of those routes is first introduced. Then, the impact of varying the planning interval and the system-imposed departure delay increment is discussed. The metrics of total delay and equity are used for analyzing the impact of these factors on increased traffic and on different users. The results are shown for four cases, with and without the optional routes and priority assignments. Results demonstrate that adding priorities to optional routes further improves system performance compared to filing one route per flight and using a first-come, first-served scheme. It was also observed that a two-hour planning interval with a five-minute system-imposed departure delay increment results in the highest delay reduction. The trend holds for a scenario with increased traffic.

  13. Genome analysis of rice-blast fungus Magnaporthe oryzae field isolates from southern India

    OpenAIRE

    Malali Gowda; Meghana D. Shirke; Mahesh, H. B.; Pinal Chandarana; Anantharamanan Rajamani; Chattoo, Bharat B

    2015-01-01

    The Indian subcontinent is the center of origin and diversity for rice (Oryza sativa L.). The O. sativa ssp. indica is a major food crop grown in India, which occupies the first and second position in area and production, respectively. Blast disease caused by Magnaporthe oryzae is a major constraint to rice production. Here, we report the analysis of genome architecture and sequence variation of two field isolates, B157 and MG01, of the blast fungus from southern India. The 40 Mb genome of B1...

  14. An overview of the Phalaenopsis orchid genome through BAC end sequence analysis

    Directory of Open Access Journals (Sweden)

    Hsiao Yu-Yun

    2011-01-01

    Background: Phalaenopsis orchids are popular floral crops, and development of new cultivars is economically important to floricultural industries worldwide. Analysis of orchid genes could facilitate orchid improvement. Bacterial artificial chromosome (BAC) end sequences (BESs) can provide the first glimpses into the sequence composition of a novel genome and can yield molecular markers for use in genetic mapping and breeding. Results: We used two BAC libraries (constructed using the BamHI and HindIII restriction enzymes) of Phalaenopsis equestris to generate pair-end sequences from 2,920 BAC clones (71.4% and 28.6% from the BamHI and HindIII libraries, respectively), at a success rate of 95.7%. A total of 5,535 BESs were generated, representing 4.5 Mb, or about 0.3% of the Phalaenopsis genome. The trimmed sequences ranged from 123 to 1,397 base pairs (bp) in size, with an average edited read length of 821 bp. When these BESs were subjected to sequence homology searches, it was found that 641 (11.6%) were predicted to represent protein-encoding regions, whereas 1,272 (23.0%) contained repetitive DNA. Most of the repetitive DNA sequences were gypsy- and copia-like retrotransposons (41.9% and 12.8%, respectively), whereas only 10.8% were DNA transposons. Further, 950 potential simple sequence repeats (SSRs) were discovered. Dinucleotides were the most abundant repeat motifs; AT/TA dimer repeats were the most frequent SSRs, representing 253 (26.6%) of all identified SSRs. Microsynteny analysis revealed that more BESs mapped to the whole-genome sequences of poplar than to those of grape or Arabidopsis, and even fewer mapped to the rice genome. This work will facilitate analysis of the Phalaenopsis genome, and will help clarify similarities and differences in genome composition between orchids and other plant species. Conclusion: Using BES analysis, we obtained an overview of the Phalaenopsis genome in terms of gene abundance, the presence of repetitive

  15. Genome-wide association analysis identifies 13 new risk loci for schizophrenia

    OpenAIRE

    Ripke, Stephan; O'Dushlaine, Colm; Chambert, Kimberly; Moran, Jennifer L; Kähler, Anna K.; Akterin, Susanne; Bergen, Sarah E.; Collins, Ann L.; Crowley, James J.; Fromer, Menachem; Kim, Yunjung; Lee, Sang Hong; Magnusson, Patrik K E; Sanchez, Nick; Stahl, Eli A

    2013-01-01

    Schizophrenia is an idiopathic mental disorder with a heritable component and a substantial public health impact. We conducted a multi-stage genome-wide association study (GWAS) for schizophrenia beginning with a Swedish national sample (5,001 cases and 6,243 controls) followed by meta-analysis with previous schizophrenia GWAS (8,832 cases and 12,067 controls) and finally by replication of SNPs in 168 genomic regions in independent samples (7,413 cases, 19,762 controls and 581 parent-offsprin...

  16. Genome-wide association analysis identifies 13 new risk loci for schizophrenia.

    OpenAIRE

    Ripke, Stephan; O'Dushlaine, Colm; Chambert, Kimberly; Moran, Jennifer L; Kähler, Anna K.; Akterin, Susanne; Bergen, Sarah E.; Collins, Ann L.; Crowley, James J.; Fromer, Menachem; Kim, Yunjung; Bender, Stephan; Collier, David; Crespo-Facorro, Benedicto; Hall, Jeremy

    2013-01-01

    Schizophrenia is an idiopathic mental disorder with a heritable component and a substantial public health impact. We conducted a multi-stage genome-wide association study (GWAS) for schizophrenia beginning with a Swedish national sample (5,001 cases and 6,243 controls) followed by meta-analysis with previous schizophrenia GWAS (8,832 cases and 12,067 controls) and finally by replication of SNPs in 168 genomic regions in independent samples (7,413 cases, 19,762 controls and 581 parent-offsprin...

  17. Genome Scan Meta-Analysis of Schizophrenia and Bipolar Disorder, Part II: Schizophrenia

    OpenAIRE

    Lewis, Cathryn M.; Levinson, Douglas F.; Wise, Lesley H.; Delisi, Lynn E.; Straub, Richard E.; Hovatta, Iiris; Williams, Nigel M.; Schwab, Sibylle G.; Pulver, Ann E; Faraone, Stephen V.; Brzustowicz, Linda M.; Kaufmann, Charles A.; Garver, David L.; Gurling, Hugh M.D.; Lindholm, Eva

    2003-01-01

    Schizophrenia is a common disorder with high heritability and a 10-fold increase in risk to siblings of probands. Replication has been inconsistent for reports of significant genetic linkage. To assess evidence for linkage across studies, rank-based genome scan meta-analysis (GSMA) was applied to data from 20 schizophrenia genome scans. Each marker for each scan was assigned to 1 of 120 30-cM bins, with the bins ranked by linkage scores (1 = most significant) and the ranks averaged across stu...
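
    The ranking step described in the abstract above (bins ranked within each scan, with 1 as the most significant, then ranks averaged across scans) can be sketched as follows. This is a simplified, unweighted illustration: the toy linkage scores, the function names, and the tie-breaking rule are assumptions, and the significance assessment used in the published analysis is not reproduced.

```python
def rank_bins(scores):
    """Rank the bins of one genome scan; rank 1 goes to the highest linkage score.

    Ties are broken by bin order, which is a simplification.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, bin_index in enumerate(order, start=1):
        ranks[bin_index] = rank
    return ranks

def average_ranks(scores_by_scan):
    """Average each bin's within-scan rank across all scans."""
    per_scan = [rank_bins(scores) for scores in scores_by_scan]
    n_scans = len(per_scan)
    return [sum(bin_ranks) / n_scans for bin_ranks in zip(*per_scan)]

# Hypothetical linkage scores for 4 bins in 3 scans (the study used 120 bins and 20 scans).
scores_by_scan = [
    [3.1, 0.4, 1.8, 2.2],
    [2.0, 0.9, 2.7, 0.1],
    [1.5, 0.2, 3.3, 0.8],
]
avg = average_ranks(scores_by_scan)
print([round(a, 3) for a in avg])
```

    Bins with the lowest average rank are the ones showing the most consistent evidence for linkage across scans.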

  18. Evaluation of Uncertainty in Runoff Analysis Incorporating Theory of Stochastic Process

    Science.gov (United States)

    Yoshimi, Kazuhiro; Wang, Chao-Wen; Yamada, Tadashi

    2015-04-01

    The aim of this paper is to provide a theoretical framework for uncertainty estimation in rainfall-runoff analysis based on the theory of stochastic processes. Stochastic differential equations (SDEs) based on this theory have been widely used in the field of mathematical finance to predict stock price movements. Meanwhile, some researchers in the field of civil engineering have applied SDEs in their investigations (e.g. Kurino et al., 1999; Higashino and Kanda, 2001). However, th