WorldWideScience

Sample records for genome analysis progress

  1. Multi-platform genome-wide analysis of melanoma progression to brain metastasis

    Directory of Open Access Journals (Sweden)

    Diego M. Marzese

    2014-12-01

    Full text available. Melanoma has a high tendency to metastasize to brain tissue. Understanding of the molecular alterations in the progression of early-stage melanoma to brain metastasis (MBM) is very limited. Identifying MBM-specific genomic and epigenomic alterations is a key initial step in understanding its aggressive nature and identifying specific novel druggable targets. Here, we describe a multi-platform dataset generated with different stages of melanoma progression to MBM. These data include genome-wide DNA methylation (Illumina HM450K BeadChip), gene expression (Affymetrix HuEx 1.0 ST array), single nucleotide polymorphisms (SNPs) and copy number variation (CNV; Affymetrix SNP 6.0 array) analyses of melanocyte cells (MNCs), primary melanoma tumors (PRMs), lymph node metastases (LNMs) and MBMs. The analysis of these data has been reported in our recently published study (Marzese et al., 2014).

  2. Genome sequencing and analysis conferences. Progress report, August 15, 1993--August 15, 1994

    Energy Technology Data Exchange (ETDEWEB)

    Venter, J.C.

    1995-10-01

    The 14 plenary session presentations focused on nematode; yeast; fruit fly; plants; mycobacteria; and man. In addition there were presentations on a variety of technical innovations including database developments and refinements, bioelectronic genesensors, computer-assisted multiplex techniques, and hybridization analysis with DNA chip technology. This document includes only the session schedule.

  3. Co-expression module analysis reveals biological processes, genomic gain, and regulatory mechanisms associated with breast cancer progression

    Directory of Open Access Journals (Sweden)

    Derow Catherine K

    2010-05-01

    Full text available. Background: Gene expression signatures are typically identified by correlating gene expression patterns to a disease phenotype of interest. However, individual gene-based signatures usually suffer from low reproducibility and interpretability. Results: We have developed a novel algorithm, Iterative Clique Enumeration (ICE), for identifying relatively independent maximal cliques as co-expression modules, and a module-based approach to the analysis of gene expression data. Applying this approach to a public breast cancer dataset identified 19 modules whose expression levels were significantly correlated with tumor grade. The correlations were reproducible for 17 modules in an independent breast cancer dataset, and the reproducibility was considerably higher than that based on individual genes or modules identified by other algorithms. Sixteen out of the 17 modules showed significant enrichment in certain Gene Ontology (GO) categories. Specifically, modules related to cell proliferation and immune response were up-regulated in high-grade tumors while those related to cell adhesion were down-regulated. Further analyses showed that transcription factors NFYB, E2F1/E2F3, NRF1, and ELK1 were responsible for the up-regulation of the cell proliferation modules. IRF family and ETS family proteins were responsible for the up-regulation of the immune response modules. Moreover, inhibition of the PPARA signaling pathway may also play an important role in tumor progression. The module without GO enrichment was found to be associated with a potential genomic gain in 8q21-23 in high-grade tumors. The 17-module signature of breast tumor progression clustered patients into subgroups with significantly different relapse-free survival times. Namely, patients with lower cell proliferation and higher cell adhesion levels had significantly lower risk of recurrence, both for all patients (p = 0.004) and for those with grade 2 tumors (p = 0.017). Conclusions: The ICE

  4. Inferring tumor progression from genomic heterogeneity.

    Science.gov (United States)

    Navin, Nicholas; Krasnitz, Alexander; Rodgers, Linda; Cook, Kerry; Meth, Jennifer; Kendall, Jude; Riggs, Michael; Eberling, Yvonne; Troge, Jennifer; Grubor, Vladimir; Levy, Dan; Lundin, Pär; Månér, Susanne; Zetterberg, Anders; Hicks, James; Wigler, Michael

    2010-01-01

    Cancer progression in humans is difficult to infer because we do not routinely sample patients at multiple stages of their disease. However, heterogeneous breast tumors provide a unique opportunity to study human tumor progression because they still contain evidence of early and intermediate subpopulations in the form of the phylogenetic relationships. We have developed a method we call Sector-Ploidy-Profiling (SPP) to study the clonal composition of breast tumors. SPP involves macro-dissecting tumors, flow-sorting genomic subpopulations by DNA content, and profiling genomes using comparative genomic hybridization (CGH). Breast carcinomas display two classes of genomic structural variation: (1) monogenomic and (2) polygenomic. Monogenomic tumors appear to contain a single major clonal subpopulation with a highly stable chromosome structure. Polygenomic tumors contain multiple clonal tumor subpopulations, which may occupy the same sectors, or separate anatomic locations. In polygenomic tumors, we show that heterogeneity can be ascribed to a few clonal subpopulations, rather than a series of gradual intermediates. By comparing multiple subpopulations from different anatomic locations, we have inferred pathways of cancer progression and the organization of tumor growth.

  5. DOE Joint Genome Institute 2008 Progress Report

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, David

    2009-03-12

    -based sequencing process that dominated how sequencing was done in the last decade is being replaced by a variety of new processes and sequencing instruments. The JGI, with an increasing number of next-generation sequencers, whose throughput is 100- to 1,000-fold greater than the Sanger capillary-based sequencers, is increasingly focused in new directions on projects of scale and complexity not previously attempted. These new directions for the JGI come, in part, from the 2008 National Research Council report on the goals of the National Plant Genome Initiative as well as the 2007 National Research Council report on the New Science of Metagenomics. Both reports outline a crucial need for systematic large-scale surveys of the plant and microbial components of the biosphere as well as an increasing need for large-scale analysis capabilities to meet the challenge of converting sequence data into knowledge. The JGI is extensively discussed in both reports as vital to progress in these fields of major national interest. JGI's future plan for plants and microbes includes a systematic approach for investigation of these organisms at a scale requiring the special capabilities of the JGI to generate, manage, and analyze the datasets. JGI will generate and provide not only community access to these plant and microbial datasets, but also the tools for analyzing them. These activities will produce essential knowledge that will be needed if we are to be able to respond to the world's energy and environmental challenges. As the JGI Plant and Microbial programs advance, the JGI as a user facility is also evolving. The Institute has been highly successful in bending its technical and analytical skills to help users solve large complex problems of major importance, and that effort will continue unabated. The JGI will increasingly move from a central focus on 'one-off' user projects coming from small user communities to much larger scale projects driven by systematic and problem

  6. Chromosome region-specific libraries for human genome analysis. Final progress report, 1 March 1991--28 February 1994

    Energy Technology Data Exchange (ETDEWEB)

    Kao, F.T.

    1994-04-01

    The objectives of this grant proposal include (1) development of a chromosome microdissection and PCR-mediated microcloning technology, and (2) application of this microtechnology to the construction of region-specific libraries for human genome analysis. During this grant period, the authors have successfully developed this microtechnology and have applied it to the construction of microdissection libraries for the following chromosome regions: a whole chromosome 21 (21E), 2 region-specific libraries for the long arm of chromosome 2, 2q35-q37 (2Q1) and 2q33-q35 (2Q2), and 4 region-specific libraries for the entire short arm of chromosome 2, 2p23-p25 (2P1), 2p21-p23 (2P2), 2p14-p16 (2P3) and 2p11-p13 (2P4). In addition, 20--40 unique sequence microclones have been isolated and characterized for genomic studies. These region-specific libraries and the single-copy microclones from the libraries have been used as valuable resources for (1) isolating microsatellite probes in linkage analysis to further refine the disease locus; (2) isolating corresponding clones with large inserts, e.g. YAC, BAC, P1, cosmid and phage, to facilitate construction of contigs for high resolution physical mapping; and (3) isolating region-specific cDNA clones for use as candidate genes. These libraries are being deposited in the American Type Culture Collection (ATCC) for general distribution.

  8. In situ quantification of genomic instability in breast cancer progression

    Energy Technology Data Exchange (ETDEWEB)

    Ortiz de Solorzano, Carlos; Chin, Koei; Gray, Joe W.; Lockett, Stephen J.

    2003-05-15

    Genomic instability (GI) is a hallmark of breast and other solid cancers. Presumably caused by critical telomere reduction, GI is responsible for providing the genetic diversity required in the multi-step progression of the disease. We have used multicolor fluorescence in situ hybridization and 3D image analysis to quantify genomic instability cell-by-cell in thick, intact tissue sections of normal breast epithelium, preneoplastic lesions (usual ductal hyperplasia), ductal carcinoma in situ or invasive carcinoma of the breast. Our in situ, cell-by-cell analysis of genomic instability shows an important increase of genomic instability in the transition from hyperplasia to in situ carcinoma, followed by a reduction of instability in invasive carcinoma. This pattern suggests that the transition from hyperplasia to in situ carcinoma corresponds to telomere crisis and that invasive carcinoma is a consequence of telomerase reactivation after telomere crisis.

  9. Genomic-Glycosylation Aberrations in Tumor Initiation, Progression and Management

    Directory of Open Access Journals (Sweden)

    Carman K.M. Ip

    2016-12-01

    Full text available. Post-translational modifications of proteins alter their functional activity and thus are key contributors to tumor initiation and progression. Glycosylation, one of the most common post-translational modifications of proteins, has been associated with tumorigenesis for decades. However, due to complexity in analysis of the functional effects of glycosylation, definitive information on the role of altered glycosylation in cancer is lacking. Importantly, imputing changes in glycosylation in proteins from analysis of DNA mutations has not been attempted globally. It is thus critical to elucidate the role of glycosylation in tumor pathophysiology as well as potential roles of altered glycosylation as cancer biomarkers and therapeutic targets. In this review, we summarize the evidence that glycosylation regulates functions of a set of frequently mutated oncogenes and tumor suppressors. Moreover, we explore the potential that protein sequence changes engendered by genomic mutations broadly alter glycosylation and thus promote tumor initiation and progression.

  10. Bio-informatics Research Progress in the Post-genome Era Based on the Quantitative Analysis of SCIE

    Institute of Scientific and Technical Information of China (English)

    Yongqin ZHAN; Min YU

    2013-01-01

    SCIE paper output can reflect the status quo and trends of research in a discipline. A total of 7,038 scientific articles concerning bioinformatics were retrieved from the SCIE database for the years 2008 to 2012. Quantitative analyses of paper output and citation frequency were conducted by nation, institution, publication, research direction and hot articles. The results can help bioinformatics researchers understand the present situation of the subject, carry out cooperative studies and present their research achievements.

  11. Human Genome Program Report. Part 1, Overview and Progress

    Science.gov (United States)

    1997-11-01

    This report contains Part 1 of a two-part report to reflect research and progress in the U.S. Department of Energy Human Genome Program from 1994 through 1996, with specified updates made just before publication. Part 1 consists of the program overview and report on progress.

  12. Human genome program report. Part 1, overview and progress

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-11-01

    This report contains Part 1 of a two-part report to reflect research and progress in the U.S. Department of Energy Human Genome Program from 1994 through 1996, with specified updates made just before publication. Part 1 consists of the program overview and report on progress.

  13. Genome evolution during progression to breast cancer

    KAUST Repository

    Newburger, D. E.

    2013-04-08

    Cancer evolution involves cycles of genomic damage, epigenetic deregulation, and increased cellular proliferation that eventually culminate in the carcinoma phenotype. Early neoplasias, which are often found concurrently with carcinomas and are histologically distinguishable from normal breast tissue, are less advanced in phenotype than carcinomas and are thought to represent precursor stages. To elucidate their role in cancer evolution we performed comparative whole-genome sequencing of early neoplasias, matched normal tissue, and carcinomas from six patients, for a total of 31 samples. By using somatic mutations as lineage markers we built trees that relate the tissue samples within each patient. On the basis of these lineage trees we inferred the order, timing, and rates of genomic events. In four out of six cases, an early neoplasia and the carcinoma share a mutated common ancestor with recurring aneuploidies, and in all six cases evolution accelerated in the carcinoma lineage. Transition spectra of somatic mutations are stable and consistent across cases, suggesting that accumulation of somatic mutations is a result of increased ancestral cell division rather than specific mutational mechanisms. In contrast to highly advanced tumors that are the focus of much of the current cancer genome sequencing, neither the early neoplasia genomes nor the carcinomas are enriched with potentially functional somatic point mutations. Aneuploidies that occur in common ancestors of neoplastic and tumor cells are the earliest events that affect a large number of genes and may predispose breast tissue to eventual development of invasive carcinoma.

  14. The Cassava Genome: Current Progress, Future Directions.

    Science.gov (United States)

    Prochnik, Simon; Marri, Pradeep Reddy; Desany, Brian; Rabinowicz, Pablo D; Kodira, Chinnappa; Mohiuddin, Mohammed; Rodriguez, Fausto; Fauquet, Claude; Tohme, Joseph; Harkins, Timothy; Rokhsar, Daniel S; Rounsley, Steve

    2012-03-01

    The starchy swollen roots of cassava provide an essential food source for nearly a billion people, as well as possibilities for bioenergy, yet improvements to nutritional content and resistance to threatening diseases are currently impeded. A 454-based whole genome shotgun sequence has been assembled, which covers 69% of the predicted genome size and 96% of protein-coding gene space, with genome finishing underway. The predicted 30,666 genes and 3,485 alternate splice forms are supported by 1.4 M expressed sequence tags (ESTs). Maps based on simple sequence repeat (SSR)-, and EST-derived single nucleotide polymorphisms (SNPs) already exist. Thanks to the genome sequence, a high-density linkage map is currently being developed from a cross between two diverse cassava cultivars: one susceptible to cassava brown streak disease; the other resistant. An efficient genotyping-by-sequencing (GBS) approach is being developed to catalog SNPs both within the mapping population and among diverse African farmer-preferred varieties of cassava. These resources will accelerate marker-assisted breeding programs, allowing improvements in disease-resistance and nutrition, and will help us understand the genetic basis for disease resistance.

  15. The genomic dynamics during progression of lung adenocarcinomas.

    Science.gov (United States)

    Yang, Bin; Luo, Longhai; Luo, Wen; Zhou, Yong; Yang, Chao; Xiong, Teng; Li, Xiangchun; Meng, Xuan; Li, Lin; Zhang, Xiaopin; Wang, Zhe; Wang, Zhixin

    2017-08-01

    Intra-tumor heterogeneity is a major barrier to precision medicine. To explore the underlying clonal diversity in lung adenocarcinomas, we selected nine individuals with whole-genome sequencing data from primary and matched metastatic tumors as a cohort for study. A similar global pattern of arm-level copy number changes and large variations in somatic single-nucleotide variants between the primary tumor and the metastasis were observed in the majority of cases. Importantly, we found that breakage-fusion-bridge (BFB) cycles act as an important mechanism underlying cancer gene amplification, such as amplification of CDK4, CDKN3 and FGFR1 at an early stage. We also identified recurrent focal amplification of the gene CCNY derived from BFB in two metastatic tumors, but not in the primary tumors. Clonal analysis of case 236T demonstrated that mutational processes vary with tumor progression. Collectively, our data provide new insights into genetic diversity and potential therapeutic targets in lung adenocarcinoma.

  16. Progress in Genome Editing Technology and Its Application in Plants

    OpenAIRE

    Zhang, Kai; Raboanatahiry, Nadia; Zhu, Bin; Li, Maoteng

    2017-01-01

    Genome editing technology (GET) is a versatile approach that has progressed rapidly as a mechanism to alter the genotype and phenotype of organisms. However, conventional genome modification using GET cannot satisfy current demand for high-efficiency and site-directed mutagenesis, and retrofitting of artificial nucleases has developed into a new avenue within this field. Based on mechanisms to recognize target genes, newly-developed GETs can generally be subdivided into three cleavage systems, pr...

  17. 2013 Progress Report -- DOE Joint Genome Institute

    Energy Technology Data Exchange (ETDEWEB)

    None

    2013-11-01

    In October 2012, we introduced a 10-Year Strategic Vision [http://bit.ly/JGI-Vision] for the Institute. A central focus of this Strategic Vision is to bridge the gap between sequenced genomes and an understanding of biological functions at the organism and ecosystem level. This involves the continued massive-scale generation of sequence data, complemented by orthogonal new capabilities to functionally annotate these large sequence data sets. Our Strategic Vision lays out a path to guide our decisions and ensure that the evolving set of experimental and computational capabilities available to DOE JGI users will continue to enable groundbreaking science.

  18. Spectrogram Analysis of Genomes

    Directory of Open Access Journals (Sweden)

    David Sussillo

    2004-01-01

    Full text available. We performed frequency-domain analysis of the genomes of various organisms using tricolor spectrograms, identifying several types of distinct visual patterns characterizing specific DNA regions. We relate patterns and their frequency characteristics to the sequence characteristics of the DNA. At times, the spectrogram patterns could be related to the structure of the corresponding protein region by using various public databases such as GenBank. Some patterns are explained by the biological nature of the corresponding regions, which relate to chromosome structure and protein coding, and some patterns have as yet unknown biological significance. We found biologically meaningful patterns on scales ranging from millions of base pairs down to a few hundred base pairs. Chromosome-wide patterns include periodicities ranging from 2 to 300. The color of the spectrogram depends on the nucleotide content at specific frequencies, and therefore can be used as a local indicator of CG content and other measures of relative base content. Several smaller-scale patterns were found to represent different types of domains made up of various tandem repeats.

  19. From genome-wide arrays to tailor-made biomarker readout - Progress towards routine analysis of skin sensitizing chemicals with GARD.

    Science.gov (United States)

    Forreryd, Andy; Zeller, Kathrin S; Lindberg, Tim; Johansson, Henrik; Lindstedt, Malin

    2016-12-01

    Allergic contact dermatitis (ACD) initiated by chemical sensitizers is an important public health concern. To prevent ACD, it is important to identify chemical allergens to limit the use of such compounds in various products. EU legislation, as well as increased mechanistic knowledge of skin sensitization, has promoted development of non-animal based approaches for hazard classification of chemicals. GARD is an in vitro testing strategy based on measurements of a genomic biomarker signature. However, current GARD protocols are optimized for identification of predictive biomarker signatures, and are not suitable for standardized screening. This study describes improvements to GARD to progress from biomarker discovery to a reliable and cost-effective assay for routine testing. Gene expression measurements were transferred to the NanoString nCounter platform, the normalization strategy was adjusted to fit serial arrival of testing substances, and a novel strategy to correct batch variations was presented. When challenging GARD with 29 compounds, sensitivity, specificity and accuracy were estimated at 94%, 83% and 90%, respectively. In conclusion, we present a GARD workflow with improved sample capacity, retained predictive performance, and a format adapted to standardized screening. We propose that GARD is ready to be considered as part of an integrated testing strategy for skin sensitization.
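
    The three performance figures quoted above follow from a 2x2 confusion matrix. The sketch below shows the arithmetic only. The abstract does not report the underlying counts, so the split used here (17 sensitizers, 12 non-sensitizers, 1 false negative, 2 false positives) is an assumption chosen because it reproduces the rounded percentages for 29 compounds; the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary classification of test compounds."""
    tp: int  # sensitizers correctly called sensitizers
    fn: int  # sensitizers missed
    tn: int  # non-sensitizers correctly called non-sensitizers
    fp: int  # non-sensitizers wrongly called sensitizers

    @property
    def total(self) -> int:
        return self.tp + self.fn + self.tn + self.fp


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true sensitizers that were detected."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of true non-sensitizers that were correctly rejected."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all compounds classified correctly."""
    return (c.tp + c.tn) / c.total


# ASSUMPTION: hypothetical counts, not taken from the study.
counts = ConfusionCounts(tp=16, fn=1, tn=10, fp=2)

if __name__ == "__main__":
    print(f"n = {counts.total}")
    print(f"sensitivity = {sensitivity(counts):.1%}")
    print(f"specificity = {specificity(counts):.1%}")
    print(f"accuracy    = {accuracy(counts):.1%}")
```

    With a panel this small, one additional misclassified compound moves each figure by several percentage points, which is worth keeping in mind when reading the rounded values.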

  20. Integrative bayesian network analysis of genomic data.

    Science.gov (United States)

    Ni, Yang; Stingo, Francesco C; Baladandayuthapani, Veerabhadran

    2014-01-01

    Rapid development of genome-wide profiling technologies has made it possible to conduct integrative analysis on genomic data from multiple platforms. In this study, we develop a novel integrative Bayesian network approach to investigate the relationships between genetic and epigenetic alterations as well as how these mutations affect a patient's clinical outcome. We take a Bayesian network approach that admits a convenient decomposition of the joint distribution into local distributions. Exploiting the prior biological knowledge about regulatory mechanisms, we model each local distribution as linear regressions. This allows us to analyze multi-platform genome-wide data in a computationally efficient manner. We illustrate the performance of our approach through simulation studies. Our methods are motivated by and applied to a multi-platform glioblastoma dataset, from which we reveal several biologically relevant relationships that have been validated in the literature as well as new genes that could potentially be novel biomarkers for cancer progression.
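
    The decomposition mentioned above can be written out as follows. This is the generic Bayesian network factorization with linear-Gaussian local models, given as a sketch; the symbols (x_j for the j-th variable, pa(j) for its parent set in the graph, beta_{jk} and sigma_j^2 for regression coefficients and noise variance) are notation assumed here, not taken from the paper.

```latex
% Joint distribution factorizes over the directed acyclic graph:
p(x_1, \dots, x_p) \;=\; \prod_{j=1}^{p} p\bigl(x_j \mid x_{\mathrm{pa}(j)}\bigr)

% Each local distribution modeled as a linear regression on the parents:
x_j \;=\; \sum_{k \in \mathrm{pa}(j)} \beta_{jk}\, x_k \;+\; \varepsilon_j,
\qquad \varepsilon_j \sim \mathcal{N}\bigl(0, \sigma_j^{2}\bigr)
```

    Because each factor involves only one variable and its parents, the regressions can be fitted separately, which is one way to read the abstract's claim of computational efficiency.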

  1. Progress in the detection of human genome structural variations

    Institute of Scientific and Technical Information of China (English)

    WU XueMei; XIAO HuaSheng

    2009-01-01

    The emergence of high-throughput and high-resolution genomic technologies has led to the detection of submicroscopic variants ranging from 1 kb to 3 Mb in the human genome. These variants include copy number variations (CNVs), inversions, insertions, deletions and other complex rearrangements of DNA sequences. This paper briefly reviews the commonly used technologies to discover both genomic structural variants and their potential influences. In particular, we highlight the array-based, PCR-based and sequencing-based assays, including array-based comparative genomic hybridization (aCGH), representational oligonucleotide microarray analysis (ROMA), multiplex amplifiable probe hybridization (MAPH), multiplex ligation-dependent probe amplification (MLPA), paired-end mapping (PEM), and next-generation DNA sequencing technologies. Furthermore, we discuss the limitations and challenges of current assays and give advice on how to make the database of genomic variations more reliable.

  2. Progress in the detection of human genome structural variations

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    The emergence of high-throughput and high-resolution genomic technologies has led to the detection of submicroscopic variants ranging from 1 kb to 3 Mb in the human genome. These variants include copy number variations (CNVs), inversions, insertions, deletions and other complex rearrangements of DNA sequences. This paper briefly reviews the commonly used technologies to discover both genomic structural variants and their potential influences. In particular, we highlight the array-based, PCR-based and sequencing-based assays, including array-based comparative genomic hybridization (aCGH), representational oligonucleotide microarray analysis (ROMA), multiplex amplifiable probe hybridization (MAPH), multiplex ligation-dependent probe amplification (MLPA), paired-end mapping (PEM), and next-generation DNA sequencing technologies. Furthermore, we discuss the limitations and challenges of current assays and give advice on how to make the database of genomic variations more reliable.

  3. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2002-01-01

    This thesis describes a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene transfer, even to the extent that we can use the gene content to make genome phylogenies. Usi

  4. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2003-01-01

    This thesis describes a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene transfer, even to the extent that we can use the gene content to make genome phylogenies. Usi

  5. Genomic aberrations of myeloproliferative and myelodysplastic/myeloproliferative neoplasms in chronic phase and during disease progression.

    Science.gov (United States)

    Hahm, C; Huh, H J; Mun, Y C; Seong, C M; Chung, W S; Huh, J

    2015-04-01

    Myeloproliferative neoplasms (MPN) and myelodysplastic/myeloproliferative neoplasms (MDS/MPN) may transform into secondary myelofibrosis (MF) or evolve into acute myeloid leukemia (AML). The genetic mechanisms underlying disease progression in MPN and MDS/MPN patients remain unclear. The purpose of this study was to investigate sequential genomic aberrations identified by single nucleotide polymorphism array (SNP-A)-based karyotyping that can detect cryptic aberrations or copy neutral loss of heterozygosity (CN-LOH) in the chronic phase and during disease progression of MPN and MDS/MPN patients. The study group included 13 MPN and four MDS/MPN patients (seven polycythemia vera (PV); four essential thrombocythemia (ET); two MPN-unclassifiable (MPN-U); one chronic myelomonocytic leukemia (CMML); one atypical chronic myeloid leukemia, BCR-ABL1 negative (aCML); and two MDS/MPN-unclassifiable (MDS/MPN-U)). Among them, five patients (two PV, two MPN-U, and one MDS/MPN-U) progressed to MF and three patients (one CMML, one aCML, and one MDS/MPN-U) transformed to AML. The median follow-up period was 70 months (range, 7-152). Whole-genome SNP-A (SNP 6.0; Affymetrix, Santa Clara, CA, USA)-based karyotyping and JAK2 mutation analysis were performed according to the manufacturer's instructions. SNP-A showed 19 kinds of genomic aberrations, including seven gains, eight deletions, and four CN-LOH. CN-LOH of 9p involving JAK2 was the most common aberration, followed by 5q deletion and 9p gain. The incidence of genomic changes identified by SNP was not different in patients with disease progression (75%), compared with those without disease progression (56%) (P = 0.4). However, when excluding 9p CN-LOH, the incidence of genomic changes was significantly higher in patients with disease progression than in patients without disease progression (63% and 0%, respectively, P = 0.01). Among eight patients with disease progression, two patients (two MPN-U) showed abnormal SNP-A results

  6. The integrated microbial genome resource of analysis.

    Science.gov (United States)

    Checcucci, Alice; Mengoni, Alessio

    2015-01-01

    Integrated Microbial Genomes and Metagenomes (IMG) is a biocomputational system that provides information and support for annotation and comparative analysis of microbial genomes and metagenomes. IMG has been developed by the US Department of Energy (DOE) Joint Genome Institute (JGI). The IMG platform contains both draft and complete genomes sequenced by the Joint Genome Institute, as well as other publicly available genomes. Genomes of strains belonging to the Archaea, Bacteria, and Eukarya domains are present, as well as those of viruses and plasmids. Here, we describe some essential features of the IMG system and a case study of pangenome analysis.

  7. Genomic analysis of hyperthermophilic archaea; Chokonetsusei kosaikin no genomu kaiseki

    Energy Technology Data Exchange (ETDEWEB)

    Kato, C. [Japan Marine Science and Technology Center, Kanagawa (Japan)]

    1997-05-20

    Whole genome sequences of five strains of microorganisms have been reported to date, and many genome analysis projects are in progress around the world. Among archaea (archaebacteria), the genome analysis of Methanococcus jannaschii has been completed and the sequencing data have been made public. While 134 regulatory genes were identified in Synechocystis sp. PCC 6803 (eubacteria, 3.6 Mb genome size), only 7 regulatory genes were identified in M. jannaschii (1.7 Mb). The difference in genome size is believed to correspond to the quantity of environmental stresses. In Japan, the genome analysis project on a new hyperthermophilic archaeon, Pyrococcus horikoshii, is in progress. P. horikoshii was isolated from a deep-sea hydrothermal vent. It shows barophilic growth at a maximum temperature of 103 °C under a pressure of 30 MPa. Thus, the genome analysis of barophilic hyperthermophilic archaea is expected to contribute to the understanding of the origin of life and evolution. 19 refs., 4 figs., 1 tab.

  8. Genome-wide association studies in asthma: progress and pitfalls

    Directory of Open Access Journals (Sweden)

    March ME

    2015-01-01

    Full Text Available Michael E March,1 Patrick MA Sleiman,1,2 Hakon Hakonarson1,2 1Center for Applied Genomics, Children's Hospital of Philadelphia Research Institute, 2Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA Abstract: Genetic studies of asthma have revealed that there is considerable heritability to the phenotype. An extensive history of candidate-gene studies has identified a long list of genes associated with immune function that are potentially involved in asthma pathogenesis. However, many of the results of candidate-gene studies have failed to be replicated, leaving in question the true impact of the implicated biological pathways on asthma. With the advent of genome-wide association studies, geneticists are able to examine the association of hundreds of thousands of genetic markers with a phenotype, allowing the hypothesis-free identification of variants associated with disease. Many such studies examining asthma or related phenotypes have been published, and several themes have begun to emerge regarding the biological pathways underpinning asthma. The results of many genome-wide association studies have currently not been replicated, and the large sample sizes required for this experimental strategy invoke difficulties with sample stratification and phenotypic heterogeneity. Recently, large collaborative groups of researchers have formed consortia focused on asthma, with the goals of sharing material and data and standardizing diagnosis and experimental methods. Additionally, research has begun to focus on genetic variants that affect the response to asthma medications and on the biology that generates the heterogeneity in the asthma phenotype. As this work progresses, it will move asthma patients closer to more specific, personalized medicine. Keywords: asthma, genetics, GWAS, pharmacogenetics, biomarkers

  9. Progress in Genome Editing Technology and Its Application in Plants

    Science.gov (United States)

    Zhang, Kai; Raboanatahiry, Nadia; Zhu, Bin; Li, Maoteng

    2017-01-01

    Genome editing technology (GET) is a versatile approach that has progressed rapidly as a mechanism to alter the genotype and phenotype of organisms. However, conventional genome modification using GET cannot satisfy current demand for high-efficiency and site-directed mutagenesis, so retrofitting of artificial nucleases has developed into a new avenue within this field. Based on their mechanisms for recognizing target genes, newly developed GETs can generally be subdivided into three cleavage systems: protein-dependent DNA cleavage systems (i.e., zinc-finger nucleases, ZFN, and transcription activator-like effector nucleases, TALEN), RNA-dependent DNA cleavage systems (i.e., clustered regularly interspaced short palindromic repeats-CRISPR associated proteins, CRISPR-Cas9, CRISPR-Cpf1, and CRISPR-C2c1), and RNA-dependent RNA cleavage systems (i.e., RNA interference, RNAi, and CRISPR-C2c2). All these techniques can lead to double-stranded (DSB) or single-stranded breaks (SSB), and result in either random mutations via non-homologous end-joining (NHEJ) or targeted mutation via homologous recombination (HR). Thus, site-directed mutagenesis can be induced via targeted gene knock-out, knock-in, or replacement to modify specific characteristics including morphology-modification, resistance-enhancement, and physiological mechanism-improvement along with plant growth and development. In this paper, a non-comprehensive review on the development of different GETs as applied to plants is presented. PMID:28261237

  10. Progress in Genome Editing Technology and Its Application in Plants.

    Science.gov (United States)

    Zhang, Kai; Raboanatahiry, Nadia; Zhu, Bin; Li, Maoteng

    2017-01-01

    Genome editing technology (GET) is a versatile approach that has progressed rapidly as a mechanism to alter the genotype and phenotype of organisms. However, conventional genome modification using GET cannot satisfy current demand for high-efficiency and site-directed mutagenesis, so retrofitting of artificial nucleases has developed into a new avenue within this field. Based on their mechanisms for recognizing target genes, newly developed GETs can generally be subdivided into three cleavage systems: protein-dependent DNA cleavage systems (i.e., zinc-finger nucleases, ZFN, and transcription activator-like effector nucleases, TALEN), RNA-dependent DNA cleavage systems (i.e., clustered regularly interspaced short palindromic repeats-CRISPR associated proteins, CRISPR-Cas9, CRISPR-Cpf1, and CRISPR-C2c1), and RNA-dependent RNA cleavage systems (i.e., RNA interference, RNAi, and CRISPR-C2c2). All these techniques can lead to double-stranded (DSB) or single-stranded breaks (SSB), and result in either random mutations via non-homologous end-joining (NHEJ) or targeted mutation via homologous recombination (HR). Thus, site-directed mutagenesis can be induced via targeted gene knock-out, knock-in, or replacement to modify specific characteristics including morphology-modification, resistance-enhancement, and physiological mechanism-improvement along with plant growth and development. In this paper, a non-comprehensive review on the development of different GETs as applied to plants is presented.

  11. Convergence of advances in genomics, team science, and repositories as drivers of progress in psychiatric genomics.

    Science.gov (United States)

    Lehner, Thomas; Senthil, Geetha; Addington, Anjené M

    2015-01-01

    After many years of unfilled promise, psychiatric genetics has seen an unprecedented number of successes in recent years. We hypothesize that the field has reached an inflection point through a confluence of four key developments: advances in genomics; the orientation of the scientific community around large collaborative team science projects; the development of sample and data repositories; and a policy framework for sharing and accessing these resources. We discuss these domains and their effect on scientific progress and provide a perspective on why we think this is only the beginning of a new era in scientific discovery.

  12. Coronavirus Genomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Kwok-Yung Yuen

    2010-08-01

    Full Text Available The drastic increase in the number of coronaviruses discovered and coronavirus genomes being sequenced has given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb) among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid) and downstream of the nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera, Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D, exist. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci revealed the time of the most recent common ancestor of human/civet SARS-related coronavirus to be 1999-2002, with an estimated substitution rate of 4 × 10⁻⁴ to 2 × 10⁻² substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV), between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV) type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1). Codon usage bias in coronaviruses was observed, with HCoV-HKU1 showing the most extreme bias, and cytosine deamination and selection of CpG-suppressed clones are the two major independent biological forces that shape such codon usage bias in coronaviruses.
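
    The G + C content quoted in the abstract above is a simple per-genome statistic. As a minimal sketch (the function name `gc_content` and the toy sequences are illustrative assumptions, not taken from the cited study), it could be computed like this in Python:

```python
def gc_content(seq: str) -> float:
    """Return the G + C percentage of a nucleotide sequence.

    Assumption: only unambiguous bases (A, C, G, T, U) are counted;
    any other symbol, such as N or a gap, is ignored.
    """
    seq = seq.upper()
    valid = sum(seq.count(base) for base in "ACGTU")
    if valid == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / valid


# Toy sequences for illustration only; real coronavirus genomes are ~26-32 kb.
print(gc_content("ATGC"))
print(gc_content("AUGGCCNN"))
```

    Counting U as well as T lets the same helper accept RNA-style sequences; whether ambiguous bases should be excluded is a design choice that the abstract does not specify.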

  13. Electric fish genomics: Progress, prospects, and new tools for neuroethology.

    Science.gov (United States)

    Pitchers, William R; Constantinou, Savvas J; Losilla, Mauricio; Gallant, Jason R

    2016-10-01

    Electric fish have served as a model system in biology since the 18th century, providing deep insight into the nature of bioelectrogenesis, the molecular structure of the synapse, and brain circuitry underlying complex behavior. Neuroethologists have collected extensive phenotypic data that span biological levels of analysis from molecules to ecosystems. This phenotypic data, together with genomic resources obtained over the past decades, have motivated new and exciting hypotheses that position the weakly electric fish model to address fundamental 21st-century biological questions. This review article considers the molecular data collected for weakly electric fish over the past three decades, and the insights that data of this nature has motivated. For readers relatively new to molecular genetics techniques, we also provide a table of terminology aimed at clarifying the numerous acronyms and techniques that accompany this field. Next, we pose a research agenda for expanding genomic resources for electric fish research over the next 10 years. We conclude by considering some of the exciting research prospects for neuroethology that electric fish genomics may offer over the coming decades, if the electric fish community is successful in these endeavors. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. 78 FR 47674 - Genome in a Bottle Consortium-Progress and Planning Workshop

    Science.gov (United States)

    2013-08-06

    ... National Institute of Standards and Technology Genome in a Bottle Consortium--Progress and Planning... workshop. SUMMARY: NIST announces the Genome in a Bottle Consortium meeting to be held on Thursday and Friday, August 15 and 16, 2013. The Genome in a Bottle Consortium is developing the reference...

  15. Genome-wide association and linkage analyses localize a progressive retinal atrophy locus in Persian cats.

    Science.gov (United States)

    Alhaddad, Hasan; Gandolfi, Barbara; Grahn, Robert A; Rah, Hyung-Chul; Peterson, Carlyn B; Maggs, David J; Good, Kathryn L; Pedersen, Niels C; Lyons, Leslie A

    2014-08-01

    Hereditary eye diseases of animals serve as excellent models of human ocular disorders and assist in the development of gene and drug therapies for inherited forms of blindness. Several primary hereditary eye conditions affecting various ocular tissues and having different rates of progression have been documented in domestic cats. Gene therapy for canine retinopathies has been successful, thus the cat could be a gene therapy candidate for other forms of retinal degenerations. The current study investigates a hereditary, autosomal recessive, retinal degeneration specific to Persian cats. A multi-generational pedigree segregating for this progressive retinal atrophy was genotyped using a 63 K SNP array and analyzed via genome-wide linkage and association methods. A multi-point parametric linkage analysis localized the blindness phenotype to a ~1.75 Mb region with significant LOD scores (Z ≈ 14, θ = 0.00) on cat chromosome E1. Genome-wide TDT, sib-TDT, and case-control analyses also consistently supported significant association within the same region on chromosome E1, which is homologous to human chromosome 17. Using haplotype analysis, a ~1.3 Mb region was identified as highly associated for progressive retinal atrophy in Persian cats. Several candidate genes within the region are reasonable candidates as a potential causative gene and should be considered for molecular analyses.

  16. A novel statistic for genome-wide interaction analysis.

    Science.gov (United States)

    Wu, Xuesen; Dong, Hua; Luo, Li; Zhu, Yun; Peng, Gang; Reveille, John D; Xiong, Momiao

    2010-09-23

    Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001 [...] genome-wide interaction analysis is a valuable tool for finding remaining missing heritability unexplained by the current GWAS, and the developed novel statistic is able to search significant interaction between SNPs across the genome. Real data analysis showed that the results of genome-wide interaction analysis can be replicated in two independent studies.

  17. Classifying Genomic Sequences by Sequence Feature Analysis

    Institute of Scientific and Technical Information of China (English)

    Zhi-Hua Liu; Dian Jiao; Xiao Sun

    2005-01-01

    Traditional sequence analysis depends on sequence alignment. In this study, we analyzed various functional regions of the human genome based on sequence features, including word frequency, dinucleotide relative abundance, and base-base correlation. We analyzed human chromosome 22 and classified the upstream, exon, intron, downstream, and intergenic regions by principal component analysis and discriminant analysis of these features. The results show that we could classify the functional regions of the genome based on sequence features and discriminant analysis.
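
    One of the sequence features named in the abstract above, dinucleotide relative abundance, is commonly defined as the observed dinucleotide frequency divided by the product of the two single-base frequencies. The following Python sketch illustrates that definition; the function name and the single-strand (non-symmetrized) calculation are assumptions made for illustration and may differ from the exact formulation used in the cited study.

```python
from collections import Counter


def dinucleotide_relative_abundance(seq: str) -> dict:
    """Return rho[xy] = f(xy) / (f(x) * f(y)) for each observed dinucleotide.

    Assumption: frequencies are taken from the given strand only and
    dinucleotides are counted with overlap (window step of one base).
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    total_pairs = n - 1
    return {
        pair: (count / total_pairs) / (base_freq[pair[0]] * base_freq[pair[1]])
        for pair, count in pair_counts.items()
    }


# Toy sequence for illustration; values near 1 indicate no over- or under-representation.
rho = dinucleotide_relative_abundance("ACGTACGTAACC")
for pair in sorted(rho):
    print(pair, round(rho[pair], 3))
```

    The resulting vector of ratios (one value per dinucleotide) is the kind of feature that could then be passed to principal component or discriminant analysis.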

  18. Comparative genomic analysis of esophageal cancers.

    Science.gov (United States)

    Caygill, Christine P J; Gatenby, Piers A C; Herceg, Zdenko; Lima, Sheila C S; Pinto, Luis F R; Watson, Anthony; Wu, Ming-Shiang

    2014-09-01

    The following, from the 12th OESO World Conference: Cancers of the Esophagus, includes commentaries on comparative genomic analysis of esophageal cancers: genomic polymorphisms, the genetic and epigenetic drivers in esophageal cancers, and the collection of data in the UK Barrett's Oesophagus Registry.

  19. Structural and functional analysis of rice genome

    Indian Academy of Sciences (India)

    Akhilesh K. Tyagi; Jitendra P. Khurana; Paramjit Khurana; Saurabh Raghuvanshi; Anupama Gaur; Anita Kapur; Vikrant Gupta; Dibyendu Kumar; V. Ravi; Shubha Vij; Parul Khurana; Sulabha Sharma

    2004-04-01

    Rice is an excellent system for plant genomics as it represents a modest size genome of 430 Mb. It feeds more than half the population of the world. Draft sequences of the rice genome, derived by whole-genome shotgun approach at relatively low coverage (4–6 X), were published and the International Rice Genome Sequencing Project (IRGSP) declared high quality (>10 X), genetically anchored, phase 2 level sequence in 2002. In addition, phase 3 level finished sequence of chromosomes 1, 4 and 10 (out of 12 chromosomes of rice) has already been reported by scientists from IRGSP consortium. Various estimates of genes in rice place the number at > 50,000. Already, over 28,000 full-length cDNAs have been sequenced, most of which map to genetically anchored genome sequence. Such information is very useful in revealing novel features of macro- and micro-level synteny of rice genome with other cereals. Microarray analysis is unraveling the identity of rice genes expressing in temporal and spatial manner and should help target candidate genes useful for improving traits of agronomic importance. Simultaneously, functional analysis of rice genome has been initiated by marker-based characterization of useful genes and employing functional knock-outs created by mutation or gene tagging. Integration of this enormous information is expected to catalyze tremendous activity on basic and applied aspects of rice genomics.

  20. Genome sequence and analysis of Lactobacillus helveticus

    Directory of Open Access Journals (Sweden)

    Paola eCremonesi

    2013-01-01

    Full Text Available The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of L. helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genome of five L. helveticus strains was sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones.

  1. BioMet Toolbox: genome-wide analysis of metabolism

    OpenAIRE

    Cvijovic, M.; R. Olivares-Hernandez; Agren, R.; Dahr, N.; Vongsangnak, W.; Nookaew, I.; K. R. Patil; Nielsen, J.

    2010-01-01

    The rapid progress of molecular biology tools for directed genetic modifications, accurate quantitative experimental approaches, high-throughput measurements, together with development of genome sequencing has made the foundation for a new area of metabolic engineering that is driven by metabolic models. Systematic analysis of biological processes by means of modelling and simulations has made the identification of metabolic networks and prediction of metabolic capabilities under different co...

  2. A novel statistic for genome-wide interaction analysis.

    Directory of Open Access Journals (Sweden)

    Xuesen Wu

    2010-09-01

    Full Text Available Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001 [...] genome-wide interaction analysis is a valuable tool for finding remaining missing heritability unexplained by the current GWAS, and the developed novel statistic is able to search significant interaction between SNPs across the genome. Real data analysis showed that the results of genome-wide interaction analysis can be replicated in two independent studies.

  3. Whole genome analysis of a Vietnamese trio

    Indian Academy of Sciences (India)

    Dang Thanh Hai; Nguyen Dai Thanh; Pham Thi Minh Trang; Le Si Quang; Phan Thi Thu Hang; Dang Cao Cuong; Hoang Kim Phuc; Nguyen Huu Duc; Do Duc Dong; Bui Quang Minh; Pham Bao Son; Le Sy Vinh

    2015-03-01

    We here present the first whole genome analysis of an anonymous Kinh Vietnamese (KHV) trio whose genomes were deeply sequenced to 30-fold average coverage. The resulting short reads covered 99.91% of the human reference genome (GRCh37d5). We identified 4,719,412 SNPs and 827,385 short indels that satisfied the Mendelian inheritance law. Among them, 109,914 (2.3%) SNPs and 59,119 (7.1%) short indels were novel. We also detected 30,171 structural variants of which 27,604 (91.5%) were large indels. There were 6,681 large indels in the range 0.1–100 kbp occurring in the child genome that were also confirmed in either the father or mother genome. We compared these large indels against the DGV database and found that 1,499 (22.44%) were KHV specific. De novo assembly of high-quality unmapped reads yielded 789 contigs with the length ≥ 300 bp. There were 235 contigs from the child genome of which 199 (84.7%) were significantly matched with at least one contig from the father or mother genome. Blasting these 199 contigs against other alternative human genomes revealed 4 novel contigs. The novel variants identified from our study demonstrated the necessity of conducting more genome-wide studies not only for Kinh but also for other ethnic groups in Vietnam.
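
    The abstract above keeps only variants that satisfy Mendelian inheritance in the trio. A minimal sketch of such a consistency check is given below; it assumes diploid autosomal genotypes represented as pairs of alleles, ignores de novo mutations and sex chromosomes, and uses a hypothetical function name, so it should not be read as the pipeline the authors actually used.

```python
from itertools import product


def is_mendelian_consistent(child, father, mother):
    """Return True if the child's genotype can be formed by taking
    one allele from the father and one allele from the mother.

    Each genotype is a pair of alleles, e.g. ("A", "G").
    Assumption: diploid autosomal site, no de novo mutation.
    """
    child_sorted = sorted(child)
    return any(
        sorted((paternal, maternal)) == child_sorted
        for paternal, maternal in product(father, mother)
    )


# Illustrative genotypes only.
print(is_mendelian_consistent(("A", "G"), ("A", "A"), ("G", "G")))  # consistent
print(is_mendelian_consistent(("G", "G"), ("A", "A"), ("G", "G")))  # inconsistent
```

    In practice, sites failing such a check are often sequencing or genotyping errors, which is why trio data are useful for filtering variant calls.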

  4. Research progress of genome editing and derivative technologies in plants.

    Science.gov (United States)

    Qiwei, Shan; Caixia, Gao

    2015-10-01

    Genome editing technologies using engineered nucleases have been widely used in many model organisms. Genome editing with sequence-specific nucleases (SSNs) creates DNA double-strand breaks (DSBs) in the genomic target sites that are primarily repaired by the non-homologous end joining (NHEJ) or homologous recombination (HR) pathways, which can be employed to achieve targeted genome modifications such as gene mutations, insertions, replacements or chromosome rearrangements. There are three major SSNs: zinc finger nuclease (ZFN), transcription activator-like effector nuclease (TALEN) and the clustered regularly interspaced short palindromic repeats/CRISPR-associated 9 (CRISPR/Cas9) system. In contrast to ZFN and TALEN, which require substantial protein engineering for each DNA target, the CRISPR/Cas9 system requires only a change in the guide RNA. For this reason, the CRISPR/Cas9 system is a simple, inexpensive and versatile tool for genome engineering. Furthermore, a modified version of the CRISPR/Cas9 system has been developed to recruit heterologous domains that can regulate endogenous gene expression, such as activation, repression and epigenetic regulation. In this review, we summarize the development and applications of genome editing technologies for basic research and biotechnology, as well as highlight challenges and future directions, with particular emphasis on plants.

  5. Comparative genomic analysis of eutherian kallikrein genes

    Directory of Open Access Journals (Sweden)

    Marko Premzl

    2017-03-01

    Full Text Available The present study attempted to update and revise the eutherian kallikrein genes implicated in major physiological and pathological processes and in medical molecular diagnostics. Using the eutherian comparative genomic analysis protocol and freely available genomic sequence assemblies, tests of reliability of eutherian public genomic sequences were used to annotate the most comprehensive curated third-party data gene data set of eutherian kallikrein genes, including 121 complete coding sequences among 335 potential coding sequences. The present analysis first described 13 major gene clusters of eutherian kallikrein genes, and explained their differential gene expansion patterns. An updated classification and nomenclature of eutherian kallikrein genes was proposed as a new framework for future experiments.

  6. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-25

    Since the human genome draft sequence was made public for the first time in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples of large-scale studies of human genome variations: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 genomes Data (2,504 individuals) (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). If we can integrate all three data sets into a single volume of data, we should be able to conduct a more detailed analysis of human genome variations for a total number of 4,861 individuals (= 1,417+940+2,504 individuals). In fact, we successfully integrated these three data sets by use of information on the reference human genome sequence, and we conducted the big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that could not be identified by an analysis of each of the three data sets alone. Here, we report the outcome of this kind of big data analysis and discuss the evolutionary significance of human genomic variations. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.

  7. Recurrent genomic alterations in sequential progressive leukoplakia and oral cancer: drivers of oral tumorigenesis?

    Science.gov (United States)

    Cervigne, Nilva K; Machado, Jerry; Goswami, Rashmi S; Sadikovic, Bekim; Bradley, Grace; Perez-Ordonez, Bayardo; Galloni, Natalie Naranjo; Gilbert, Ralph; Gullane, Patrick; Irish, Jonathan C; Jurisica, Igor; Reis, Patricia P; Kamel-Reid, Suzanne

    2014-05-15

    A significant proportion (up to 62%) of oral squamous cell carcinomas (OSCCs) may arise from oral potentially malignant lesions (OPMLs), such as leukoplakia. Patient outcomes may thus be improved through detection of lesions at risk for malignant transformation, by identifying and categorizing genetic changes in sequential, progressive OPMLs. We conducted array comparative genomic hybridization analysis of 25 sequential, progressive OPMLs and same-site OSCCs from five patients. Recurrent DNA copy number gains were identified on 1p in 20/25 cases (80%) with minimal, high-level amplification regions on 1p35 and 1p36. Other regions of gains were frequently observed: 11q13.4 (68%), 9q34.13 (64%), 21q22.3 (60%), 6p21 and 6q25 (56%) and 10q24, 19q13.2, 22q12, 5q31.2, 7p13, 10q24 and 14q22 (48%). DNA losses were observed in >20% of samples and mainly detected on 5q31.2 (35%), 16p13.2 (30%), 9q33.1 and 9q33.29 (25%) and 17q11.2, 3p26.2, 18q21.1, 4q34.1 and 8p23.2 (20%). Such copy number alterations (CNAs) were mapped in all grades of dysplasia that progressed, and their corresponding OSCCs, in 70% of patients, indicating that these CNAs may be associated with disease progression. Amplified genes mapping within recurrent CNAs (KHDRBS1, PARP1, RAB1A, HBEGF, PAIP2, BTBD7) were selected for validation, by quantitative real-time PCR, in an independent set of 32 progressive leukoplakia, 32 OSCCs and 21 non-progressive leukoplakia samples. Amplification of BTBD7, KHDRBS1, PARP1 and RAB1A was exclusively detected in progressive leukoplakia and corresponding OSCC. BTBD7, KHDRBS1, PARP1 and RAB1A may be associated with OSCC progression. Protein-protein interaction networks were created to identify possible pathways associated with OSCC progression.

  8. From genome to proteome: great progress in the domesticated silkworm (Bombyx mori L.)

    Institute of Scientific and Technical Information of China (English)

    Zhonghua Zhou; Huijuan Yang; Boxiong Zhong

    2008-01-01

    As the only truly domesticated insect, the silkworm not only has great economic value, but it also has value as a model for genetics and molecular biology research. Genomics and proteomics have recently shown vast potential to be essential tools in domesticated silkworm research, especially after the completion of the Bombyx mori genome sequence. This paper reviews the progress of the domesticated silkworm genome, particularly focusing on its genetic map, physical map and functional genome. This review also presents proteomics, the proteomic technique and its application in silkworm research.

  9. DNA sequencing leads to genomics progress in China

    Institute of Scientific and Technical Information of China (English)

    WU JiaYan; XIAO JingFa; ZHANG RuoSi; YU Jun

    2011-01-01

    1 Science in the large-scale sequencing era Ten years ago, the first draft sequence assembly of the human genome was completed [1], bringing biomedical research one step closer toward the goal of revolutionizing diagnosis, prevention, and treatment of human diseases. Recently, journalists from the journal Nature surveyed more than 1000 life scientists regarding this laudable aim [2], obtaining substantially negative responses [3]. However, almost all of those surveyed had been influenced, in one way or another, by the availability of the human genome sequence, and they also agreed with the notion that the "sequence is the start." The complexity of genome biology and almost every aspect of human biology is far greater than previously thought [4].

  10. Comparative genomics of Bordetella pertussis reveals progressive gene loss in Finnish strains.

    Directory of Open Access Journals (Sweden)

    Eriikka Heikkinen

    Full Text Available BACKGROUND: Bordetella pertussis is a gram-negative bacterium that infects the human respiratory tract and causes pertussis or whooping cough. The disease has resurged in many countries including Finland, where the whole-cell pertussis vaccine has been used for more than 50 years. Antigenic divergence has been observed between vaccine strains and clinical isolates in Finland. To better understand genome evolution in B. pertussis circulating in the immunized population, we developed an oligonucleotide-based microarray for comparative genomic analysis of Finnish strains isolated over a period of 50 years. METHODOLOGY/PRINCIPAL FINDINGS: The microarray consisted of 3,582 oligonucleotides (70-mer) and covered 94% of 3,816 ORFs of Tohama I, the strain whose genome has been sequenced. Twenty isolates from 1953 to 2004 were studied together with two Finnish vaccine strains and two international reference strains. The isolates were selected according to their characteristics, e.g. the year and place of isolation and pulsed-field gel electrophoresis profiles. Genomic DNA of the tested strains, along with reference DNA of the Tohama I strain, was labelled and hybridized. The absence of genes as established with microarrays was confirmed by PCR. Compared with the Tohama I strain, Finnish isolates lost 7 (8.6 kb) to 49 (55.3 kb) genes, clustered in one to four distinct loci. The number of lost genes increased with time, and one third of lost genes had functions related to inorganic ion transport and metabolism, or energy production and conversion. All four loci of lost genes were flanked by the insertion sequence element IS481. CONCLUSION/SIGNIFICANCE: Our results showed that progressive gene loss occurred in Finnish B. pertussis strains isolated during a period of 50 years and confirmed that B. pertussis is dynamic and is continuously evolving, suggesting that the bacterium may use gene loss as one strategy to adapt to highly immunized populations.

  11. Mathematical Analysis of Genomic Evolution

    Directory of Open Access Journals (Sweden)

    Cedric Green

    2011-01-01

    Full Text Available Changes in nucleotide sequences, or mutations, accumulate from generation to generation in the genomes of all living organisms. The mutations can be advantageous, deleterious, or neutral. The goal of this project is to determine the number of advantageous mutations it takes to get human (Homo sapiens) DNA from the DNA of genetically distinct organisms. We do this by collecting the genomic data of such organisms, and estimating the number of mutations it takes to transform yeast (Saccharomyces cerevisiae) DNA into the DNA of a human. We calculate the typical number of mutations occurring annually from the organism's average life span and the average mutation rate. This allows us to determine the total number of mutations as well as the probability of advantageous mutations. Not surprisingly, this probability proves to be fairly small. A more precise estimate can be determined by accounting for the differences in the chromosomal structure and phenomena like horizontal gene transfer.

  12. Research progress of plant population genomics based on high-throughput sequencing.

    Science.gov (United States)

    Yunsheng, Wang

    2016-08-01

    Population genomics, a new paradigm for population genetics, combines the concepts and techniques of genomics with the theoretical system of population genetics and improves our understanding of microevolution through identification of site-specific effects and genome-wide effects using genome-wide genotyping of polymorphic sites. With the appearance and improvement of next-generation high-throughput sequencing technology, the number of plant species with complete genome sequences has increased rapidly, and large-scale resequencing has also been carried out in recent years. Parallel sequencing has also been done in some plant species without complete genome sequences. These studies have greatly promoted the development of population genomics and deepened our understanding of the genetic diversity, level of linkage disequilibrium, selection effect, demographic history and molecular mechanism of complex traits of relevant plant populations at a genomic level. In this review, I briefly introduced the concept and research methods of population genomics and summarized the research progress of plant population genomics based on high-throughput sequencing. I also discussed the prospects as well as existing problems of plant population genomics in order to provide references for related studies.

  13. Research progress in genomics of environmental and industrial microorganisms

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    Microbes contribute to geochemical cycles in the ecosystem. They also play important roles in biodegradation and bioremediation of contaminated environments, and have great potential in energy conversion and regeneration. Up to date, at least 150 genomes of non-pathogenic microbes have been sequenced, of which, the majority are bacteria from various environments or of industrial uses. The emerging field 'metagenomics' in combination with the high-throughput sequencing technology offers opportunities to discover new functions of microbes in the environment on a large scale, and has become the 'hot spot' in the field of environmental microbiology. Seven genomes of bacteria from various extreme environments, including high temperature, high and low pressure, and extreme acidic regions, have been sequenced by researchers in China, leading to the discovery of metabolic pathways, genetic functions and new enzymes, which are related to the niches those bacteria occupy. These results were published in Nature, PNAS, Genome Research and other top international journals. In the meantime, several groups in China have started 'metagenomics' programs. The outcomes of these researches are expected to generate a considerable number of novel findings, taking Chinese researchers to the frontier of genomics for environmental and industrial microorganisms.

  14. Research progress in genomics of environmental and industrial microorganisms

    Institute of Scientific and Technical Information of China (English)

    WANG Lei; LIU Bin; ZHOU ZheMin

    2009-01-01

    Microbes contribute to geochemical cycles in the ecosystem. They also play important roles in biodegradation and bioremediation of contaminated environments, and have great potential in energy conversion and regeneration. Up to date, at least 150 genomes of non-pathogenic microbes have been sequenced, of which, the majority are bacteria from various environments or of industrial uses. The emerging field 'metagenomics' in combination with the high-throughput sequencing technology offers opportunities to discover new functions of microbes in the environment on a large scale, and has become the 'hot spot' in the field of environmental microbiology. Seven genomes of bacteria from various extreme environments, including high temperature, high and low pressure, and extreme acidic regions, have been sequenced by researchers in China, leading to the discovery of metabolic pathways, genetic functions and new enzymes, which are related to the niches those bacteria occupy. These results were published in Nature, PNAS, Genome Research and other top international journals. In the meantime, several groups in China have started 'metagenomics' programs. The outcomes of these researches are expected to generate a considerable number of novel findings, taking Chinese researchers to the frontier of genomics for environmental and industrial microorganisms.

  15. Comparative genomic analysis of sixty mycobacteriophage genomes: Genome clustering, gene acquisition and gene size

    Science.gov (United States)

    Hatfull, Graham F.; Jacobs-Sera, Deborah; Lawrence, Jeffrey G.; Pope, Welkin H.; Russell, Daniel A.; Ko, Ching-Chung; Weber, Rebecca J.; Patel, Manisha C.; Germane, Katherine L.; Edgar, Robert H.; Hoyte, Natasha N.; Bowman, Charles A.; Tantoco, Anthony T.; Paladin, Elizabeth C.; Myers, Marlana S.; Smith, Alexis L.; Grace, Molly S.; Pham, Thuy T.; O'Brien, Matthew B.; Vogelsberger, Amy M.; Hryckowian, Andrew J.; Wynalek, Jessica L.; Donis-Keller, Helen; Bogel, Matt W.; Peebles, Craig L.; Cresawn, Steve G.; Hendrix, Roger W.

    2010-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts. Expansion of a collection of sequenced phage genomes to a total of sixty – all infecting a common bacterial host – provides further insight into their diversity and evolution. Of the sixty phage genomes, 55 can be grouped into nine clusters according to their nucleotide sequence similarities, five of which can be further divided into subclusters; five genomes do not cluster with other phages. The sequence diversity between genomes within a cluster varies greatly; for example, the six genomes in cluster D share more than 97.5% average nucleotide similarity with each other. In contrast, similarity between the two genomes in Cluster I is barely detectable by diagonal plot analysis. The total of 6,858 predicted ORFs have been grouped into 1523 phamilies (phams) of related sequences, 46% of which possess only a single member. Only 18.8% of the phams have sequence similarity to non-mycobacteriophage database entries and fewer than 10% of all phams can be assigned functions based on database searching or synteny. Genome clustering facilitates the identification of genes that are in greatest genetic flux and are more likely to have been exchanged horizontally in relatively recent evolutionary time. Although mycobacteriophage genes exhibit smaller average size than genes of their host (205 residues compared to 315), phage genes in higher flux average only ∼100 amino acids, suggesting that the primary units of genetic exchange correspond to single protein domains. PMID:20064525

  16. A Distance Measure for Genome Phylogenetic Analysis

    Science.gov (United States)

    Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

    Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures, such as evolutionary edit distance of gene order and gene content, are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
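
    The general idea of a compression-based distance can be illustrated with a minimal sketch. This is not the expert model used in the study: it substitutes the general-purpose zlib compressor and the well-known normalized compression distance (NCD), and the sequences, the `mutate` helper and all other names below are hypothetical, generated only for illustration.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def mutate(seq: str, step: int) -> str:
    """Hypothetical helper: substitute every `step`-th base with a different one."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for i in range(0, len(chars), step):
        chars[i] = swap[chars[i]]
    return "".join(chars)


# Toy data: one random sequence, a lightly mutated copy, and an unrelated sequence.
rng = random.Random(0)
ancestor = "".join(rng.choice("ACGT") for _ in range(2000))
relative = mutate(ancestor, 200)
unrelated = "".join(rng.choice("ACGT") for _ in range(2000))

d_close = ncd(ancestor, relative)   # shared content compresses well together
d_far = ncd(ancestor, unrelated)    # little shared content to exploit
print(f"close pair: {d_close:.3f}, unrelated pair: {d_far:.3f}")
```

    A matrix of such pairwise distances could then be passed to any distance-based tree construction method. A compressor designed for DNA, such as the one the authors describe, would be expected to separate real genomes far better than zlib does.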

  17. Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.

  18. ETS-Associated Genomic Alterations including ETS2 Loss Markedly Affect Prostate Cancer Progression

    Science.gov (United States)

    2015-10-01

    AWARD NUMBER: W81XWH-13-1-0385 TITLE: ETS -Associated Genomic Alterations including ETS2 Loss Markedly Affect Prostate Cancer Progression...29 Sep 2015 4. TITLE AND SUBTITLE 5a. CONTRACT NUMBER W81XWH-13-1-0385 ETS -Associated Genomic Alterations including ETS2 Loss Markedly Affect...upregulation of ERG, a transcription factor with oncogenic roles in other cancers such as leukemias and sarcomas (Tomlins, Rhodes et al. 2005; Turner

  19. Appearance traits in fish farming: progress from classical genetics to genomics, providing insight into current and potential genetic improvement

    Science.gov (United States)

    Colihueque, Nelson; Araneda, Cristian

    2014-01-01

    Appearance traits in fish, those external body characteristics that influence consumer acceptance at point of sale, have come to the forefront of commercial fish farming, as culture profitability is closely linked to management of these traits. Appearance traits comprise mainly body shape and skin pigmentation. Analysis of the genetic basis of these traits in different fish reveals significant genetic variation within populations, indicating potential for their genetic improvement. Work into ascertaining the minor or major genes underlying appearance traits for commercial fish is emerging, with substantial progress in model fish in terms of identifying genes that control body shape and skin colors. In this review, we describe research progress to date, especially with regard to commercial fish, and discuss genomic findings in model fish in order to better address the genetic basis of the traits. Given that appearance traits are important in commercial fish, the genomic information related to this issue promises to accelerate the selection process in coming years. PMID:25140172

  20. Whole genome sequence analysis of Mycobacterium suricattae

    KAUST Repository

    Dippenaar, Anzaan

    2015-10-21

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  1. AcCNET (Accessory Genome Constellation Network): comparative genomics software for accessory genome analysis using bipartite networks.

    Science.gov (United States)

    Lanza, Val F; Baquero, Fernando; de la Cruz, Fernando; Coque, Teresa M

    2017-01-15

    AcCNET (Accessory genome Constellation Network) is a Perl application that aims to compare accessory genomes of a large number of genomic units, both at qualitative and quantitative levels. Using the proteomes extracted from the analysed genomes, AcCNET creates a bipartite network compatible with standard network analysis platforms. AcCNET allows merging phylogenetic and functional information about the concerned genomes, thus improving the capability of current methods of network analysis. The AcCNET bipartite network opens a new perspective to explore the pangenome of bacterial species, focusing on the accessory genome behind the idiosyncrasy of a particular strain and/or population.
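
    The bipartite representation described above can be sketched in a few lines. This is only an illustration of the concept, not AcCNET's algorithm or file format (AcCNET itself is a Perl application): the strain names, cluster IDs and the `accessory_edges` function are hypothetical, and protein clustering is assumed to have been done already.

```python
from collections import defaultdict

# Hypothetical input: each genome mapped to the protein clusters it contains.
genome_clusters = {
    "strainA": {"c1", "c2", "c3"},
    "strainB": {"c1", "c2", "c4"},
    "strainC": {"c1", "c3", "c4", "c5"},
}


def accessory_edges(genome_clusters):
    """Return sorted (genome, cluster) edges, keeping only accessory clusters.

    A cluster counts as accessory here if it is absent from at least one
    genome; clusters present in every genome (the core) are dropped.
    """
    n_genomes = len(genome_clusters)
    counts = defaultdict(int)
    for clusters in genome_clusters.values():
        for cluster in clusters:
            counts[cluster] += 1
    accessory = {c for c, k in counts.items() if k < n_genomes}
    return sorted(
        (genome, cluster)
        for genome, clusters in genome_clusters.items()
        for cluster in clusters
        if cluster in accessory
    )


edges = accessory_edges(genome_clusters)
for genome, cluster in edges:
    # Tab-separated edge list, a format most network tools can import.
    print(f"{genome}\t{cluster}")
```

    In this toy input, cluster c1 occurs in all three strains and is therefore excluded, while c5 links only to strainC, which is the kind of strain-specific signal an accessory-genome network is meant to expose.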

  2. Molecular genetics and genomics progress in urothelial bladder cancer.

    Science.gov (United States)

    Netto, George J

    2013-11-01

    The clinical management of solid tumor patients has recently undergone a paradigm shift as the result of the accelerated advances in cancer genetics and genomics. Molecular diagnostics is now an integral part of routine clinical management in lung, colon, and breast cancer patients. In a disappointing contrast, molecular biomarkers remain largely excluded from current management algorithms of urologic malignancies. The need for new treatment alternatives and validated prognostic molecular biomarkers that can help clinicians identify patients in need of early aggressive management is pressing. Identifying robust predictive biomarkers that can stratify response to newly introduced targeted therapeutics is another crucially needed development. The following is a brief discussion of some promising candidate biomarkers that may soon become a part of clinical management of bladder cancers.

  3. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Giltae Song

    Full Text Available The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.

  4. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Science.gov (United States)

    Song, Giltae; Dickins, Benjamin J A; Demeter, Janos; Engel, Stacia; Gallagher, Jennifer; Choe, Kisurb; Dunn, Barbara; Snyder, Michael; Cherry, J Michael

    2015-01-01

    The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.

  5. [Visual field progression in glaucoma: cluster analysis].

    Science.gov (United States)

    Bresson-Dumont, H; Hatton, J; Foucher, J; Fonteneau, M

    2012-11-01

    Visual field progression analysis is one of the key points in glaucoma monitoring, but distinction between true progression and random fluctuation is sometimes difficult. There are several different algorithms but no real consensus for detecting visual field progression. The trend analysis of global indices (MD, sLV) may miss localized deficits or be affected by media opacities. Conversely, point-by-point analysis makes progression difficult to differentiate from physiological variability, particularly when the sensitivity of a point is already low. The goal of our study was to analyse visual field progression with the EyeSuite™ Octopus Perimetry Clusters algorithm in patients with no significant changes in global indices or worsening in the pointwise linear regression analysis. We analyzed the visual fields of 162 eyes (100 patients - 58 women, 42 men, average age 66.8 ± 10.91) with ocular hypertension or glaucoma. For inclusion, at least six reliable visual fields per eye were required, and the trend analysis (EyeSuite™ Perimetry) of visual field global indices (MD and sLV) had to show no significant progression. The analysis of changes in cluster mode was then performed. In a second step, eyes with statistically significant worsening of at least one of their clusters were analyzed point-by-point with the Octopus Field Analysis (OFA). Fifty-four eyes (33.33%) had a significant worsening in some clusters, while their global indices remained stable over time. In this group of patients, more advanced glaucoma was present than in the stable group (MD 6.41 dB vs. 2.87 dB); 64.82% (35/54) of those eyes in which the clusters progressed, however, had no statistically significant change in the trend analysis by pointwise linear regression. Most software algorithms for analyzing visual field progression are essentially trend analyses of global indices, or point-by-point linear regression. This study shows the potential role of cluster trend analysis. However, for best

  6. Genome-wide gene expression profiling of testicular carcinoma in situ progression into overt tumours

    DEFF Research Database (Denmark)

    Almstrup, K; Hoei-Hansen, C E; Nielsen, J E

    2005-01-01

    into CIS occurs early during foetal life. Progression into an overt tumour, however, typically first happens after puberty, where CIS cells transform into either a seminoma (SEM) or a nonseminoma (N-SEM). Here, we have compared the genome-wide gene expression of CIS cells to that of testicular SEM...

  7. Technology-Driven and Evidence-Based Genomic Analysis for Integrated Pediatric and Prenatal Genetics Evaluation

    Institute of Scientific and Technical Information of China (English)

    Yuan Wei; Fang Xu; Peining Li

    2013-01-01

    The first decade since the completion of the Human Genome Project has been marked with rapid development of genomic technologies and their immediate clinical applications. Genomic analysis using oligonucleotide array comparative genomic hybridization (aCGH) or single nucleotide polymorphism (SNP) chips has been applied to pediatric patients with developmental and intellectual disabilities (DD/ID), multiple congenital anomalies (MCA) and autistic spectrum disorders (ASD). Evaluation of analytical and clinical validities of aCGH showed > 99% sensitivity and specificity and increased analytical resolution by higher density probe coverage. Reviews of case series, multi-center comparison and large patient-control studies demonstrated a diagnostic yield of 12%-20%; approximately 60% of these abnormalities were recurrent genomic disorders. This pediatric experience has been extended toward prenatal diagnosis. A series of reports indicated approximately 10% of pregnancies with ultrasound-detected structural anomalies and normal cytogenetic findings had genomic abnormalities, and 30% of these abnormalities were syndromic genomic disorders. Evidence-based practice guidelines and standards for implementing genomic analysis and web-delivered knowledge resources for interpreting genomic findings have been established. The progress from this technology-driven and evidence-based genomic analysis provides not only opportunities to dissect disease-causing mechanisms and develop rational therapeutic interventions but also important lessons for integrating genomic sequencing into pediatric and prenatal genetic evaluation.

  8. Comparative genome analysis of Basidiomycete fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor

    2013-08-07

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48 percent of basidiomycete proteins are unique to the phylum with nearly half of those (22 percent) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  9. Construction of an integrated database to support genomic sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  10. 2012 U.S. Department of Energy: Joint Genome Institute: Progress Report

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, David [DOE JGI Public Affairs Manager

    2013-01-01

    The mission of the U.S. Department of Energy Joint Genome Institute (DOE JGI) is to serve the diverse scientific community as a user facility, enabling the application of large-scale genomics and analysis of plants, microbes, and communities of microbes to address the DOE mission goals in bioenergy and the environment. The DOE JGI's sequencing efforts fall under the Eukaryote Super Program, which includes the Plant and Fungal Genomics Programs; and the Prokaryote Super Program, which includes the Microbial Genomics and Metagenomics Programs. In 2012, several projects made news for their contributions to energy and environment research.

  11. Glaucoma Monitoring in a Clinical Setting Glaucoma Progression Analysis vs Nonparametric Progression Analysis in the Groningen Longitudinal Glaucoma Study

    NARCIS (Netherlands)

    Wesselink, Christiaan; Heeg, Govert P.; Jansonius, Nomdo M.

    Objective: To compare prospectively 2 perimetric progression detection algorithms for glaucoma, the Early Manifest Glaucoma Trial algorithm (glaucoma progression analysis [GPA]) and a nonparametric algorithm applied to the mean deviation (MD) (nonparametric progression analysis [NPA]). Methods:

  12. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    and dhurrin, which have not previously been characterized in blueberries. There are more than 44,500 spider species with distinct habitats and unique characteristics. Spiders are masters of producing silk webs to catch prey and using venom to neutralize. The exploration of the genetics behind these properties...... japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts; genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... has just started. We have assembled and annotated the first two spider genomes to facilitate our understanding of spiders at the molecular level. The need for analyzing the large and increasing amount of sequencing data has increased the demand for efficient, user friendly, and broadly applicable...

  13. e-Fungi: a data resource for comparative analysis of fungal genomes

    Directory of Open Access Journals (Sweden)

    Hubbard Simon J

    2007-11-01

    Full Text Available Abstract Background The number of sequenced fungal genomes is ever increasing, with about 200 genomes already fully sequenced or in progress. Only a small percentage of those genomes have been comprehensively studied, for example using techniques from functional genomics. Comparative analysis has proven to be a useful strategy for enhancing our understanding of evolutionary biology and of the less well understood genomes. However, the data required for these analyses tends to be distributed in various heterogeneous data sources, making systematic comparative studies a cumbersome task. Furthermore, comparative analyses benefit from close integration of derived data sets that cluster genes or organisms in a way that eases the expression of requests that clarify points of similarity or difference between species. Description To support systematic comparative analyses of fungal genomes we have developed the e-Fungi database, which integrates a variety of data for more than 30 fungal genomes. Publicly available genome data, functional annotations, and pathway information has been integrated into a single data repository and complemented with results of comparative analyses, such as MCL and OrthoMCL cluster analysis, and predictions of signaling proteins and the sub-cellular localisation of proteins. To access the data, a library of analysis tasks is available through a web interface. The analysis tasks are motivated by recent comparative genomics studies, and aim to support the study of evolutionary biology as well as community efforts for improving the annotation of genomes. Web services for each query are also available, enabling the tasks to be incorporated into workflows. Conclusion The e-Fungi database provides fungal biologists with a resource for comparative studies of a large range of fungal genomes. Its analysis library supports the comparative study of genome data, functional annotation, and results of large scale analyses over all the

  14. Comparative genomic analysis of soybean flowering genes.

    Directory of Open Access Journals (Sweden)

    Chol-Hee Jung

    Full Text Available Flowering is an important agronomic trait that determines crop yield. Soybean is a major oilseed legume crop used for human and animal feed. Legumes have unique vegetative and floral complexities. Our understanding of the molecular basis of flower initiation and development in legumes is limited. Here, we address this by using a computational approach to examine flowering regulatory genes in the soybean genome in comparison to the most studied model plant, Arabidopsis. For this comparison, a genome-wide analysis of orthologue groups was performed, followed by an in silico gene expression analysis of the identified soybean flowering genes. Phylogenetic analyses of the gene families highlighted the evolutionary relationships among these candidates. Our study identified key flowering genes in soybean and indicates that the vernalisation and the ambient-temperature pathways seem to be the most variant in soybean. A comparison of the orthologue groups containing flowering genes indicated that, on average, each Arabidopsis flowering gene has 2-3 orthologous copies in soybean. Our analysis highlighted that the CDF3, VRN1, SVP, AP3 and PIF3 genes are paralogue-rich genes in soybean. Furthermore, the genome mapping of the soybean flowering genes showed that these genes are scattered randomly across the genome. A paralogue comparison indicated that the soybean genes comprising the largest orthologue group are clustered in a 1.4 Mb region on chromosome 16 of soybean. Furthermore, a comparison with the undomesticated soybean (Glycine soja) revealed that there are hundreds of SNPs that are associated with putative soybean flowering genes and that there are structural variants that may affect the genes of the light-signalling and ambient-temperature pathways in soybean. Our study provides a framework for the soybean flowering pathway and insights into the relationship and evolution of flowering genes between a short-day soybean and the long-day plant

  15. Genome-wide association study of corticobasal degeneration identifies risk variants shared with progressive supranuclear palsy

    Science.gov (United States)

    Kouri, Naomi; Ross, Owen A.; Dombroski, Beth; Younkin, Curtis S.; Serie, Daniel J.; Soto-Ortolaza, Alexandra; Baker, Matthew; Finch, Ni Cole A.; Yoon, Hyejin; Kim, Jungsu; Fujioka, Shinsuke; McLean, Catriona A.; Ghetti, Bernardino; Spina, Salvatore; Cantwell, Laura B.; Farlow, Martin R.; Grafman, Jordan; Huey, Edward D.; Ryung Han, Mi; Beecher, Sherry; Geller, Evan T.; Kretzschmar, Hans A.; Roeber, Sigrun; Gearing, Marla; Juncos, Jorge L.; Vonsattel, Jean Paul G.; Van Deerlin, Vivianna M.; Grossman, Murray; Hurtig, Howard I.; Gross, Rachel G.; Arnold, Steven E.; Trojanowski, John Q.; Lee, Virginia M.; Wenning, Gregor K.; White, Charles L.; Höglinger, Günter U.; Müller, Ulrich; Devlin, Bernie; Golbe, Lawrence I.; Crook, Julia; Parisi, Joseph E.; Boeve, Bradley F.; Josephs, Keith A.; Wszolek, Zbigniew K.; Uitti, Ryan J.; Graff-Radford, Neill R.; Litvan, Irene; Younkin, Steven G.; Wang, Li-San; Ertekin-Taner, Nilüfer; Rademakers, Rosa; Hakonarsen, Hakon; Schellenberg, Gerard D.; Dickson, Dennis W.

    2015-01-01

    Corticobasal degeneration (CBD) is a neurodegenerative disorder affecting movement and cognition, definitively diagnosed only at autopsy. Here, we conduct a genome-wide association study (GWAS) in CBD cases (n=152) and 3,311 controls, and 67 CBD cases and 439 controls in a replication stage. Associations with meta-analysis were 17q21 at MAPT (P=1.42 × 10^-12), 8p12 at lnc-KIF13B-1, a long non-coding RNA (rs643472; P=3.41 × 10^-8), and 2p22 at SOS1 (rs963731; P=1.76 × 10^-7). Testing for association of CBD with top progressive supranuclear palsy (PSP) GWAS single-nucleotide polymorphisms (SNPs) identified associations at MOBP (3p22; rs1768208; P=2.07 × 10^-7) and MAPT H1c (17q21; rs242557; P=7.91 × 10^-6). We previously reported SNP/transcript level associations with rs8070723/MAPT, rs242557/MAPT, and rs1768208/MOBP and herein identified association with rs963731/SOS1. We identify new CBD susceptibility loci and show that CBD and PSP share a genetic risk factor other than MAPT at 3p22 MOBP (myelin-associated oligodendrocyte basic protein). PMID:26077951

  16. PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.

    Science.gov (United States)

    Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X

    2016-01-01

    PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels. The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. This includes the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.

  17. Progress in the NNPDF global analysis

    CERN Document Server

    Deans, Christopher S

    2013-01-01

    We report on recent progress in the NNPDF framework of global PDF analysis. The NNPDF2.3 set is the first and only available PDF set that includes LHC data. A recent benchmark comparison of NNPDF2.3 and all other modern NNLO PDF sets with LHC data was performed. We have also studied theoretical uncertainties due to heavy quark renormalization schemes, higher twists and deuterium corrections in PDFs. Finally, we report on the release of positive definite PDF sets, based on the NNPDF2.3 analysis, specially suited for use in Monte Carlo event generators.

  18. The Analysis of Thematic Progression Patterns of English Advertisement

    Institute of Scientific and Technical Information of China (English)

    徐倩; 郭鸿雁

    2014-01-01

    Thematic Progression patterns are the principal basis for the analysis of English advertisements, and they have attracted many experts in this field. Thematic Progression plays a very important role in the creation, development and establishment of English advertisements. This paper introduces four main types of Thematic Progression patterns and analyses English advertisements from the Thematic Progression perspective.

  19. Detection of early glaucomatous progression with octopus cluster trend analysis.

    Science.gov (United States)

    Naghizadeh, Farzaneh; Holló, Gábor

    2014-01-01

    To compare the ability of Corrected Cluster Trend Analysis (CCTA) and Cluster Trend Analysis (CTA) with event analysis of Octopus visual field series to detect early glaucomatous progression. One eye of 15 healthy, 19 ocular hypertensive, 20 preperimetric, and 51 perimetric glaucoma (PG) patients were investigated with Octopus normal G2 test at 6-month intervals for 1.5 to 3 years. Progression was defined with significant worsening in any of the 10 Octopus clusters with CCTA, and event analysis criteria, respectively. With event analysis, 9 PG eyes showed localized progression and 1 diffuse mean defect (MD) worsening. With CCTA, progression was indicated in 1 normal, 1 ocular hypertensive, and 1 preperimetric glaucoma eyes due to vitreous floaters, and 28 PG eyes including all 9 eyes with localized progression with event analysis. The locations of CCTA progression matched those found with event analysis in all 9 cases. In 17 of the remaining 19 eyes, progressing clusters matched the locations that were suspicious but not definitive for progression with event analysis. In the eye with diffuse MD worsening, CTA found significant progression for 7 clusters. For global MD progression rate, eyes worsened with CCTA only did not differ from the stable eyes but had significantly smaller progression rates than the eyes progressed with event analysis (P=0.0002). In PG, Octopus CCTA and CTA are clinically useful to identify early progression and areas suspicious for early progression. However, in some eyes with no glaucomatous visual field damage, vitreous floaters may cause progression artifacts.

  20. Chromosomal imbalances in nasopharyngeal carcinoma: a meta-analysis of comparative genomic hybridization results

    Directory of Open Access Journals (Sweden)

    Jin Ping

    2006-01-01

    Full Text Available Abstract Nasopharyngeal carcinoma (NPC) is a highly prevalent disease in Southeast Asia and its prevalence is clearly affected by genetic background. Various theories have been suggested for its high incidence in this geographical region but to date no conclusive explanation has been identified. Chromosomal imbalances identifiable through comparative genomic hybridization may shed some light on common genetic alterations that may be of relevance to the onset and progression of NPC. Review of the literature, however, reveals contradictory results among reported findings, possibly related to factors associated with patient selection, stage of disease, differences in methodological details, etc. To increase the power of the analysis and attempt to identify commonalities among the reported findings, we performed a meta-analysis of results described in NPC tissues based on chromosomal comparative genomic hybridization (CGH). This meta-analysis revealed consistent patterns in chromosomal abnormalities that appeared to cluster in specific "hot spots" along the genome following a stage-dependent progression.

  1. Phosphorylation of EB2 by Aurora B and CDK1 ensures mitotic progression and genome stability.

    Science.gov (United States)

    Iimori, Makoto; Watanabe, Sugiko; Kiyonari, Shinichi; Matsuoka, Kazuaki; Sakasai, Ryo; Saeki, Hiroshi; Oki, Eiji; Kitao, Hiroyuki; Maehara, Yoshihiko

    2016-03-31

    Temporal regulation of microtubule dynamics is essential for proper progression of mitosis and control of microtubule plus-end tracking proteins by phosphorylation is an essential component of this regulation. Here we show that Aurora B and CDK1 phosphorylate microtubule end-binding protein 2 (EB2) at multiple sites within the amino terminus and a cluster of serine/threonine residues in the linker connecting the calponin homology and end-binding homology domains. EB2 phosphorylation, which is strictly associated with mitotic entry and progression, reduces the binding affinity of EB2 for microtubules. Expression of non-phosphorylatable EB2 induces stable kinetochore microtubule dynamics and delays formation of bipolar metaphase plates in a microtubule binding-dependent manner, and leads to aneuploidy even in unperturbed mitosis. We propose that Aurora B and CDK1 temporally regulate the binding affinity of EB2 for microtubules, thereby ensuring kinetochore microtubule dynamics, proper mitotic progression and genome stability.

  2. Human and mouse genome analysis using array comparative genomic hybridization

    NARCIS (Netherlands)

    Snijders, Antoine Maria

    2004-01-01

    Almost all human cancers as well as developmental abnormalities are characterized by the presence of genetic alterations, most of which target a gene or a particular genomic locus resulting in altered gene expression and ultimately an altered phenotype. Different types of genetic alterations include

  3. Genome-wide analysis correlates Ayurveda Prakriti.

    Science.gov (United States)

    Govindaraj, Periyasamy; Nizamuddin, Sheikh; Sharath, Anugula; Jyothi, Vuskamalla; Rotti, Harish; Raval, Ritu; Nayak, Jayakrishna; Bhat, Balakrishna K; Prasanna, B V; Shintre, Pooja; Sule, Mayura; Joshi, Kalpana S; Dedge, Amrish P; Bharadwaj, Ramachandra; Gangadharan, G G; Nair, Sreekumaran; Gopinath, Puthiya M; Patwardhan, Bhushan; Kondaiah, Paturu; Satyamoorthy, Kapaettu; Valiathan, Marthanda Varma Sankaran; Thangaraj, Kumarasamy

    2015-10-29

    The practice of Ayurveda, the traditional medicine of India, is based on the concept of three major constitutional types (Vata, Pitta and Kapha) defined as "Prakriti". To the best of our knowledge, no study has convincingly correlated genomic variations with the classification of Prakriti. In the present study, we performed genome-wide SNP (single nucleotide polymorphism) analysis (Affymetrix, 6.0) of 262 well-classified male individuals (after screening 3416 subjects) belonging to three Prakritis. We found 52 SNPs (p ≤ 1 × 10^-5) were significantly different between Prakritis, without any confounding effect of stratification, after 10^6 permutations. Principal component analysis (PCA) of these SNPs classified 262 individuals into their respective groups (Vata, Pitta and Kapha) irrespective of their ancestry, which represent its power in categorization. We further validated our finding with 297 Indian population samples with known ancestry. Subsequently, we found that PGM1 correlates with phenotype of Pitta as described in the ancient text of Caraka Samhita, suggesting that the phenotypic classification of India's traditional medicine has a genetic basis; and its Prakriti-based practice in vogue for many centuries resonates with personalized medicine.

  4. BATCH-GE: Batch analysis of Next-Generation Sequencing data for genome editing assessment.

    Science.gov (United States)

    Boel, Annekatrien; Steyaert, Woutert; De Rocker, Nina; Menten, Björn; Callewaert, Bert; De Paepe, Anne; Coucke, Paul; Willaert, Andy

    2016-07-27

    Targeted mutagenesis by the CRISPR/Cas9 system is currently revolutionizing genetics. The ease of this technique has enabled genome engineering in-vitro and in a range of model organisms and has pushed experimental dimensions to unprecedented proportions. Due to its tremendous progress in terms of speed, read length, throughput and cost, Next-Generation Sequencing (NGS) has been increasingly used for the analysis of CRISPR/Cas9 genome editing experiments. However, the current tools for genome editing assessment lack flexibility and fall short in the analysis of large amounts of NGS data. Therefore, we designed BATCH-GE, an easy-to-use bioinformatics tool for batch analysis of NGS-generated genome editing data, available from https://github.com/WouterSteyaert/BATCH-GE.git. BATCH-GE detects and reports indel mutations and other precise genome editing events and calculates the corresponding mutagenesis efficiencies for a large number of samples in parallel. Furthermore, this new tool provides flexibility by allowing the user to adapt a number of input variables. The performance of BATCH-GE was evaluated in two genome editing experiments, aiming to generate knock-out and knock-in zebrafish mutants. This tool will not only contribute to the evaluation of CRISPR/Cas9-based experiments, but will be of use in any genome editing experiment and has the ability to analyze data from every organism with a sequenced genome.
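    The mutagenesis efficiency mentioned in this abstract can be illustrated with a minimal sketch. It assumes that efficiency means the percentage of reads spanning a window around the cut site that carry an insertion or deletion inside that window; the abstract does not state the exact definition, and this is not BATCH-GE's own code. The function names and the (position, CIGAR) input format are hypothetical and chosen only for illustration.

```python
import re

# Matches one CIGAR operation, e.g. "20M" or "3D" (SAM format operators).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_end(pos, cigar):
    """0-based, exclusive reference coordinate where the alignment ends."""
    return pos + sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")


def has_indel_in_window(pos, cigar, win_start, win_end):
    """True if an insertion or deletion overlaps [win_start, win_end)."""
    ref = pos  # reference coordinate of the next aligned base
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "D":
            if ref < win_end and ref + n > win_start:
                return True
            ref += n
        elif op == "I":
            if win_start <= ref < win_end:
                return True
        elif op in "MN=X":
            ref += n
        # S, H and P do not consume reference bases
    return False


def mutagenesis_efficiency(reads, win_start, win_end):
    """Percentage of window-spanning reads with an indel (assumed definition)."""
    covering = edited = 0
    for pos, cigar in reads:
        if pos > win_start or reference_end(pos, cigar) < win_end:
            continue  # read does not span the whole window
        covering += 1
        if has_indel_in_window(pos, cigar, win_start, win_end):
            edited += 1
    return 100.0 * edited / covering if covering else 0.0
```

    Running one such calculation per sample and target site would give the kind of per-sample table that a batch tool reports; real data would come from aligned reads rather than hand-written tuples.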

  5. Enhancing genomics information retrieval through dimensional analysis.

    Science.gov (United States)

    Hu, Qinmin; Huang, Jimmy Xiangji

    2013-06-01

    We propose a novel dimensional analysis approach to employing meta information in order to find the relationships within the unstructured or semi-structured document/passages for improving genomics information retrieval performance. First, we make use of the auxiliary information as three basic dimensions, namely "temporal", "journal", and "author". The reference section is treated as a commensurable quantity of the three basic dimensions. Then, the sample space and subspaces are built up and a set of events are defined to meet the basic requirement of dimensional homogeneity to be commensurable quantities. After that, the classic graph analysis algorithm in the Web environments is applied on each dimension respectively to calculate the importance of each dimension. Finally, we integrate all the dimension networks and re-rank the outputs for evaluation. Our experimental results show the proposed approach is superior and promising.

  6. Genome-wide Analysis of Gene Regulation

    DEFF Research Database (Denmark)

    Chen, Yun

    cells are capable of regulating their gene expression, so that each cell can only express a particular set of genes yielding limited numbers of proteins with specialized functions. Therefore a rigid control of differential gene expression is necessary for cellular diversity. On the other hand, aberrant...... gene regulation will disrupt the cell’s fundamental processes, which in turn can cause disease. Hence, understanding gene regulation is essential for deciphering the code of life. Along with the development of high throughput sequencing (HTS) technology and the subsequent large-scale data analysis......, genome-wide assays have increased our understanding of gene regulation significantly. This thesis describes the integration and analysis of HTS data across different important aspects of gene regulation. Gene expression can be regulated at different stages when the genetic information is passed from gene...

  7. Progress in TILLING as a tool for functional genomics and improvement of crops

    Institute of Scientific and Technical Information of China (English)

    Liang Chen; Liugen Hao; Martin A.J.Parry; Andrew L. Phillips; Yin-Gang Hu

    2014-01-01

    Food security is a global concern and substantial yield increases in crops are required to feed the growing world population. Mutagenesis is an important tool in crop improvement and is free of the regulatory restrictions imposed on genetically modified organisms. Targeting Induced Local Lesions in Genomes (TILLING), which combines traditional chemical mutagenesis with high-throughput genome-wide screening for point mutations in desired genes, offers a powerful way to create novel mutant alleles for both functional genomics and improvement of crops. TILLING is generally applicable to genomes whether small or large, diploid or even allohexaploid, and shows great potential to address the major challenge of linking sequence information to the function of genes and to modulate key traits for plant breeding. TILLING has been successfully applied in many crop species and recent progress in TILLING is summarized below, especially on the developments in mutation detection technology, application of TILLING in gene functional studies and crop breeding. The potential of TILLING/EcoTILLING for functional genetics and crop improvement is also discussed. Furthermore, a small-scale forward strategy including backcross and selfing was conducted to release the potential mutant phenotypes masked in M2 (or M3) plants.

  8. Multidimensional gene set analysis of genomic data.

    Directory of Open Access Journals (Sweden)

    David Montaner

    Full Text Available Understanding the functional implications of changes in gene expression, mutations, etc., is the aim of most genomic experiments. To achieve this, several functional profiling methods have been proposed. Such methods study the behaviour of different gene modules (e.g. gene ontology terms) in response to one particular variable (e.g. differential gene expression). In spite of the wealth of information provided by functional profiling methods, a common limitation to all of them is their inherent unidimensional nature. In order to overcome this restriction we present a multidimensional logistic model that allows studying the relationship of gene modules with different genome-scale measurements (e.g. differential expression, genotyping association, methylation, copy number alterations, heterozygosity, etc.) simultaneously. Moreover, the relationship of such functional modules with the interactions among the variables can also be studied, which produces novel results impossible to derive from the conventional unidimensional functional profiling methods. We report sound results of gene set associations that remained undetected by the conventional one-dimensional gene set analysis in several examples. Our findings demonstrate the potential of the proposed approach for the discovery of new cell functionalities with complex dependences on more than one variable.
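    A logistic model of the kind this abstract describes can be sketched as follows. The sketch assumes the two-variable form logit P(gene in set) = b0 + b1*x1 + b2*x2 + b3*x1*x2, where x1 and x2 are per-gene statistics from two genome-scale measurements; the abstract does not give the exact parameterisation, so the form, the function names and the Newton-Raphson fitting routine are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25, ridge=1e-8):
    """Fit logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))  # fitted probabilities
        w = p * (1.0 - p)                      # observation weights
        # Hessian of the negative log-likelihood; tiny ridge keeps it invertible
        H = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    return beta


def gene_set_model(x1, x2, in_set):
    """Return [b0, b1, b2, b3] for membership ~ x1 + x2 + x1:x2 (assumed form)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    return fit_logistic(X, np.asarray(in_set, dtype=float))
```

    Under these assumptions, a clearly positive b1 with b2 near zero would indicate a gene set enriched along the first measurement only, while a non-zero b3 would point to a dependence on the interaction of the two measurements, which a one-dimensional analysis cannot express.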

  9. Genome Data Exploration Using Correspondence Analysis.

    Science.gov (United States)

    Tekaia, Fredj

    2016-01-01

    Recent developments of sequencing technologies that allow the production of massive amounts of genomic and genotyping data have highlighted the need for synthetic data representation and pattern recognition methods that can mine and help discovering biologically meaningful knowledge included in such large data sets. Correspondence analysis (CA) is an exploratory descriptive method designed to analyze two-way data tables, including some measure of association between rows and columns. It constructs linear combinations of variables, known as factors. CA has been used for decades to study high-dimensional data, and remarkable inferences from large data tables were obtained by reducing the dimensionality to a few orthogonal factors that correspond to the largest amount of variability in the data. Herein, I review CA and highlight its use by considering examples in handling high-dimensional data that can be constructed from genomic and genetic studies. Examples in amino acid compositions of large sets of species (viruses, phages, yeast, and fungi) as well as an example related to pairwise shared orthologs in a set of yeast and fungal species, as obtained from their proteome comparisons, are considered. For the first time, results show striking segregations between yeasts and fungi as well as between viruses and phages. Distributions obtained from shared orthologs show clusters of yeast and fungal species corresponding to their phylogenetic relationships. A direct comparison with the principal component analysis method is discussed using a recently published example of genotyping data related to newly discovered traces of an ancient hominid that was compared to modern human populations in the search for ancestral similarities. CA offers more detailed results highlighting links between modern humans and the ancient hominid and their characterizations. 
Compared to the popular principal component analysis method, CA allows easier and more effective interpretation of results
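    The dimensionality reduction described in this abstract can be sketched with the standard textbook formulation of correspondence analysis: a singular value decomposition of the standardized residuals of a two-way table. The function name and return layout below are illustrative assumptions, not the author's software, and the sketch assumes the table has no all-zero row or column.

```python
import numpy as np


def correspondence_analysis(table, n_factors=2):
    """Return row coordinates, column coordinates and principal inertias."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()          # correspondence matrix
    r = P.sum(axis=1)        # row masses
    c = P.sum(axis=0)        # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)  # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates: singular vectors scaled by singular values and masses
    rows = (U * sv) / np.sqrt(r)[:, None]
    cols = (Vt.T * sv) / np.sqrt(c)[:, None]
    inertia = sv ** 2        # one value per factor; the sum is chi-square / n
    return rows[:, :n_factors], cols[:, :n_factors], inertia
```

    For a table of, say, amino acid counts by species, the first two columns of the row coordinates could be plotted to look for the kind of species groupings the abstract reports, and the share of each factor in the total inertia indicates how much of the table's variability that factor captures.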

  10. Pig genome sequence - analysis and publication strategy

    NARCIS (Netherlands)

    Archibald, A.L.; Bolund, L.; Churcher, C.; Fredholm, M.; Groenen, M.A.M.; Harlizius, B.

    2010-01-01

    Background - The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. Results - Assemblies of the B

  11. Radiation induced genome instability: multiscale modelling and data analysis

    Science.gov (United States)

    Andreev, Sergey; Eidelman, Yuri

    2012-07-01

    Genome instability (GI) is thought to be an important step in cancer induction and progression. Radiation induced GI is usually defined as genome alterations in the progeny of irradiated cells. The aim of this report is to demonstrate an opportunity for integrative analysis of radiation induced GI on the basis of multiscale modelling. Integrative, systems level modelling is necessary to assess different pathways resulting in GI in which a variety of genetic and epigenetic processes are involved. The multilevel modelling includes the Monte Carlo based simulation of several key processes involved in GI: DNA double strand breaks (DSBs) generation in cells initially irradiated as well as in descendants of irradiated cells, damage transmission through mitosis. Taking the cell-cycle-dependent generation of DNA/chromosome breakage into account ensures an advantage in estimating the contribution of different DNA damage response pathways to GI, as to nonhomologous vs homologous recombination repair mechanisms, the role of DSBs at telomeres or interstitial chromosomal sites, etc. The preliminary estimates show that both telomeric and non-telomeric DSB interactions are involved in delayed effects of radiation although differentially for different cell types. The computational experiments provide the data on the wide spectrum of GI endpoints (dicentrics, micronuclei, nonclonal translocations, chromatid exchanges, chromosome fragments) similar to those obtained experimentally for various cell lines under various experimental conditions. The modelling based analysis of experimental data demonstrates that radiation induced GI may be viewed as processes of delayed DSB induction/interaction/transmission being a key for quantification of GI. On the other hand, this conclusion is not sufficient to understand GI as a whole because factors of DNA non-damaging origin can also induce GI. 
Additionally, new data on induced pluripotent stem cells reveal that GI is acquired in normal mature

  12. Pig genome sequence - analysis and publication strategy

    DEFF Research Database (Denmark)

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol;

    2010-01-01

    BACKGROUND: The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. RESULTS: Assemblies......) is under construction and will incorporate whole genome shotgun sequence (WGS) data providing > 30x genome coverage. The WGS sequence, most of which comprise short Illumina/Solexa reads, were generated from DNA from the same single Duroc sow as the source of the BAC library from which clones were...

  13. Phylogenomic Analysis and Dynamic Evolution of Chloroplast Genomes in Salicaceae

    Directory of Open Access Journals (Sweden)

    Yuan Huang

    2017-06-01

    Full Text Available Chloroplast genomes of plants are highly conserved in both gene order and gene content. Analysis of the whole chloroplast genome is known to provide much more informative DNA sites and thus generates high resolution for plant phylogenies. Here, we report the complete chloroplast genomes of three Salix species in family Salicaceae. Phylogeny of Salicaceae inferred from complete chloroplast genomes is generally consistent with previous studies but resolved with higher statistical support. Incongruences of phylogeny, however, are observed in genus Populus, which most likely results from homoplasy. By comparing three Salix chloroplast genomes with the published chloroplast genomes of other Salicaceae species, we demonstrate that the synteny and length of chloroplast genomes in Salicaceae are highly conserved but have experienced dynamic evolution among species. We identify seven positively selected chloroplast genes in Salicaceae, which might be related to the adaptive evolution of Salicaceae species. Comparative chloroplast genome analysis within the family also indicates that some chloroplast genes have been lost or have become pseudogenes, from which we infer that these chloroplast genes were horizontally transferred to the nucleus genome. Based on the complete nucleus genome sequences from two Salicaceae species, we remarkably identify that the entire chloroplast genome is indeed transferred and integrated to the nucleus genome in the individual of the reference genome of P. trichocarpa at least once. This observation, along with presence of the large nuclear plastid DNA (NUPTs) and NUPTs containing multiple chloroplast genes in their original order in the chloroplast genome, favors the DNA-mediated hypothesis of organelle to nucleus DNA transfer. Overall, the phylogenomic analysis using chloroplast complete genomes clearly elucidates the phylogeny of Salicaceae. The identification of positively selected chloroplast genes and dynamic chloroplast-to-nucleus gene transfers in

  14. Genetic analysis of frontotemporal dementia and progressive supra nuclear palsy

    OpenAIRE

    Ferrari, R.

    2014-01-01

    Genome-wide association study (GWAS) is an effective method for mapping genetic variants underlying common and complex diseases. This thesis describes the investigation of the disorders, frontotemporal dementia (FTD) and progressive supranuclear palsy (PSP). FTD affects the frontal/temporal lobes and presents behavioural changes (bvFTD), cognitive decline or language dysfunction (primary progressive aphasia [PPA]), whilst PSP affects predominantly the brain stem resulting in loss of balance, ...

  15. The Complete Mitochondrial Genome of Gossypium hirsutum and Evolutionary Analysis of Higher Plant Mitochondrial Genomes

    Science.gov (United States)

    Su, Aiguo; Geng, Jianing; Grover, Corrinne E.; Hu, Songnian; Hua, Jinping

    2013-01-01

    Background Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes. Methodology/Principal Findings We used 454 sequencing combined with screening of a Fosmid library of the Gossypium hirsutum mt genome and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. Conclusion The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes suggest that evolution of mt genomes is consistent with plant taxonomy but independent among different species. PMID:23940520

  16. The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

    Directory of Open Access Journals (Sweden)

    Guozheng Liu

    Full Text Available BACKGROUND: Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes. METHODOLOGY/PRINCIPAL FINDINGS: We used 454 sequencing combined with screening of a Fosmid library of the Gossypium hirsutum mt genome and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. CONCLUSION: The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes suggest that evolution of mt genomes is consistent with plant taxonomy but independent among different species.

  17. Millstone: software for multiplex microbial genome analysis and engineering.

    Science.gov (United States)

    Goodman, Daniel B; Kuznetsov, Gleb; Lajoie, Marc J; Ahern, Brian W; Napolitano, Michael G; Chen, Kevin Y; Chen, Changping; Church, George M

    2017-05-25

    Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. We describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.

  18. Clonal expansion and linear genome evolution through breast cancer progression from pre-invasive stages to asynchronous metastasis

    DEFF Research Database (Denmark)

    Krøigård, Anne Bruun; Larsen, Martin Jakob; Lænkholm, Anne Vibeke

    step. Our data, contrary to the proposed model of early dissemination of metastatic cells and parallel progression of primary tumors and metastases, provide evidence of linear progression of breast cancer with relatively late dissemination from the primary tumor. The genomic discordance between......Evolution of the breast cancer genome from pre-invasive stages to asynchronous metastasis is complex and mostly unexplored, but highly demanded as it may provide novel markers for and mechanistic insights in cancer progression. The increasing use of personalized therapy of breast cancer...... necessitates knowledge of the degree of genomic concordance between different steps of malignant progression as primary tumors often are used as surrogates of systemic disease. Based on exome sequencing we performed copy number profiling and point mutation detection on successive steps of breast cancer...

  19. Genome-wide alterations of the DNA replication program during tumor progression

    Science.gov (United States)

    Arneodo, A.; Goldar, A.; Argoul, F.; Hyrien, O.; Audit, B.

    2016-08-01

    Oncogenic stress is a major driving force in the early stages of cancer development. Recent experimental findings reveal that, in precancerous lesions and cancers, activated oncogenes may induce stalling and dissociation of DNA replication forks, resulting in DNA damage. Replication timing is emerging as an important epigenetic feature that recapitulates several genomic, epigenetic and functional specificities of even closely related cell types. There is increasing evidence that chromosome rearrangements, the hallmark of many cancer genomes, are intimately associated with the DNA replication program and that epigenetic replication timing changes often precede chromosomal rearrangements. The recent development of a novel methodology to map replication fork polarity using deep sequencing of Okazaki fragments has provided new and complementary genome-wide replication profiling data. We review the results of a wavelet-based multi-scale analysis of genomic and epigenetic data, including replication profiles along human chromosomes. These results provide new insight into the spatio-temporal replication program and its dynamics during differentiation. Our goal here is to bring to cancer research the experimental protocols and computational methodologies for replication program profiling, as well as the modeling of the spatio-temporal replication program. To illustrate our purpose, we report very preliminary results obtained for chronic myelogenous leukemia, the archetype model of cancer. Finally, we discuss promising perspectives on using genome-wide DNA replication profiling as a novel, efficient tool for cancer diagnosis, prognosis and personalized treatment.

  20. SIGMA: A System for Integrative Genomic Microarray Analysis of Cancer Genomes

    Directory of Open Access Journals (Sweden)

    Davies Jonathan J

    2006-12-01

    Full Text Available Abstract. Background: The prevalence of high resolution profiling of genomes has created a need for the integrative analysis of information generated from multiple methodologies and platforms. Although the majority of data in the public domain are gene expression profiles, and expression analysis software is available, the increase of array CGH studies has enabled integration of high throughput genomic and gene expression datasets. However, tools for direct mining and analysis of array CGH data are limited. Hence, there is a great need for analytical and display software tailored to cross-platform integrative analysis of cancer genomes. Results: We have created a user-friendly Java application to facilitate sophisticated visualization and analysis such as cross-tumor and cross-platform comparisons. To demonstrate the utility of this software, we assembled array CGH data representing Affymetrix SNP chip, Stanford cDNA arrays and whole genome tiling path array platforms for cross comparison. This cancer genome database contains 267 profiles from commonly used cancer cell lines representing 14 different tissue types. Conclusion: In this study we have developed an application for the visualization and analysis of data from high resolution array CGH platforms that can be adapted for analysis of multiple types of high throughput genomic datasets. Furthermore, we invite researchers using array CGH technology to deposit both their raw and processed data, as this will be a continually expanding database of cancer genomes. This publicly available resource, the System for Integrative Genomic Microarray Analysis (SIGMA) of cancer genomes, can be accessed at http://sigma.bccrc.ca.

  1. A Brief Review: The Z-curve Theory and its Application in Genome Analysis.

    Science.gov (United States)

    Zhang, Ren; Zhang, Chun-Ting

    2014-04-01

    In theoretical physics, there exist two basic mathematical approaches, algebraic and geometrical methods, which, in most cases, are complementary. In the area of genome sequence analysis, however, algebraic approaches have been widely used, while geometrical approaches have been less explored for a long time. The Z-curve theory is a geometrical approach to genome analysis. The Z-curve is a three-dimensional curve that represents a given DNA sequence in the sense that each can be uniquely reconstructed given the other. The Z-curve, therefore, contains all the information that the corresponding DNA sequence carries. The analysis of a DNA sequence can then be performed through studying the corresponding Z-curve. The Z-curve method has found applications in a wide range of areas in the past two decades, including the identification of protein-coding genes, replication origins, horizontally transferred genomic islands, promoters, translational start sites and isochores, as well as studies on phylogenetics, genome visualization and comparative genomics. Here, we review the progress of Z-curve studies from aspects of both theory and applications in genome analysis.
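
    The one-to-one correspondence between a DNA sequence and its Z-curve can be illustrated with a short sketch. It assumes the commonly cited definition of the three coordinates (purine vs. pyrimidine, amino vs. keto, weak vs. strong hydrogen bonding), which the abstract itself does not spell out; the function names z_curve and sequence_from_z_curve are hypothetical and not taken from any Z-curve software.

```python
# Per-base step (dx, dy, dz), assuming the usual Z-curve definition:
#   x: purines (A, G) minus pyrimidines (C, T)
#   y: amino bases (A, C) minus keto bases (G, T)
#   z: weak H-bond bases (A, T) minus strong H-bond bases (G, C)
STEPS = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}
BASE_OF_STEP = {step: base for base, step in STEPS.items()}


def z_curve(seq):
    """Return the cumulative (x, y, z) point after each base of seq."""
    points = []
    x = y = z = 0
    for base in seq.upper():
        dx, dy, dz = STEPS[base]  # raises KeyError on non-ACGT symbols
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def sequence_from_z_curve(points):
    """Invert z_curve: each step between consecutive points names one base."""
    bases = []
    prev = (0, 0, 0)
    for point in points:
        step = tuple(c - p for c, p in zip(point, prev))
        bases.append(BASE_OF_STEP[step])
        prev = point
    return "".join(bases)


if __name__ == "__main__":
    curve = z_curve("ACGT")
    print(curve)
    print(sequence_from_z_curve(curve))
```

    Because the four bases map to four distinct steps, the curve can be decoded back into the sequence, which is the sense in which the two representations carry the same information.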

  2. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    Science.gov (United States)

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high utility analysis tasks. PMID:26780094

  3. Pathway and network analysis of cancer genomes

    DEFF Research Database (Denmark)

    Creixell, Pau; Reimand, Jueri; Haider, Syed

    2015-01-01

    Genomic information on tumors from 50 cancer types cataloged by the International Cancer Genome Consortium (ICGC) shows that only a few well-studied driver genes are frequently mutated, in contrast to many infrequently mutated genes that may also contribute to tumor biology. Hence there has been...

  4. Pfh1 Is an Accessory Replicative Helicase that Interacts with the Replisome to Facilitate Fork Progression and Preserve Genome Integrity.

    Directory of Open Access Journals (Sweden)

    Karin R McDonald

    2016-09-01

    Full Text Available Replicative DNA helicases expose the two strands of the double helix to the replication apparatus, but accessory helicases are often needed to help forks move past naturally occurring hard-to-replicate sites, such as tightly bound proteins, RNA/DNA hybrids, and DNA secondary structures. Although the Schizosaccharomyces pombe 5'-to-3' DNA helicase Pfh1 is known to promote fork progression, its genomic targets, dynamics, and mechanisms of action are largely unknown. Here we address these questions by integrating genome-wide identification of Pfh1 binding sites, comprehensive analysis of the effects of Pfh1 depletion on replication and DNA damage, and proteomic analysis of Pfh1 interaction partners by immunoaffinity purification mass spectrometry. Of the 621 high confidence Pfh1-binding sites in wild type cells, about 40% were sites of fork slowing (as marked by high DNA polymerase occupancy) and/or DNA damage (as marked by high levels of phosphorylated H2A). The replication and integrity of tRNA and 5S rRNA genes, highly transcribed RNA polymerase II genes, and nucleosome depleted regions were particularly Pfh1-dependent. The association of Pfh1 with genomic integrity at highly transcribed genes was S phase dependent, and thus unlikely to be an artifact of high transcription rates. Although Pfh1 affected replication and suppressed DNA damage at discrete sites throughout the genome, Pfh1 and the replicative DNA polymerase bound to similar extents to both Pfh1-dependent and independent sites, suggesting that Pfh1 is proximal to the replication machinery during S phase. Consistent with this interpretation, Pfh1 co-purified with many key replisome components, including the hexameric MCM helicase, replicative DNA polymerases, RPA, and the processivity clamp PCNA in an S phase dependent manner. Thus, we conclude that Pfh1 is an accessory DNA helicase that interacts with the replisome and promotes replication and suppresses DNA damage at hard

  5. Identification of probable genomic packaging signal sequence from SARS-CoV genome by bioinformatics analysis

    Institute of Scientific and Technical Information of China (English)

    QIN Lei; XIONG Bin; LUO Cheng; GUO Zong-Ming; HAO Pei; SU Jiong; NAN Peng; FENG Ying; SHI Yi-Xiang; YU Xiao-Jing; LUO Xiao-Min; CHEN Kai-Xian; SHEN Xu; SHEN Jian-Hua; ZOU Jian-Ping; ZHAO Guo-Ping; SHI Tie-Liu; HE Wei-Zhong; ZHONG Yang; JIANG Hua-Liang; LI Yi-Xue

    2003-01-01

    AIM: To predict the probable genomic packaging signal of SARS-CoV by bioinformatics analysis. The derived packaging signal may be used to design antisense RNA and RNA interference (RNAi) drugs for treating SARS. METHODS: Based on studies of the genomic packaging signals of MHV and BCoV, especially information about their primary and secondary structures, the putative genomic packaging signal of SARS-CoV was analyzed using bioinformatics tools. Multiple alignment of the genomic sequences was performed among SARS-CoV, MHV, BCoV, PEDV and HCoV 229E. Secondary structures of the RNA sequences were also predicted to identify possible genomic packaging signals. Meanwhile, the N and M proteins of all five viruses were analyzed to study their evolutionary relationship with the genomic packaging signals. RESULTS: The putative genomic packaging signal of SARS-CoV is located at the 3′ end of ORF1b, near that of MHV and BCoV, which is the most variable region of this gene. The RNA secondary structure of the SARS-CoV genomic packaging signal is very similar to that of MHV and BCoV. The same result was obtained for the genomic packaging signals of PEDV and HCoV 229E. Furthermore, the multiple alignment of genomic sequences indicated that the locations of the packaging signals of SARS-CoV, PEDV, and HCoV overlapped each other. The mutation rate of the packaging signal sequences appears to be much higher than that of the N protein, while the M protein shows only subtle variations. CONCLUSIONS: The probable genomic packaging signal of SARS-CoV is analogous to that of MHV and BCoV, with the corresponding RNA secondary structure located in a similar region of ORF1b. The positions where genomic packaging signals exist have undergone rounds of mutation, which may consequently influence the primary structures of the N and M proteins.

  6. Genome analysis methods - PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available Year of genome analysis; Sequencing method; Read counts; Covered genome region; ...otation method; Number of predicted genes; Genome database informati...

  7. GenomePeek—an online tool for prokaryotic genome and metagenome analysis

    Directory of Open Access Journals (Sweden)

    Katelyn McNair

    2015-06-01

    Full Text Available As more and more prokaryotic sequencing takes place, a method to quickly and accurately analyze these data is needed. Previous tools are mainly designed for metagenomic analysis and have limitations, such as long runtimes and significant false positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single genome and metagenome sequencing files, quickly and with low error rates. GenomePeek uses a sequence assembly approach in which reads matching a set of conserved genes are extracted, assembled and then aligned against a highly specific reference database. GenomePeek was found to be faster than traditional approaches while still keeping error rates low, as well as offering unique data visualization options.

  8. Genomic analysis of plant chromosomes based on meiotic pairing

    Directory of Open Access Journals (Sweden)

    Lisete Chamma Davide

    2007-12-01

    Full Text Available This review presents the principles and applications of classical genomic analysis, with emphasis on plant breeding. The main mathematical models used to estimate preferential chromosome pairing in diploid or polyploid, interspecific or intergeneric hybrids are presented and discussed, with special reference to applications and studies that define genome relationships among species of the Poaceae family.

  9. Initial sequencing and analysis of the human genome.

    Science.gov (United States)

    Lander, E S; Linton, L M; Birren, B; Nusbaum, C; Zody, M C; Baldwin, J; Devon, K; Dewar, K; Doyle, M; FitzHugh, W; Funke, R; Gage, D; Harris, K; Heaford, A; Howland, J; Kann, L; Lehoczky, J; LeVine, R; McEwan, P; McKernan, K; Meldrim, J; Mesirov, J P; Miranda, C; Morris, W; Naylor, J; Raymond, C; Rosetti, M; Santos, R; Sheridan, A; Sougnez, C; Stange-Thomann, Y; Stojanovic, N; Subramanian, A; Wyman, D; Rogers, J; Sulston, J; Ainscough, R; Beck, S; Bentley, D; Burton, J; Clee, C; Carter, N; Coulson, A; Deadman, R; Deloukas, P; Dunham, A; Dunham, I; Durbin, R; French, L; Grafham, D; Gregory, S; Hubbard, T; Humphray, S; Hunt, A; Jones, M; Lloyd, C; McMurray, A; Matthews, L; Mercer, S; Milne, S; Mullikin, J C; Mungall, A; Plumb, R; Ross, M; Shownkeen, R; Sims, S; Waterston, R H; Wilson, R K; Hillier, L W; McPherson, J D; Marra, M A; Mardis, E R; Fulton, L A; Chinwalla, A T; Pepin, K H; Gish, W R; Chissoe, S L; Wendl, M C; Delehaunty, K D; Miner, T L; Delehaunty, A; Kramer, J B; Cook, L L; Fulton, R S; Johnson, D L; Minx, P J; Clifton, S W; Hawkins, T; Branscomb, E; Predki, P; Richardson, P; Wenning, S; Slezak, T; Doggett, N; Cheng, J F; Olsen, A; Lucas, S; Elkin, C; Uberbacher, E; Frazier, M; Gibbs, R A; Muzny, D M; Scherer, S E; Bouck, J B; Sodergren, E J; Worley, K C; Rives, C M; Gorrell, J H; Metzker, M L; Naylor, S L; Kucherlapati, R S; Nelson, D L; Weinstock, G M; Sakaki, Y; Fujiyama, A; Hattori, M; Yada, T; Toyoda, A; Itoh, T; Kawagoe, C; Watanabe, H; Totoki, Y; Taylor, T; Weissenbach, J; Heilig, R; Saurin, W; Artiguenave, F; Brottier, P; Bruls, T; Pelletier, E; Robert, C; Wincker, P; Smith, D R; Doucette-Stamm, L; Rubenfield, M; Weinstock, K; Lee, H M; Dubois, J; Rosenthal, A; Platzer, M; Nyakatura, G; Taudien, S; Rump, A; Yang, H; Yu, J; Wang, J; Huang, G; Gu, J; Hood, L; Rowen, L; Madan, A; Qin, S; Davis, R W; Federspiel, N A; Abola, A P; Proctor, M J; Myers, R M; Schmutz, J; Dickson, M; Grimwood, J; Cox, D R; Olson, M V; Kaul, R; Raymond, C; Shimizu, N; 
Kawasaki, K; Minoshima, S; Evans, G A; Athanasiou, M; Schultz, R; Roe, B A; Chen, F; Pan, H; Ramser, J; Lehrach, H; Reinhardt, R; McCombie, W R; de la Bastide, M; Dedhia, N; Blöcker, H; Hornischer, K; Nordsiek, G; Agarwala, R; Aravind, L; Bailey, J A; Bateman, A; Batzoglou, S; Birney, E; Bork, P; Brown, D G; Burge, C B; Cerutti, L; Chen, H C; Church, D; Clamp, M; Copley, R R; Doerks, T; Eddy, S R; Eichler, E E; Furey, T S; Galagan, J; Gilbert, J G; Harmon, C; Hayashizaki, Y; Haussler, D; Hermjakob, H; Hokamp, K; Jang, W; Johnson, L S; Jones, T A; Kasif, S; Kaspryzk, A; Kennedy, S; Kent, W J; Kitts, P; Koonin, E V; Korf, I; Kulp, D; Lancet, D; Lowe, T M; McLysaght, A; Mikkelsen, T; Moran, J V; Mulder, N; Pollara, V J; Ponting, C P; Schuler, G; Schultz, J; Slater, G; Smit, A F; Stupka, E; Szustakowki, J; Thierry-Mieg, D; Thierry-Mieg, J; Wagner, L; Wallis, J; Wheeler, R; Williams, A; Wolf, Y I; Wolfe, K H; Yang, S P; Yeh, R F; Collins, F; Guyer, M S; Peterson, J; Felsenfeld, A; Wetterstrand, K A; Patrinos, A; Morgan, M J; de Jong, P; Catanese, J J; Osoegawa, K; Shizuya, H; Choi, S; Chen, Y J; Szustakowki, J

    2001-02-15

    The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

  10. Analysis of Simple Sequence Repeats in Genomes of Rhizobia

    Institute of Scientific and Technical Information of China (English)

    GAO Ya-mei; HAN Yi-qiang; TANG Hui; SUN Dong-mei; WANG Yan-jie; WANG Wei-dong

    2008-01-01

    Simple sequence repeats (SSRs), or microsatellites, are genetic markers that are ubiquitous in the genomes of various organisms. The analysis of SSRs in rhizobial genomes provides useful information for a variety of applications in the population genetics of rhizobia. We analyzed the occurrence, relative abundance, and relative density of SSRs, and which were the most common, in the sequenced Bradyrhizobium japonicum, Mesorhizobium loti, and Sinorhizobium meliloti genomes in the microorganisms tandem repeats database, and compared the SSRs of the three genomes with each other. The results showed that there were 1,410, 859, and 638 SSRs in the B. japonicum, M. loti, and S. meliloti genomes, respectively. In the genomes of B. japonicum, M. loti, and S. meliloti, tetranucleotide, pentanucleotide, and hexanucleotide repeats were more abundant and indicated higher mutation rates in these species. Mononucleotide repeats were the least abundant. The SSR types and distribution were similar among these species.
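
    Counting SSRs by repeat-unit length, as the abstract describes, can be sketched as a scan for perfect tandem repeats. This is only an illustration: the abstract does not state the thresholds or the tool used, so the parameters max_unit and min_repeats, and the helper names is_primitive and find_ssrs, are assumptions made here.

```python
from collections import Counter


def is_primitive(unit):
    """True if unit is not itself a repeat of a shorter motif (e.g. 'ATAT')."""
    n = len(unit)
    return not any(
        n % k == 0 and unit[:k] * (n // k) == unit for k in range(1, n)
    )


def find_ssrs(seq, max_unit=6, min_repeats=3):
    """Return sorted (start, unit, repeat_count) for perfect tandem repeats.

    max_unit and min_repeats are illustrative defaults, not values from
    the study summarized above.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        i = 0
        while i + k <= len(seq):
            unit = seq[i:i + k]
            repeats = 1
            # Extend while the next k bases repeat the same unit.
            while seq[i + repeats * k:i + (repeats + 1) * k] == unit:
                repeats += 1
            if (repeats >= min_repeats and is_primitive(unit)
                    and set(unit) <= set("ACGT")):
                hits.append((i, unit, repeats))
                i += repeats * k  # skip past the repeat just recorded
            else:
                i += 1
    return sorted(hits)


def count_by_unit_length(hits):
    """Tally SSRs by motif length (1 = mononucleotide, 2 = dinucleotide, ...)."""
    return Counter(len(unit) for _, unit, _ in hits)


if __name__ == "__main__":
    hits = find_ssrs("GGATATATATGCAGCAGCAGTT")
    print(hits)
    print(count_by_unit_length(hits))
```

    Dividing such counts by genome length would give a relative abundance of the kind the abstract reports, although the study's exact normalization is not given here.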

  11. Analysis of intra-genomic GC content homogeneity within prokaryotes

    DEFF Research Database (Denmark)

    Bohlin, J; Snipen, L; Hardy, S.P.

    2010-01-01

    both aerobic and facultative microbes. Although an association has previously been found between mean genomic GC content and oxygen requirement, our analysis suggests that no such association exists when phylogenetic bias is accounted for. A significant association between GCVAR and mean GC content......Bacterial genomes possess varying GC content (total guanines (Gs) and cytosines (Cs) per total of the four bases within the genome) but within a given genome, GC content can vary locally along the chromosome, with some regions significantly more or less GC rich than on average. We have examined how...... the GC content varies within microbial genomes to assess whether this property can be associated with certain biological functions related to the organism's environment and phylogeny. We utilize a new quantity GCVAR, the intra-genomic GC content variability with respect to the average GC content...

  12. Digging Up the Human Genome: Current Progress in Deciphering Adverse Drug Reactions

    Directory of Open Access Journals (Sweden)

    Shih-Chi Su

    2014-01-01

    Full Text Available Adverse drug reactions (ADRs) are a major clinical problem. In addition to their clinical impact on human health, ADRs impose an enormous cost on health care and the pharmaceutical industry. A growing number of studies have revealed that genetic variants can determine the susceptibility of individuals to ADRs. The development of modern genomic technologies has led to tremendous advances in improving drug safety and efficacy and minimizing ADRs. This review discusses the pharmacogenomic techniques used to unveil the determinants of ADRs and summarizes current progress in identifying biomarkers for ADRs, with a focus on genetic variants of genes encoding drug-metabolizing enzymes, drug-transporter proteins, and human leukocyte antigen (HLA). The knowledge gained from these cutting-edge findings will form the basis for better prediction and management of ADRs, ultimately making medicine personalized.

  13. Research Progress of Sugarcane Chloroplast Genome

    Institute of Scientific and Technical Information of China (English)

    吴杨; 周会

    2013-01-01

    Along with the development of modern molecular biology technologies, complete chloroplast genomes have been sequenced in various plant species to date, and the structure, function and expression of these genes have been determined. The chloroplast genome structure in most higher plants is stable, since the gene number, arrangement and composition are conserved. The determination of the sugarcane chloroplast genome sequence laid a good foundation for sugarcane chloroplast-related research. This article reviews the research progress on the sugarcane chloroplast genome, covering the chloroplast genome map, gene structure and function, chloroplast RNA editing, and phylogenetic analysis in Saccharum and related genera. It also points out future research directions, including sugarcane chloroplast genetic transformation, determination of complete chloroplast nucleotide sequences in Saccharum and closely related genera, and the development and application of cpSSRs.

  14. Analysis of the Vibrionaceae pan-genome

    OpenAIRE

    Kahlke, Tim

    2013-01-01

    Paper 2 of this thesis is not available in Munin: 2. Tim Kahlke, Alexander Goesmann and Peik Haugen: 'The Vibrionaceae pan-genome hints at gene expression as the major driving force for unequal gene distributions on Vibrionaceae chromosomes' (manuscript) In the presented work the bacterial family Vibrionaceae was used as a model to investigate bacterial diversity on a gene level and to analyze the underlying concepts of bacterial niche adaptation and evolution. For this, the genomes ...

  15. Clonal expansion and linear genome evolution through breast cancer progression from pre-invasive stages to asynchronous metastasis

    DEFF Research Database (Denmark)

    Krøigård, Anne Bruun; Larsen, Martin Jakob; Lænkholm, Anne Vibeke;

    2015-01-01

    Evolution of the breast cancer genome from pre-invasive stages to asynchronous metastasis is complex and mostly unexplored, but highly demanded as it may provide novel markers for and mechanistic insights in cancer progression. The increasing use of personalized therapy of breast cancer...... progression from one breast cancer patient, including two different regions of Ductal Carcinoma In Situ (DCIS), primary tumor and an asynchronous metastasis. We identify a remarkable landscape of somatic mutations, retained throughout breast cancer progression and with new mutational events emerging at each...

  16. Application of genomics-assisted breeding for generation of climate resilient crops: Progress and prospects

    Directory of Open Access Journals (Sweden)

    Chittaranjan Kole

    2015-08-01

    Full Text Available Climate change affects agricultural productivity worldwide. Increased prices of food commodities are the initial indication of drastic edible yield loss, which is expected to surge further due to global warming. This situation has compelled plant scientists to develop climate change-resilient crops, which can withstand broad-spectrum stresses such as drought, heat, cold, salinity, flood and submergence, and pests, along with increased productivity. Genomics appears to be a promising tool for deciphering the stress responsiveness of crop species with adaptation traits or in wild relatives towards identifying underlying genes, alleles or quantitative trait loci. Molecular breeding approaches have proven helpful in enhancing the stress adaptation of crop plants, and recent advances in next-generation sequencing along with high-throughput sequencing and phenotyping platforms have transformed molecular breeding into genomics-assisted breeding (GAB). In view of this, the present review elaborates the progress and prospects of GAB in improving climate change resilience in crop plants towards circumventing global food insecurity.

  17. Genomics of Ovarian Cancer Progression Reveals Diverse Metastatic Trajectories Including Intraepithelial Metastasis to the Fallopian Tube.

    Science.gov (United States)

    Eckert, Mark A; Pan, Shawn; Hernandez, Kyle M; Loth, Rachel M; Andrade, Jorge; Volchenboum, Samuel L; Faber, Pieter; Montag, Anthony; Lastra, Ricardo; Peter, Marcus E; Yamada, S Diane; Lengyel, Ernst

    2016-12-01

    Accumulating evidence has supported the fallopian tube rather than the ovary as the origin for high-grade serous ovarian cancer (HGSOC). To understand the relationship between putative precursor lesions and metastatic tumors, we performed whole-exome sequencing on specimens from eight HGSOC patient progression series consisting of serous tubal intraepithelial carcinomas (STIC), invasive fallopian tube lesions, invasive ovarian lesions, and omental metastases. Integration of copy number and somatic mutations revealed patient-specific patterns with similar mutational signatures and copy-number variation profiles across all anatomic sites, suggesting that genomic instability is an early event in HGSOC. Phylogenetic analyses supported STIC as precursor lesions in half of our patient cohort, but also identified STIC as metastases in 2 patients. Ex vivo assays revealed that HGSOC spheroids can implant in the fallopian tube epithelium and mimic STIC lesions. That STIC may represent metastases calls into question the assumption that STIC are always indicative of primary fallopian tube cancers. We find that the putative precursor lesions for HGSOC, STIC, possess most of the genomic aberrations present in advanced cancers. In addition, a proportion of STIC represent intraepithelial metastases to the fallopian tube rather than the origin of HGSOC. Cancer Discov; 6(12); 1342-51. ©2016 AACR. See related commentary by Swisher et al., p. 1309. This article is highlighted in the In This Issue feature, p. 1293.

  18. Whole Genome Amplification in Genomic Analysis of Single Circulating Tumor Cells.

    Science.gov (United States)

    Gasch, Christin; Pantel, Klaus; Riethdorf, Sabine

    2015-01-01

    Investigating the genomes of organisms is one of the foundations of molecular biology for understanding the complex organization of cells. While genomic DNA can easily be isolated from tissues or cell cultures of plant, animal or human origin, DNA extraction from single cells is still challenging. Here, we describe three techniques for the amplification of genomic DNA of fixed single circulating tumor cells (CTC) isolated from the blood of cancer patients. This amplification aims to increase DNA amounts from those of one cell to yields sufficient for different DNA analyses such as mutational analysis including next-generation sequencing, array comparative genomic hybridization (CGH), and quantitative measurement of gene amplifications. Molecular analysis of CTC as a liquid biopsy can be used to identify therapeutic targets in personalized medicine directed, e.g., against human epidermal growth factor receptor 2 (HER2) or epidermal growth factor receptor (EGFR), and to stratify patients to those therapies.

  19. Nonlinear Progressive Collapse Analysis Including Distributed Plasticity

    OpenAIRE

    Mohamed Osama Ahmed; Imam Zubair Syed; Khattab Rania

    2016-01-01

    This paper demonstrates the effect of incorporating distributed plasticity in nonlinear analytical models used to assess the potential for progressive collapse of steel-framed regular building structures. The emphasis of this paper is on the deformation response under the notionally removed column, in a typical Alternate Path (AP) method. The AP method employed in this paper is based on the provisions of the Unified Facilities Criteria – Design of Buildings to Resist Progressive Collapse, develop...

  20. Chromosomes in the flow to simplify genome analysis.

    Science.gov (United States)

    Doležel, Jaroslav; Vrána, Jan; Safář, Jan; Bartoš, Jan; Kubaláková, Marie; Simková, Hana

    2012-08-01

    Nuclear genomes of humans, animals, and plants are organized into subunits called chromosomes. When isolated into aqueous suspension, mitotic chromosomes can be classified using flow cytometry according to light scatter and fluorescence parameters. Chromosomes of interest can be purified by flow sorting if they can be resolved from other chromosomes in a karyotype. The analysis and sorting are carried out at rates of 10^2 to 10^4 chromosomes per second, and for complex genomes such as wheat the flow sorting technology has been ground-breaking in reducing genome complexity for genome sequencing. The high sample rate provides an attractive approach for karyotype analysis (flow karyotyping) and the purification of chromosomes in large numbers. In characterizing the chromosome complement of an organism, the high number that can be studied using flow cytometry allows for a statistically accurate analysis. Chromosome sorting plays a particularly important role in the analysis of nuclear genome structure and the analysis of particular and aberrant chromosomes. Other attractive but not well-explored features include the analysis of chromosomal proteins, chromosome ultrastructure, and high-resolution mapping using FISH. Recent results demonstrate that chromosome flow sorting can be coupled seamlessly with DNA array and next-generation sequencing technologies for high-throughput analyses. The main advantages are targeting the analysis to a genome region of interest and a significant reduction in sample complexity. As flow sorters can also sort single copies of chromosomes, shotgun sequencing DNA amplified from them enables the production of haplotype-resolved genome sequences. This review explains the principles of flow cytometric chromosome analysis and sorting (flow cytogenetics), discusses the major uses of this technology in genome analysis, and outlines future directions.

  1. Analysis of intra-genomic GC content homogeneity within prokaryotes

    Directory of Open Access Journals (Sweden)

    Bohlin Jon

    2010-08-01

    Full Text Available Abstract Background Bacterial genomes possess varying GC content (total guanines (Gs) and cytosines (Cs) per total of the four bases within the genome), but within a given genome, GC content can vary locally along the chromosome, with some regions significantly more or less GC rich than on average. We have examined how the GC content varies within microbial genomes to assess whether this property can be associated with certain biological functions related to the organism's environment and phylogeny. We utilize a new quantity, GCVAR, the intra-genomic GC content variability with respect to the average GC content of the total genome. A low GCVAR indicates intra-genomic GC homogeneity and a high GCVAR indicates heterogeneity. Results The regression analyses indicated that GCVAR was significantly associated with domain (i.e. archaea or bacteria), phylum, and oxygen requirement. GCVAR was significantly higher among anaerobes than among both aerobic and facultative microbes. Although an association has previously been found between mean genomic GC content and oxygen requirement, our analysis suggests that no such association exists when phylogenetic bias is accounted for. A significant association between GCVAR and mean GC content was also found but appears to be non-linear and varies greatly among phyla. Conclusions Our findings show that GCVAR is linked with oxygen requirement, while mean genomic GC content is not. We therefore suggest that GCVAR should be used as a complement to mean GC content.
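
    As a rough illustration of the idea behind GCVAR, the sketch below computes GC content in non-overlapping windows and summarizes how far the windows stray from the genome-wide mean. The abstract does not give the exact GCVAR formula, so the root-mean-square deviation used here, the window size, and the function names are assumptions made for illustration only.

```python
from math import sqrt


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the A/C/G/T bases in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def gc_variability(genome: str, window: int = 100) -> float:
    """RMS deviation of windowed GC content from the genome-wide mean.

    This is an assumed stand-in for GCVAR, not the published definition.
    """
    mean_gc = gc_fraction(genome)
    # Non-overlapping windows; a trailing partial window is ignored.
    windows = [genome[i:i + window]
               for i in range(0, len(genome) - window + 1, window)]
    if not windows:
        return 0.0
    squared = [(gc_fraction(w) - mean_gc) ** 2 for w in windows]
    return sqrt(sum(squared) / len(squared))


# Toy sequences (hypothetical): both have a mean GC content of 0.5.
homogeneous = "ACGT" * 100                 # every window has GC = 0.5
heterogeneous = "AT" * 100 + "GC" * 100    # AT-only half, then GC-only half
```

    Both toy sequences share the same mean GC content, yet only the second shows window-to-window variation, which is the kind of difference a variability measure is meant to capture and a mean cannot.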

  2. Comparative genomic analysis of eutherian interferon-γ-inducible GTPases.

    Science.gov (United States)

    Premzl, Marko

    2012-11-01

    The interferon-γ-inducible GTPases, IFGGs, are intracellular proteins involved in immune response against pathogens. A comprehensive comparative genomic review and analysis of eutherian IFGGs was carried out using public genomic sequences. The 64 eutherian IFGG genes were examined in detail and annotated. The eutherian IFGG promoter types were first catalogued followed by a phylogenetic analysis of eutherian IFGGs, which described five major IFGG clusters. The patterns of differential gene expansions and protein regions that may regulate IFGG catalytic features suggested a new classification of eutherian IFGGs. This mini-review has also provided new tests of reliability of public genomic sequences as well as tests of protein molecular evolution.

  3. Genome-Wide Association of CKD Progression: The Chronic Renal Insufficiency Cohort Study.

    Science.gov (United States)

    Parsa, Afshin; Kanetsky, Peter A; Xiao, Rui; Gupta, Jayanta; Mitra, Nandita; Limou, Sophie; Xie, Dawei; Xu, Huichun; Anderson, Amanda Hyre; Ojo, Akinlolu; Kusek, John W; Lora, Claudia M; Hamm, L Lee; He, Jiang; Sandholm, Niina; Jeff, Janina; Raj, Dominic E; Böger, Carsten A; Bottinger, Erwin; Salimi, Shabnam; Parekh, Rulan S; Adler, Sharon G; Langefeld, Carl D; Bowden, Donald W; Groop, Per-Henrik; Forsblom, Carol; Freedman, Barry I; Lipkowitz, Michael; Fox, Caroline S; Winkler, Cheryl A; Feldman, Harold I

    2017-03-01

    The rate of decline of renal function varies significantly among individuals with CKD. To understand better the contribution of genetics to CKD progression, we performed a genome-wide association study among participants in the Chronic Renal Insufficiency Cohort Study. Our outcome of interest was CKD progression measured as change in eGFR over time among 1331 blacks and 1476 whites with CKD. We stratified all analyses by race and subsequently, diabetes status. Single-nucleotide polymorphisms (SNPs) that surpassed a significance threshold of P<1×10^-6 for association with eGFR slope were selected as candidates for follow-up and secondarily tested for association with proteinuria and time to ESRD. We identified 12 such SNPs among black patients and six such SNPs among white patients. We were able to conduct follow-up analyses of three candidate SNPs in similar (replication) cohorts and eight candidate SNPs in phenotype-related (validation) cohorts. Among blacks without diabetes, rs653747 in LINC00923 replicated in the African American Study of Kidney Disease and Hypertension cohort (discovery P=5.42×10^-7; replication P=0.039; combined P=7.42×10^-9). This SNP also associated with ESRD (hazard ratio, 2.0 (95% confidence interval, 1.5 to 2.7); P=4.90×10^-6). Similarly, rs931891 in LINC00923 associated with eGFR decline (P=1.44×10^-4) in white patients without diabetes. In summary, SNPs in LINC00923, an RNA gene expressed in the kidney, significantly associated with CKD progression in individuals with nondiabetic CKD. However, the lack of equivalent cohorts hampered replication for most discovery loci. Further replication of our findings in comparable study populations is warranted.

  4. Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

    Science.gov (United States)

    Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

    2015-11-20

    The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium (MGSAC). The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacchus-3.2.1, but the draft genome still contains 187,214 undetermined gap regions, as well as supercontigs and relatively short contigs that are unmapped to chromosomes. We performed resequencing and assembly of the genome of the common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases, including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp, were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs, and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.
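
    For readers unfamiliar with the N50 statistic mentioned above, the following minimal sketch computes it from a list of contig lengths using the usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The function name and the toy contig lengths are illustrative and are not taken from the marmoset assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downward until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Hypothetical contig lengths, in kbp.
fragmented = [2, 3, 4, 5, 6, 7, 8, 9, 10]
merged = [2 + 3 + 4 + 5, 6 + 7, 8, 9, 10]  # same total, fewer and longer contigs
```

    Merging short contigs leaves the total length unchanged but raises the N50, which is why the statistic is commonly used to track improvements in assembly contiguity.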

  5. Mycobacterial species as case-study of comparative genome analysis.

    Science.gov (United States)

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-02-08

    The genus Mycobacterium comprises more than 120 species, including important human pathogens that cause major public health problems. Further, with more than 100 genome sequences from this genus, comparative genome analysis can provide new insights for better understanding the evolutionary events of these species and improving drugs, vaccines, and diagnostic tools for controlling Mycobacterial diseases. In the present study we outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K-10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435, M. ulcerans Agy99, and M. vanbaalenii PYR-1. For this purpose, the genomes were compared by length, GC content, and number of genes in different databases (GenBank, RefSeq, and Prodigal). A BLAST matrix of these genomes was constructed to summarize the similarity between species in a simple scheme. As a result of the multiple genome analysis, the pan and core genomes were defined for twelve Mycobacterial species. We also present the genome atlas of the reference strain M. tuberculosis H37Rv, which gives a good overview of this genome. To examine the phylogenetic relationships among these bacteria, a phylogenetic tree was constructed from the 16S rRNA gene of tuberculous and non-tuberculous Mycobacteria to understand the evolutionary events of these species.
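
    The pan and core genome concepts mentioned above can be sketched with plain set operations: the pan genome is the union of gene families found in any strain, and the core genome is the intersection found in every strain. The strain labels and gene family names below are hypothetical, and the sketch assumes that genes have already been grouped into families (for example from BLAST hits), a step it does not perform.

```python
def pan_and_core(genomes):
    """Return (pan, core) gene-family sets for a dict of strain -> families."""
    gene_sets = [set(families) for families in genomes.values()]
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)         # present in at least one strain
    core = set.intersection(*gene_sets)   # present in every strain
    return pan, core


# Hypothetical toy input; real analyses involve thousands of families.
toy_genomes = {
    "strainA": {"fam1", "fam2", "fam3"},
    "strainB": {"fam1", "fam2", "fam4"},
    "strainC": {"fam1", "fam2"},
}
pan, core = pan_and_core(toy_genomes)
```

    Families in the pan genome but not in the core (here fam3 and fam4) form the variable part of the gene pool, which is where strain-specific traits are usually sought.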

  6. Draft Genome Sequences of Two Propionibacterium acnes Strains Isolated from Progressive Macular Hypomelanosis Lesions of Human Skin

    DEFF Research Database (Denmark)

    Petersen, Rolf; Lomholt, Hans B.; Scholz, Christian F. P.;

    2015-01-01

    Propionibacterium acnes is a Gram-positive bacterium that is prevalent on human skin. It has been associated with skin disorders such as acne vulgaris and progressive macular hypomelanosis (PMH). Here, we report draft genome sequences of two type III P. acnes strains, PMH5 and PMH7, isolated from...

  7. Clonal expansion and linear genome evolution through breast cancer progression from pre-invasive stages to asynchronous metastasis

    DEFF Research Database (Denmark)

    Krøigård, Anne Bruun; Larsen, Martin Jakob; Lænkholm, Anne-Vibeke

    2015-01-01

    Evolution of the breast cancer genome from pre-invasive stages to asynchronous metastasis is complex and mostly unexplored, but highly demanded as it may provide novel markers for and mechanistic insights in cancer progression. The increasing use of personalized therapy of breast cancer necessita...

  9. Hyperstructures, genome analysis and I-cells

    DEFF Research Database (Denmark)

    Amar, P.; Ballet, P.; Barlovatz-Meimon, G.

    2002-01-01

    New concepts may prove necessary to profit from the avalanche of sequence data on the genome, transcriptome, proteome and interactome and to relate this information to cell physiology. Here, we focus on the concept of large activity-based structures, or hyperstructures, in which a variety of type...

  10. Whole genome sequencing analysis of Plasmodium vivax using whole genome capture

    Directory of Open Access Journals (Sweden)

    Bright A

    2012-06-01

    Full Text Available Abstract Background Malaria caused by Plasmodium vivax is an experimentally neglected severe disease with a substantial burden on human health. Because of technical limitations, little is known about the biology of this important human pathogen. Whole genome analysis methods on patient-derived material are thus likely to have a substantial impact on our understanding of P. vivax pathogenesis and epidemiology. For example, it will allow study of the evolution and population biology of the parasite, allow parasite transmission patterns to be characterized, and may facilitate the identification of new drug resistance genes. Because parasitemias are typically low and the parasite cannot be readily cultured, on-site leukocyte depletion of blood samples is typically needed to remove human DNA that may be 1000X more abundant than parasite DNA. These features have precluded the analysis of archived blood samples and require the presence of laboratories in close proximity to the collection of field samples for optimal pre-cryopreservation sample preparation. Results Here we show that in-solution hybridization capture can be used to extract P. vivax DNA from human contaminating DNA in the laboratory without the need for on-site leukocyte filtration. Using a whole genome capture method, we were able to enrich P. vivax DNA from bulk genomic DNA from less than 0.5% to a median of 55% (range 20%-80%). This level of enrichment allows for efficient analysis of the samples by whole genome sequencing and does not introduce any gross biases into the data. With this method, we obtained greater than 5X coverage across 93% of the P. vivax genome for four P. vivax strains from Iquitos, Peru, which is similar to our results using leukocyte filtration (greater than 5X coverage across 96%). Conclusion The whole genome capture technique will enable more efficient whole genome analysis of P. vivax from a larger geographic region and from valuable archived sample collections.

  11. High-resolution array comparative genomic hybridization of chromosome 8q: evaluation of putative progression markers for gastroesophageal junction adenocarcinomas.

    Science.gov (United States)

    van Duin, M; van Marion, R; Vissers, K J; Hop, W C J; Dinjens, W N M; Tilanus, H W; Siersema, P D; van Dekken, H

    2007-01-01

    Amplification of 8q is frequently found in gastroesophageal junction (GEJ) cancer. It is usually detected in high-grade, high-stage GEJ adenocarcinomas. Moreover, it has been implicated in tumor progression in other cancer types. In this study, a detailed genomic analysis of 8q was performed on a series of GEJ adenocarcinomas, including 22 primary adenocarcinomas, 13 cell lines and two xenografts, by array comparative genomic hybridization (aCGH) with a whole chromosome 8q contig array. Of the 37 specimens, 21 originated from the esophagus and 16 were derived from the gastric cardia. Commonly overrepresented regions were identified at distal 8q, i.e. 124-125 Mb (8q24.13), at 127-128 Mb (8q24.21), and at 141-142 Mb (8q24.3). From these regions six genes were selected with putative relevance to cancer: ANXA13, MTSS1, FAM84B (alias NSE2), MYC, C8orf17 (alias MOST-1) and PTK2 (alias FAK). In addition, the gene EXT1 was selected since it was found in a specific amplification in cell line SK-GT-5. Quantitative RT-PCR analysis of these seven genes was subsequently performed on a panel of 24 gastroesophageal samples, including 13 cell lines, two xenografts and nine normal stomach controls. Significant overexpression was found for MYC and EXT1 in GEJ adenocarcinoma cell lines and xenografts compared to normal controls. Expression of the genes MTSS1, FAM84B and C8orf17 was found to be significantly decreased in this set of cell lines and xenografts. We conclude that, firstly, genes other than MYC are involved in the 8q amplification in GEJ cancer. Secondly, the differential expression of these genes contributes to unraveling the biology of GEJ adenocarcinomas.

  12. Nonlinear Progressive Collapse Analysis Including Distributed Plasticity

    Directory of Open Access Journals (Sweden)

    Mohamed Osama Ahmed

    2016-01-01

    Full Text Available This paper demonstrates the effect of incorporating distributed plasticity in nonlinear analytical models used to assess the potential for progressive collapse of steel framed regular building structures. The emphasis of this paper is on the deformation response under the notionally removed column in a typical Alternate Path (AP) method. The AP method employed in this paper is based on the provisions of the Unified Facilities Criteria – Design of Buildings to Resist Progressive Collapse, developed and updated by the U.S. Department of Defense [1]. The AP method is often used to assess the potential for progressive collapse of building structures that fall under Occupancy Category III or IV. A case study steel building is used to examine the effect of incorporating distributed plasticity, where moment frames were used on the perimeter as well as the interior of the three-dimensional structural system. It is concluded that the use of moment resisting frames within the structural system will enhance resistance to progressive collapse through ductile deformation response and that it is conservative to ignore the effects of distributed plasticity in determining peak displacement response under the notionally removed column.

  13. Progress Testing: Critical Analysis and Suggested Practices

    Science.gov (United States)

    Albanese, Mark; Case, Susan M.

    2016-01-01

    Educators have long lamented the tendency of students to engage in rote memorization in preparation for tests rather than engaging in deep learning where they attempt to gain meaning from their studies. Rote memorization driven by objective exams has been termed a steering effect. Progress testing (PT), in which a comprehensive examination…

  14. Genetic Association Analysis of Drusen Progression

    NARCIS (Netherlands)

    Hoffman, J.D.; Grinsven, M.J.J.P. van; Li, C.; Brantley, M., Jr.; McGrath, J.; Agarwal, A.; Scott, W.K.; Schwartz, S.G.; Kovach, J.; Pericak-Vance, M.; Sanchez, C.I.; Haines, J.L.

    2016-01-01

    PURPOSE: Age-related macular degeneration is a common form of vision loss affecting older adults. The etiology of AMD is multifactorial and is influenced by environmental and genetic risk factors. In this study, we examine how 19 common risk variants contribute to drusen progression, a hallmark of

  15. Genomics of Sorghum

    OpenAIRE

    PATERSON, ANDREW H

    2008-01-01

    Sorghum (Sorghum bicolor (L.) Moench) is a subject of plant genomics research based on its importance as one of the world's leading cereal crops, a biofuels crop of high and growing importance, a progenitor of one of the world's most noxious weeds, and a botanical model for many tropical grasses with complex genomes. A rich history of genome analysis, culminating in the recent complete sequencing of the genome of a leading inbred, provides a foundation for invigorating progress toward relatin...

  16. Bovine Genome Database: supporting community annotation and analysis of the Bos taurus genome

    Directory of Open Access Journals (Sweden)

    Childs Kevin L

    2010-11-01

    Full Text Available Abstract Background A goal of the Bovine Genome Database (BGD; http://BovineGenome.org) has been to support the Bovine Genome Sequencing and Analysis Consortium (BGSAC) in the annotation and analysis of the bovine genome. We were faced with several challenges, including the need to maintain consistent quality despite diversity in annotation expertise in the research community, the need to maintain consistent data formats, and the need to minimize the potential duplication of annotation effort. With new sequencing technologies allowing many more eukaryotic genomes to be sequenced, the demand for collaborative annotation is likely to increase. Here we present our approach, challenges and solutions facilitating a large distributed annotation project. Results and Discussion BGD has provided annotation tools that supported 147 members of the BGSAC in contributing 3,871 gene models over a fifteen-week period, and these annotations have been integrated into the bovine Official Gene Set. Our approach has been to provide an annotation system, which includes a BLAST site, multiple genome browsers, an annotation portal, and the Apollo Annotation Editor configured to connect directly to our Chado database. In addition to implementing and integrating components of the annotation system, we have performed computational analyses to create gene evidence tracks and a consensus gene set, which can be viewed on individual gene pages at BGD. Conclusions We have provided annotation tools that alleviate challenges associated with distributed annotation. Our system provides a consistent set of data to all annotators and eliminates the need for annotators to format data. Involving the bovine research community in genome annotation has allowed us to leverage expertise in various areas of bovine biology to provide biological insight into the genome sequence.

  17. Microarray comparative genomic hybridisation analysis incorporating genomic organisation, and application to enterobacterial plant pathogens.

    Directory of Open Access Journals (Sweden)

    Leighton Pritchard

    2009-08-01

    Full Text Available Microarray comparative genomic hybridisation (aCGH) provides an estimate of the relative abundance of genomic DNA (gDNA) taken from comparator and reference organisms by hybridisation to a microarray containing probes that represent sequences from the reference organism. The experimental method is used in a number of biological applications, including the detection of human chromosomal aberrations, and in comparative genomic analysis of bacterial strains, but optimisation of the analysis is desirable in each problem domain. We present a method for analysis of bacterial aCGH data that encodes spatial information from the reference genome in a hidden Markov model. This technique is the first such method to be validated in comparisons of sequenced bacteria that diverge at the strain and at the genus level: Pectobacterium atrosepticum SCRI1043 (Pba1043) and Dickeya dadantii 3937 (Dda3937); and Lactococcus lactis subsp. lactis IL1403 and L. lactis subsp. cremoris MG1363. In all cases our method is found to outperform common and widely used aCGH analysis methods that do not incorporate spatial information. This analysis is applied to comparisons between commercially important plant pathogenic soft-rotting enterobacteria (SRE) Pba1043, P. atrosepticum SCRI1039, P. carotovorum 193, and Dda3937. Our analysis indicates that it should not be assumed that hybridisation strength is a reliable proxy for sequence identity in aCGH experiments, and robustly extends the applicability of aCGH to bacterial comparisons at the genus level. Our results in the SRE further provide evidence for a dynamic, plastic 'accessory' genome, revealing major genomic islands encoding gene products that provide insight into, and may play a direct role in determining, variation amongst the SRE in terms of their environmental survival, host range and aetiology, such as phytotoxin synthesis, multidrug resistance, and nitrogen fixation.

  18. Copy Number Variation Analysis by Array Analysis of Single Cells Following Whole Genome Amplification.

    Science.gov (United States)

    Dimitriadou, Eftychia; Zamani Esteki, Masoud; Vermeesch, Joris Robert

    2015-01-01

    Whole genome amplification is required to ensure the availability of sufficient material for copy number variation analysis of a genome deriving from an individual cell. Here, we describe the protocols we use for copy number variation analysis of non-fixed single cells by array-based approaches following single-cell isolation and whole genome amplification. We are focusing on two alternative protocols, an isothermal and a PCR-based whole genome amplification method, followed by either comparative genome hybridization (aCGH) or SNP array analysis, respectively.

  19. MIPS: analysis and annotation of proteins from whole genomes.

    Science.gov (United States)

    Mewes, H W; Amid, C; Arnold, R; Frishman, D; Güldener, U; Mannhaupt, G; Münsterkötter, M; Pagel, P; Strack, N; Stümpflen, V; Warfsmann, J; Ruepp, A

    2004-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein-protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).

  20. A process for analysis of microarray comparative genomics hybridisation studies for bacterial genomes

    Directory of Open Access Journals (Sweden)

    Woodward Martin J

    2008-01-01

    Full Text Available Abstract Background Microarray based comparative genomic hybridisation (CGH) experiments have been used to study numerous biological problems including understanding genome plasticity in pathogenic bacteria. Typically such experiments produce large data sets that are difficult for biologists to handle. Although there are some programmes available for interpretation of bacterial transcriptomics data and CGH microarray data for looking at genetic stability in oncogenes, there are none specifically to understand the mosaic nature of bacterial genomes. Consequently a bottleneck still persists in accurate processing and mathematical analysis of these data. To address this shortfall we have produced a simple and robust CGH microarray data analysis process that may be automated in the future to understand bacterial genomic diversity. Results The process involves five steps: cleaning, normalisation, estimating gene presence and absence or divergence, validation, and analysis of data from test against three reference strains simultaneously. Each stage of the process is described. We have compared a number of methods available for characterising bacterial genomic diversity and for calculating the cut-off between gene presence and absence or divergence, and have shown that a simple dynamic approach using a kernel density estimator performed better than both established methods and a more sophisticated mixture modelling technique. We have also shown that current methods commonly used for CGH microarray analysis in tumour and cancer cell lines are not appropriate for analysing our data. Conclusion After carrying out the analysis and validation for three sequenced Escherichia coli strains, CGH microarray data from 19 E. coli O157 pathogenic test strains were used to demonstrate the benefits of applying this simple and robust process to CGH microarray studies using bacterial genomes.
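
    One way a kernel density estimator can yield a presence/absence cut-off is to smooth the distribution of log ratios and place the cut-off at the density minimum between its two main peaks. The sketch below illustrates that general idea only. The abstract does not describe the authors' actual procedure, so the Gaussian kernel, the fixed bandwidth, the grid search, and all names and values here are assumptions.

```python
from math import exp, pi, sqrt


def gaussian_kde(values, bandwidth):
    """Return a function estimating the density at x with a Gaussian kernel."""
    norm = len(values) * bandwidth * sqrt(2 * pi)

    def density(x):
        return sum(exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm

    return density


def kde_cutoff(values, bandwidth=0.3, grid_points=200):
    """Grid position of the density minimum between the two highest peaks.

    Returns None when the smoothed distribution has fewer than two peaks.
    """
    density = gaussian_kde(values, bandwidth)
    lo = min(values) - 3 * bandwidth   # pad the grid so peaks are interior
    hi = max(values) + 3 * bandwidth
    step = (hi - lo) / (grid_points - 1)
    xs = [lo + i * step for i in range(grid_points)]
    ys = [density(x) for x in xs]
    peaks = [i for i in range(1, grid_points - 1)
             if ys[i - 1] < ys[i] >= ys[i + 1]]
    if len(peaks) < 2:
        return None
    left, right = sorted(sorted(peaks, key=lambda i: ys[i], reverse=True)[:2])
    valley = min(range(left, right + 1), key=lambda i: ys[i])
    return xs[valley]


# Hypothetical log ratios: a low cluster (absent or divergent genes)
# and a cluster near zero (genes present in the test strain).
log_ratios = [-2.2, -2.1, -2.0, -1.9, -1.8,
              -0.2, -0.1, -0.05, 0.0, 0.05, 0.1, 0.2]
cutoff = kde_cutoff(log_ratios)
present = [r for r in log_ratios if r > cutoff]
```

    Because the cut-off is derived from each data set's own distribution rather than fixed in advance, it adapts to array-to-array differences in signal, which is presumably what makes such an approach "dynamic".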

  1. Pan-cancer analysis of ROS1 genomic aberrations

    OpenAIRE

    Wang, Yidan; 王奕丹

    2015-01-01

    The ROS proto-oncogene 1 (ROS1) encodes the ROS1 receptor kinase. ROS1 rearrangements are known to be oncogenic in glioblastoma, non–small-cell lung carcinoma (NSCLC) and cholangiocarcinoma. The clinical relevance of ROS1 genomic aberrations in other human cancers is largely unexamined. Here, we performed a pan-cancer analysis of ROS1 genomic aberrations across 20 cancer sites by interrogating the whole-exome sequencing data of the Cancer Genome Atlas (TCGA) via the cBioportal (www.cbioportal...

  2. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

    Science.gov (United States)

    Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

    2000-12-15

    The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.

  3. Single-cell analysis in cancer genomics

    Science.gov (United States)

    Saadatpour, Assieh; Lai, Shujing; Guo, Guoji; Yuan, Guo-Cheng

    2017-01-01

    Genetic changes and environmental differences result in cellular heterogeneity among cancer cells within the same tumor, thereby complicating treatment outcomes. Recent advances in single-cell technologies have opened new avenues to characterize the intra-tumor cellular heterogeneity, identify rare cell types, measure mutation rates, and, ultimately, guide diagnosis and treatment. In this paper, we review the recent single-cell technological and computational advances at the genomic, transcriptomic, and proteomic levels, and discuss their applications in cancer research. PMID:26450340

  4. Comparative Genomics via Wavelet Analysis for Closely Related Bacteria

    Directory of Open Access Journals (Sweden)

    Jiuzhou Song

    2004-01-01

    Full Text Available Comparative genomics has been a valuable method for extracting and extrapolating genome information among closely related bacteria. The efficiency of the traditional methods is strongly influenced by the software method used. To overcome this problem, we propose using wavelet analysis to perform comparative genomics. First, global comparison using wavelet analysis gives the difference at a quantitative level. Then local comparison using keto-excess or purine-excess plots shows precise positions of inversions, translocations, and horizontally transferred DNA fragments. We firstly found that the level of energy spectra difference is related to the similarity of bacteria strains; it could be a quantitative index to describe the similarities of genomes. The strategy is described in detail by comparisons of closely related strains: S. typhi CT18, S. typhi Ty2, S. typhimurium LT2, H. pylori 26695, and H. pylori J99.
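
    The purine-excess and keto-excess plots mentioned above are cumulative walks along a sequence. In the usual formulation, purine excess adds one for each A or G and subtracts one for each C or T, while keto excess adds one for each G or T and subtracts one for each A or C. The sketch below computes both curves. It covers only this local step, not the wavelet analysis, and the function names and toy sequence are illustrative assumptions.

```python
def cumulative_excess(seq, plus, minus):
    """Running count of bases in `plus` minus bases in `minus` along seq."""
    total = 0
    curve = []
    for base in seq.upper():
        if base in plus:
            total += 1
        elif base in minus:
            total -= 1
        # Other symbols (for example N) leave the running total unchanged.
        curve.append(total)
    return curve


def purine_excess(seq):
    """Cumulative (A + G) minus (C + T)."""
    return cumulative_excess(seq, "AG", "CT")


def keto_excess(seq):
    """Cumulative (G + T) minus (A + C)."""
    return cumulative_excess(seq, "GT", "AC")


# Toy sequence (hypothetical): a purine-rich stretch followed by a
# pyrimidine-rich stretch, so the curve rises and then falls.
toy = "AG" * 5 + "CT" * 5
curve = purine_excess(toy)
turning_point = curve.index(max(curve))
```

    Turning points in such a curve mark places where strand composition bias changes sign, which is why plotting the curves for two related genomes side by side can expose rearranged segments.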

  5. Comparative Genomics via Wavelet Analysis for Closely Related Bacteria

    Science.gov (United States)

    Song, Jiuzhou; Ware, Tony; Liu, Shu-Lin; Surette, M.

    2004-12-01

    Comparative genomics has been a valuable method for extracting and extrapolating genome information among closely related bacteria. The efficiency of the traditional methods is strongly influenced by the software method used. To overcome this problem, we propose using wavelet analysis to perform comparative genomics. First, global comparison using wavelet analysis gives the difference at a quantitative level. Then local comparison using keto-excess or purine-excess plots shows precise positions of inversions, translocations, and horizontally transferred DNA fragments. We firstly found that the level of energy spectra difference is related to the similarity of bacteria strains; it could be a quantitative index to describe the similarities of genomes. The strategy is described in detail by comparisons of closely related strains: S. typhi CT18, S. typhi Ty2, S. typhimurium LT2, H. pylori 26695, and H. pylori J99.

  6. Genomic analysis of epithelial ovarian cancer

    Institute of Scientific and Technical Information of China (English)

    John Farley; Laurent L Ozbun; Michael J Birrer

    2008-01-01

    Ovarian cancer is a major health problem for women in the United States. Despite evidence of considerable heterogeneity, most cases of ovarian cancer are treated in a similar fashion. The molecular basis for the clinicopathologic characteristics of these tumors remains poorly defined. Whole genome expression profiling is a genomic tool, which can identify dysregulated genes and uncover unique sub-classes of tumors. The application of this technology to ovarian cancer has provided a solid molecular basis for differences in histology and grade of ovarian tumors. Differentially expressed genes identified pathways implicated in cell proliferation, invasion, motility, chromosomal instability, and gene silencing and provided new insights into the origin and potential treatment of these cancers. The added knowledge provided by global gene expression profiling should allow for a more rational treatment of ovarian cancers. These techniques are leading to a paradigm shift from empirical treatment to an individually tailored approach. This review summarizes the new genomic data on epithelial ovarian cancers of different histology and grade and the impact it will have on our understanding and treatment of this disease.

  7. Complete genome sequence of Enterococcus faecium strain TX16 and comparative genomic analysis of Enterococcus faecium genomes

    Directory of Open Access Journals (Sweden)

    Qin Xiang

    2012-07-01

    Full Text Available Abstract Background Enterococci are among the leading causes of hospital-acquired infections in the United States and Europe, with Enterococcus faecalis and Enterococcus faecium being the two most common species isolated from enterococcal infections. In the last decade, the proportion of enterococcal infections caused by E. faecium has steadily increased compared to other Enterococcus species. Although the underlying mechanism for the gradual replacement of E. faecalis by E. faecium in the hospital environment is not yet understood, many studies using genotyping and phylogenetic analysis have shown the emergence of a globally dispersed polyclonal subcluster of E. faecium strains in clinical environments. Systematic study of the molecular epidemiology and pathogenesis of E. faecium has been hindered by the lack of closed, complete E. faecium genomes that can be used as references. Results In this study, we report the complete genome sequence of the E. faecium strain TX16, also known as DO, which belongs to multilocus sequence type (ST) 18, and was the first E. faecium strain ever sequenced. Whole genome comparison of the TX16 genome with 21 E. faecium draft genomes confirmed that most clinical, outbreak, and hospital-associated (HA) strains (including STs 16, 17, 18, and 78), in addition to strains of non-hospital origin, group in the same clade (referred to as the HA clade) and are evolutionarily considerably more closely related to each other by phylogenetic and gene content similarity analyses than to isolates in the community-associated (CA) clade, with approximately a 3–4% average nucleotide sequence difference between the two clades at the core genome level. Our study also revealed that many genomic loci in the TX16 genome are unique to the HA clade. 380 ORFs in TX16 are HA-clade specific and antibiotic resistance genes are enriched in HA-clade strains. Mobile elements such as IS16 and transposons were also found almost exclusively in HA strains

  8. The genome sequence of Blochmannia floridanus: Comparative analysis of reduced genomes

    Science.gov (United States)

    Gil, Rosario; Silva, Francisco J.; Zientz, Evelyn; Delmotte, François; González-Candelas, Fernando; Latorre, Amparo; Rausell, Carolina; Kamerbeek, Judith; Gadau, Jürgen; Hölldobler, Bert; van Ham, Roeland C. H. J.; Gross, Roy; Moya, Andrés

    2003-01-01

    Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely has a nutritional basis: Blochmannia is able to supply nitrogen and sulfur compounds to the host while it takes advantage of the host metabolic machinery. Remarkably, these bacteria lack all known genes involved in replication initiation (dnaA, priA, and recA). The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far. Comparative analysis of the five known genomes from insect endosymbiotic bacteria reveals they share only 313 genes, a number that may be close to the minimum gene set necessary to sustain endosymbiotic life. PMID:12886019

  9. Genome-wide analysis of multi-ancestry cohorts identifies new loci influencing intraocular pressure and susceptibility to glaucoma

    NARCIS (Netherlands)

    P.G. Hysi (Pirro); C-Y. Cheng (Ching-Yu); H. Springelkamp (Henriët); S. MacGregor (Stuart); J.N.C. Bailey (Jessica N. Cooke); R. Wojciechowski (Robert); V. Vitart (Veronique); A. Nag (Abhishek); A.W. Hewit (Alex); R. Höhn (René); C. Venturini (Cristina); A. Mirshahi (Alireza); W.D. Ramdas (Wishal); G. Thorleifsson (Gudmar); E.N. Vithana (Eranga); C.C. Khor; A.B. Stefansson (Arni B.); J. Liao (Jie); J.L. Haines (Jonathan); N. Amin (Najaf); Y. Wang (Ying); P.S. Wild (Philipp S.); A.B. Ozel (Ayse B.); J. Li; B.W. Fleck (Brian W.); T. Zeller (Tanja); S.E. Staffieri (Sandra E.); Y.Y. Teo (Yik Ying); G. Cuellar-Partida (Gabriel); X. Luo (Xiaoyan); R.R. Allingham (R Rand); J.E. Richards (Julia); A. Senft (Andrea); L.C. Karssen (Lennart); Y. Zheng (Yingfeng); C. Bellenguez (Céline); L. Xu (Liang); A.I. Iglesias González (Adriana); J.F. Wilson (James F); J.H. Kang (Jae H.); E.M. van Leeuwen (Elisa); V. Jonsson (Vesteinn); U. Thorsteinsdottir (Unnur); D.D.G. Despriet (Dominique); S. Ennis (Sarah); S.E. Moroi (Sayoko); N.G. Martin (Nicholas); N.M. Jansonius (Nomdo); S. Yazar (Seyhan); E.S. Tai (Shyong); P. Amouyel (Philippe); J. Kirwan (James); L.M.E. van Koolwijk (Leonieke); M.A. Hauser (Michael); F. Jonasson (Fridbert); P.J. Leo (Paul); S.J. Loomis (Stephanie J.); R. Fogarty (Rhys); F. Rivadeneira Ramirez (Fernando); L.S. Kearns (Lisa S.); K.J. Lackner (Karl); P.T.V.M. de Jong (Paulus); C.L. Simpson (Claire); C.E. Pennell (Craig); B.A. Oostra (Ben); A.G. Uitterlinden (André); S-M. Saw (Seang-Mei); A.J. Lotery (Andrew); J.E. Bailey-Wilson (Joan E.); A. Hofman (Albert); J.R. Vingerling (Hans); C. Maubaret (Cécilia); A.F.H. Pfeiffer (Andreas); R.C.W. Wolfs (Roger); H.G. Lemij (Hans); T.L. Young (Terri); L.R. Pasquale (Louis); C. Delcourt (Cécile); T.D. Spector (Timothy); C.C.W. Klaver (Caroline); K.S. Small (Kerrin); K.P. Burdon (Kathryn); J-A. Zwart (John-Anker); T.Y. Wong (Tien); A.C. Viswanathan (Ananth); D.A. Mackey (David); J.E. Craig (Jamie); J.L. Wiggs (Janey); C.M. van Duijn (Cock); C.J. Hammond (Christopher); T. Aung (Tin)

    2014-01-01

    Elevated intraocular pressure (IOP) is an important risk factor in developing glaucoma, and variability in IOP might herald glaucomatous development or progression. We report the results of a genome-wide association study meta-analysis of 18 population cohorts from the International

  10. Genome-wide analysis of multi-ancestry cohorts identifies new loci influencing intraocular pressure and susceptibility to glaucoma

    NARCIS (Netherlands)

    Hysi, Pirro G; Cheng, Ching-Yu; Springelkamp, Henriët; Macgregor, Stuart; Bailey, Jessica N Cooke; Wojciechowski, Robert; Vitart, Veronique; Nag, Abhishek; Hewitt, Alex W; Höhn, René; Venturini, Cristina; Mirshahi, Alireza; Ramdas, Wishal D; Thorleifsson, Gudmar; Vithana, Eranga; Khor, Chiea-Chuen; Stefansson, Arni B; Liao, Jiemin; Haines, Jonathan L; Amin, Najaf; Wang, Ya Xing; Wild, Philipp S; Ozel, Ayse B; Li, Jun Z; Fleck, Brian W; Zeller, Tanja; Staffieri, Sandra E; Teo, Yik-Ying; Cuellar-Partida, Gabriel; Luo, Xiaoyan; Allingham, R Rand; Richards, Julia E; Senft, Andrea; Karssen, Lennart C; Zheng, Yingfeng; Bellenguez, Céline; Xu, Liang; Iglesias, Adriana I; Wilson, James F; Kang, Jae H; van Leeuwen, Elisabeth M; Jonsson, Vesteinn; Thorsteinsdottir, Unnur; Despriet, Dominiek D G; Ennis, Sarah; Moroi, Sayoko E; Martin, Nicholas G; Jansonius, Nomdo M; Yazar, Seyhan; Tai, E-Shyong; Amouyel, Philippe; Kirwan, James; van Koolwijk, Leonieke M E; Hauser, Michael A; Jonasson, Fridbert; Leo, Paul; Loomis, Stephanie J; Fogarty, Rhys; Rivadeneira, Fernando; Kearns, Lisa; Lackner, Karl J; de Jong, Paulus T V M; Simpson, Claire L; Pennell, Craig E; Oostra, Ben A; Uitterlinden, André G; Saw, Seang-Mei; Lotery, Andrew J; Bailey-Wilson, Joan E; Hofman, Albert; Vingerling, Johannes R; Maubaret, Cécilia; Pfeiffer, Norbert; Wolfs, Roger C W; Lemij, Hans G; Young, Terri L; Pasquale, Louis R; Delcourt, Cécile; Spector, Timothy D; Klaver, Caroline C W; Small, Kerrin S; Burdon, Kathryn P; Stefansson, Kari; Wong, Tien-Yin; Viswanathan, Ananth; Mackey, David A; Craig, Jamie E; Wiggs, Janey L; van Duijn, Cornelia M; Hammond, Christopher J; Aung, Tin

    2014-01-01

    Elevated intraocular pressure (IOP) is an important risk factor in developing glaucoma, and variability in IOP might herald glaucomatous development or progression. We report the results of a genome-wide association study meta-analysis of 18 population cohorts from the International Glaucoma

  11. The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    Directory of Open Access Journals (Sweden)

    Nicholas R Thomson

    2006-12-01

    Full Text Available The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype O:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common

  12. Diversity of Pseudomonas Genomes, Including Populus-Associated Isolates, as Revealed by Comparative Genome Analysis.

    Science.gov (United States)

    Jun, Se-Ran; Wassenaar, Trudy M; Nookaew, Intawat; Hauser, Loren; Wanchai, Visanu; Land, Miriam; Timm, Collin M; Lu, Tse-Yuan S; Schadt, Christopher W; Doktycz, Mitchel J; Pelletier, Dale A; Ussery, David W

    2015-10-30

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches, including the rhizosphere and endosphere of many plants. Their diversity influences the phylogenetic diversity and heterogeneity of these communities. On the basis of average amino acid identity, comparative genome analysis of >1,000 Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides (eastern cottonwood) trees resulted in consistent and robust genomic clusters with phylogenetic homogeneity. All Pseudomonas aeruginosa genomes clustered together, and these were clearly distinct from other Pseudomonas species groups on the basis of pangenome and core genome analyses. In contrast, the genomes of Pseudomonas fluorescens were organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. Most of our 21 Populus-associated isolates formed three distinct subgroups within the major P. fluorescens group, supported by pathway profile analysis, while two isolates were more closely related to Pseudomonas chlororaphis and Pseudomonas putida. Genes specific to Populus-associated subgroups were identified. Genes specific to subgroup 1 include several sensory systems that act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor. Genes specific to subgroup 2 contain hypothetical genes, and genes specific to subgroup 3 were annotated with hydrolase activity. This study justifies the need to sequence multiple isolates, especially from P. fluorescens, which displays the most genetic variation, in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.

  13. Genome-Wide Search for Host Association Factors during Ovine Progressive Pneumonia Virus Infection.

    Directory of Open Access Journals (Sweden)

    Jesse Thompson

    Full Text Available Ovine progressive pneumonia virus (OPPV) is an important virus that causes serious diseases in sheep and goats, with a prevalence of 36% in the USA. Although OPPV was discovered more than half a century ago, little is known about the infection and pathogenesis of this virus. In this report, we used RNA-seq technology to conduct a genome-wide probe for cellular factors that are associated with OPPV infection. A total of approximately 22,000 goat host genes were detected, of which 657 were found to have been significantly up-regulated and 889 down-regulated at 12 hours post-infection. In addition to previously known restriction factors from other viral infections, a number of factors which may be specific for OPPV infection were uncovered. The data from this RNA-seq study will be helpful in our understanding of OPPV infection, and also for further study in the prevention and intervention of this viral disease.

  14. Analysis of the Core Genome and Pan-Genome of Autotrophic Acetogenic Bacteria

    Science.gov (United States)

    Shin, Jongoh; Song, Yoseb; Jeong, Yujin; Cho, Byung-Kwan

    2016-01-01

    Acetogens are obligate anaerobic bacteria capable of reducing carbon dioxide (CO2) to multicarbon compounds coupled to the oxidation of inorganic substrates, such as hydrogen (H2) or carbon monoxide (CO), via the Wood-Ljungdahl pathway. Owing to the metabolic capability of CO2 fixation, much attention has been focused on understanding the unique pathways associated with acetogens, particularly their metabolic coupling of CO2 fixation to energy conservation. Most known acetogens are phylogenetically and metabolically diverse bacteria present in 23 different bacterial genera. With the increased volume of available genome information, acetogenic bacterial genomes can be analyzed by comparative genome analysis. Even with the genetic diversity that exists among acetogens, the Wood-Ljungdahl pathway, a central metabolic pathway, and cofactor biosynthetic pathways are highly conserved for autotrophic growth. Additionally, comparative genome analysis revealed that most genes in the acetogen-specific core genome were associated with the Wood-Ljungdahl pathway. The conserved enzymes and those predicted as missing can provide insight into biological differences between acetogens and allow for the discovery of promising candidates for industrial applications. PMID:27733845
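
    The core genome and pan-genome referred to in this abstract can be illustrated with a minimal sketch, assuming each genome has already been reduced to a set of gene-family identifiers (the clustering step that produces those identifiers is not shown). The function name, genome names, and family labels below are hypothetical placeholders, not data from the study.

```python
def core_and_pan(genomes):
    """Split gene families into core, pan, and accessory sets.

    genomes: dict mapping a genome name to an iterable of gene-family identifiers.
    Returns (core, pan, accessory) as sets.
    """
    gene_sets = [set(families) for families in genomes.values()]
    if not gene_sets:
        return set(), set(), set()
    core = set.intersection(*gene_sets)  # families present in every genome
    pan = set.union(*gene_sets)          # families present in at least one genome
    accessory = pan - core               # families missing from at least one genome
    return core, pan, accessory


if __name__ == "__main__":
    # Toy input with placeholder identifiers.
    toy = {
        "genome_A": {"fam1", "fam2", "fam3"},
        "genome_B": {"fam1", "fam2", "fam4"},
        "genome_C": {"fam1", "fam2", "fam3", "fam5"},
    }
    core, pan, accessory = core_and_pan(toy)
    print(sorted(core), sorted(pan), sorted(accessory))
```

    In this toy input the core set is the two families shared by all three genomes, and the accessory set is everything else in the pan-genome.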

  15. Analysis of the core genome and pan-genome of autotrophic acetogenic bacteria

    Directory of Open Access Journals (Sweden)

    JongOh Shin

    2016-09-01

    Full Text Available Acetogens are obligate anaerobic bacteria capable of reducing carbon dioxide (CO2) to multicarbon compounds coupled to the oxidation of inorganic substrates, such as hydrogen (H2) or carbon monoxide (CO), via the Wood-Ljungdahl pathway. Owing to the metabolic capability of CO2 fixation, much attention has been focused on understanding the unique pathways associated with acetogens, particularly their metabolic coupling of CO2 fixation to energy conservation. Most known acetogens are phylogenetically and metabolically diverse bacteria present in 23 different bacterial genera. With the increased volume of available genome information, acetogenic bacterial genomes can be analyzed by comparative genome analysis. Even with the genetic diversity that exists among acetogens, the Wood-Ljungdahl pathway, a central metabolic pathway, and cofactor biosynthetic pathways are highly conserved for autotrophic growth. Additionally, comparative genome analysis revealed that most genes in the acetogen-specific core genome were associated with the Wood-Ljungdahl pathway. The conserved enzymes and those predicted as missing can provide insight into biological differences between acetogens and allow for the discovery of promising candidates for industrial applications.

  16. Getting personalized cancer genome analysis into the clinic: the challenges in bioinformatics.

    Science.gov (United States)

    Valencia, Alfonso; Hidalgo, Manuel

    2012-01-01

    Progress in genomics has raised expectations in many fields, and particularly in personalized cancer research. The new technologies available make it possible to combine information about potential disease markers, altered function and accessible drug targets, which, coupled with pathological and medical information, will help produce more appropriate clinical decisions. The accessibility of such experimental techniques makes it all the more necessary to improve and adapt computational strategies to the new challenges. This review focuses on the critical issues associated with the standard pipeline, which includes: DNA sequencing analysis; analysis of mutations in coding regions; the study of genome rearrangements; extrapolating information on mutations to the functional and signaling level; and predicting the effects of therapies using mouse tumor models. We describe the possibilities, limitations and future challenges of current bioinformatics strategies for each of these issues. Furthermore, we emphasize the need for the collaboration between the bioinformaticians who implement the software and use the data resources, the computational biologists who develop the analytical methods, and the clinicians, the systems' end users and those ultimately responsible for taking medical decisions. Finally, the different steps in cancer genome analysis are illustrated through examples of applications in cancer genome analysis.

  17. Enhancing genomic laboratory reports: A qualitative analysis of provider review

    Science.gov (United States)

    Rahm, Alanna Kulchak; Stuckey, Heather; Green, Jamie; Feldman, Lynn; Zallen, Doris T.; Bonhag, Michele; Segal, Michael M.; Fan, Audrey L.; Williams, Marc S.

    2016-01-01

    This study reports on the responses of physicians who reviewed provider and patient versions of a genomic laboratory report designed to communicate results of whole genome sequencing. Semi‐structured interviews addressed concept communication, elements, and format of example genome reports. Analysis of the coded transcripts resulted in recognition of three constructs around communication of genome sequencing results: (1) Providers agreed that whole genomic sequencing results are complex and they welcomed a report that provided supportive interpretation information to accompany sequencing results; (2) Providers strongly endorsed a report that included active clinical guidance, such as reference to practice guidelines, if available; and (3) Providers valued the genomic report as a resource that would serve as the basis to facilitate communication of genome sequencing results with their patients and families. Providers valued both versions of the report, though they affirmed the need for a provider‐oriented report. Critical elements of the report included clear language to explain the result, as well as consolidated yet comprehensive prognostic information with clear guidance over time for the clinical care of the patient. Most importantly, it appears a report with this design has the potential not only to return results but also to serve as a communication tool to help providers and patients discuss and coordinate care over time. © 2016 The Authors. American Journal of Medical Genetics Part A published by Wiley Periodicals, Inc. PMID:26842872

  18. Progressive Damage Analysis of Bonded Composite Joints

    Science.gov (United States)

    Leone, Frank A., Jr.; Girolamo, Donato; Davila, Carlos G.

    2012-01-01

    The present work is related to the development and application of progressive damage modeling techniques to bonded joint technology. The joint designs studied in this work include a conventional composite splice joint and a NASA-patented durable redundant joint. Both designs involve honeycomb sandwich structures with carbon/epoxy facesheets joined using adhesively bonded doublers. Progressive damage modeling allows for the prediction of the initiation and evolution of damage within a structure. For structures that include multiple material systems, such as the joint designs under consideration, the number of potential failure mechanisms that must be accounted for drastically increases the complexity of the analyses. Potential failure mechanisms include fiber fracture, intraply matrix cracking, delamination, core crushing, adhesive failure, and their interactions. The bonded joints were modeled using highly parametric, explicitly solved finite element models, with damage modeling implemented via custom user-written subroutines. Each ply was discretely meshed using three-dimensional solid elements. Layers of cohesive elements were included between each ply to account for the possibility of delaminations and were used to model the adhesive layers forming the joint. Good correlation with experimental results was achieved both in terms of load-displacement history and the predicted failure mechanism(s).

  19. Pan-genome sequence analysis using Panseq: an online tool for the rapid analysis of core and accessory genomic regions

    Directory of Open Access Journals (Sweden)

    Villegas Andre

    2010-09-01

    Full Text Available Abstract Background The pan-genome of a bacterial species consists of a core and an accessory gene pool. The accessory genome is thought to be an important source of genetic variability in bacterial populations and is gained through lateral gene transfer, allowing subpopulations of bacteria to better adapt to specific niches. Low-cost and high-throughput sequencing platforms have created an exponential increase in genome sequence data and an opportunity to study the pan-genomes of many bacterial species. In this study, we describe a new online pan-genome sequence analysis program, Panseq. Results Panseq was used to identify Escherichia coli O157:H7 and E. coli K-12 genomic islands. Within a population of 60 E. coli O157:H7 strains, the existence of 65 accessory genomic regions identified by Panseq analysis was confirmed by PCR. The accessory genome and binary presence/absence data, and core genome and single nucleotide polymorphisms (SNPs) of six L. monocytogenes strains were extracted with Panseq and hierarchically clustered and visualized. The nucleotide core and binary accessory data were also used to construct maximum parsimony (MP) trees, which were compared to the MP tree generated by multi-locus sequence typing (MLST). The topology of the accessory and core trees was identical but differed from the tree produced using seven MLST loci. The Loci Selector module found the most variable and discriminatory combinations of four loci within a 100 loci set among 10 strains in 1 s, compared to the 449 s required to exhaustively search for all possible combinations; it also found the most discriminatory 20 loci from a 96 loci E. coli O157:H7 SNP dataset. Conclusion Panseq determines the core and accessory regions among a collection of genomic sequences based on user-defined parameters. It readily extracts regions unique to a genome or group of genomes, identifies SNPs within shared core genomic regions, constructs files for use in phylogeny programs

  20. Analysis of the genomic homologous recombination in Theilovirus based on complete genomes

    Directory of Open Access Journals (Sweden)

    Yi Maoli

    2011-09-01

    Full Text Available Abstract At present, Theilovirus is considered to comprise four distinct serotypes, including Theiler's murine encephalomyelitis virus, Vilyuisk human encephalomyelitis virus, Thera virus, and Saffold virus. So far, no systematic study has investigated the genomic recombination of Theilovirus. The present study performed phylogenetic and recombination analysis of Theilovirus over the complete genomes. Seven potentially significant recombination events were identified. However, according to the strain information and references related to the recombinants and their parental strains, four of the recombination events might have occurred non-naturally. These results will provide valuable hints for future research on the evolution and antigenic variability of Theilovirus.

  1. Analysis of the genomic homologous recombination in Theilovirus based on complete genomes.

    Science.gov (United States)

    Sun, Guangming; Zhang, Xiaodan; Yi, Maoli; Shao, Shihe; Zhang, Wen

    2011-09-17

    At present, Theilovirus is considered to comprise four distinct serotypes, including Theiler's murine encephalomyelitis virus, Vilyuisk human encephalomyelitis virus, Thera virus, and Saffold virus. So far, no systematic study has investigated the genomic recombination of Theilovirus. The present study performed phylogenetic and recombination analysis of Theilovirus over the complete genomes. Seven potentially significant recombination events were identified. However, according to the strain information and references related to the recombinants and their parental strains, four of the recombination events might have occurred non-naturally. These results will provide valuable hints for future research on the evolution and antigenic variability of Theilovirus.

  2. Cytogenetic analysis from DNA by comparative genomic hybridization.

    Science.gov (United States)

    Tachdjian, G; Aboura, A; Lapierre, J M; Viguié, F

    2000-01-01

    Comparative genomic hybridization (CGH) is a modified in situ hybridization technique which allows detection and mapping of DNA sequence copy differences between two genomes in a single experiment. In CGH analysis, two differentially labelled genomic DNA (study and reference) are co-hybridized to normal metaphase spreads. Chromosomal locations of copy number changes in the DNA segments of the study genome are revealed by a variable fluorescence intensity ratio along each target chromosome. Since its development, CGH has been applied mostly as a research tool in the field of cancer cytogenetics to identify genetic changes in many previously unknown regions. CGH may also have a role in clinical cytogenetics for detection and identification of unbalanced chromosomal abnormalities.

  3. Genome wide copy number analysis of single cells

    Science.gov (United States)

    Baslan, Timour; Kendall, Jude; Rodgers, Linda; Cox, Hilary; Riggs, Mike; Stepansky, Asya; Troge, Jennifer; Ravi, Kandasamy; Esposito, Diane; Lakshmi, B.; Wigler, Michael; Navin, Nicholas; Hicks, James

    2016-01-01

    Summary Copy number variation (CNV) is increasingly recognized as an important contributor to phenotypic variation in health and disease. Most methods for determining CNV rely on admixtures of cells, where information regarding genetic heterogeneity is lost. Here, we present a protocol that allows for the genome wide copy number analysis of single nuclei isolated from mixed populations of cells. Single nucleus sequencing (SNS) combines flow sorting of single nuclei based on DNA content and whole genome amplification (WGA), followed by next generation sequencing to quantize genomic intervals in a genome wide manner. Multiplexing of single cells is discussed. Additionally, we outline informatic approaches that correct for biases inherent in the WGA procedure and allow for accurate determination of copy number profiles. Altogether, the protocol takes ~3 days from flow cytometry to sequence-ready DNA libraries. PMID:22555242
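
    The idea of quantizing genomic intervals from sequencing reads can be sketched as follows. This is a simplified illustration under stated assumptions, not the protocol's informatics: it uses fixed-width bins, median normalization, and an assumed baseline ploidy of 2, and it omits the WGA bias correction the abstract mentions. All function and parameter names are hypothetical.

```python
from statistics import median


def bin_counts(read_positions, chrom_length, bin_size):
    """Count reads per fixed-width bin; positions are 0-based start coordinates."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in read_positions:
        counts[pos // bin_size] += 1
    return counts


def copy_number_profile(counts, ploidy=2):
    """Scale bin counts so the median bin equals `ploidy`, then round to integers."""
    med = median(counts)
    if med == 0:
        raise ValueError("median bin count is zero; coverage too sparse to normalize")
    return [round(ploidy * c / med) for c in counts]


if __name__ == "__main__":
    # Toy reads on a 40 bp "chromosome" with 10 bp bins; the third bin has double coverage.
    reads = [1, 2, 11, 12, 21, 22, 23, 24, 31, 32]
    counts = bin_counts(reads, chrom_length=40, bin_size=10)
    print(counts)
    print(copy_number_profile(counts))
```

    A real analysis would need bins and normalization chosen to account for mappability and amplification bias; the sketch only shows the counting and rounding steps.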

  4. Differential DNA Methylation Analysis without a Reference Genome

    Directory of Open Access Journals (Sweden)

    Johanna Klughammer

    2015-12-01

    Full Text Available Genome-wide DNA methylation mapping uncovers epigenetic changes associated with animal development, environmental adaptation, and species evolution. To address the lack of high-throughput methods for DNA methylation analysis in non-model organisms, we developed an integrated approach for studying DNA methylation differences independent of a reference genome. Experimentally, our method relies on an optimized 96-well protocol for reduced representation bisulfite sequencing (RRBS), which we have validated in nine species (human, mouse, rat, cow, dog, chicken, carp, sea bass, and zebrafish). Bioinformatically, we developed the RefFreeDMA software to deduce ad hoc genomes directly from RRBS reads and to pinpoint differentially methylated regions between samples or groups of individuals (http://RefFreeDMA.computational-epigenetics.org). The identified regions are interpreted using motif enrichment analysis and/or cross-mapping to annotated genomes. We validated our method by reference-free analysis of cell-type-specific DNA methylation in the blood of human, cow, and carp. In summary, we present a cost-effective method for epigenome analysis in ecology and evolution, which enables epigenome-wide association studies in natural populations and species without a reference genome.

  5. Differential DNA Methylation Analysis without a Reference Genome.

    Science.gov (United States)

    Klughammer, Johanna; Datlinger, Paul; Printz, Dieter; Sheffield, Nathan C; Farlik, Matthias; Hadler, Johanna; Fritsch, Gerhard; Bock, Christoph

    2015-12-22

    Genome-wide DNA methylation mapping uncovers epigenetic changes associated with animal development, environmental adaptation, and species evolution. To address the lack of high-throughput methods for DNA methylation analysis in non-model organisms, we developed an integrated approach for studying DNA methylation differences independent of a reference genome. Experimentally, our method relies on an optimized 96-well protocol for reduced representation bisulfite sequencing (RRBS), which we have validated in nine species (human, mouse, rat, cow, dog, chicken, carp, sea bass, and zebrafish). Bioinformatically, we developed the RefFreeDMA software to deduce ad hoc genomes directly from RRBS reads and to pinpoint differentially methylated regions between samples or groups of individuals (http://RefFreeDMA.computational-epigenetics.org). The identified regions are interpreted using motif enrichment analysis and/or cross-mapping to annotated genomes. We validated our method by reference-free analysis of cell-type-specific DNA methylation in the blood of human, cow, and carp. In summary, we present a cost-effective method for epigenome analysis in ecology and evolution, which enables epigenome-wide association studies in natural populations and species without a reference genome.

  6. Genome-wide functional analysis of human cell-cycle regulators

    Science.gov (United States)

    Mukherji, Mridul; Bell, Russell; Supekova, Lubica; Wang, Yan; Orth, Anthony P.; Batalov, Serge; Miraglia, Loren; Huesken, Dieter; Lange, Joerg; Martin, Christopher; Sahasrabudhe, Sudhir; Reinhardt, Mischa; Natt, Francois; Hall, Jonathan; Mickanin, Craig; Labow, Mark; Chanda, Sumit K.; Cho, Charles Y.; Schultz, Peter G.

    2006-01-01

    Human cells have evolved complex signaling networks to coordinate the cell cycle. A detailed understanding of the global regulation of this fundamental process requires comprehensive identification of the genes and pathways involved in the various stages of cell-cycle progression. To this end, we report a genome-wide analysis of the human cell cycle, cell size, and proliferation by targeting >95% of the protein-coding genes in the human genome using small interfering RNAs (siRNAs). Analysis of >2 million images, acquired by quantitative fluorescence microscopy, showed that depletion of 1,152 genes strongly affected cell-cycle progression. These genes clustered into eight distinct phenotypic categories based on phase of arrest, nuclear area, and nuclear morphology. Phase-specific networks were built by interrogating knowledge-based and physical interaction databases with identified genes. Genome-wide analysis of cell-cycle regulators revealed a number of kinase, phosphatase, and proteolytic proteins and also suggests that processes thought to regulate G1-S phase progression like receptor-mediated signaling, nutrient status, and translation also play important roles in the regulation of G2/M phase transition. Moreover, 15 genes that are integral to TNF/NF-κB signaling were found to regulate G2/M, a previously unanticipated role for this pathway. These analyses provide systems-level insight into both known and novel genes as well as pathways that regulate cell-cycle progression, a number of which may provide new therapeutic approaches for the treatment of cancer. PMID:17001007

  7. Identification of novel genomic markers related to progression to glioblastoma through genomic profiling of 25 primary glioma cell lines.

    NARCIS (Netherlands)

    Roversi, G.; Pfundt, R.; Moroni, R.F.; Magnani, I.; Reijmersdal, S.V. van; Pollo, B.; Straatman, H.M.P.M.; Larizza, L.; Schoenmakers, E.F.P.M.

    2006-01-01

    Identification of genetic copy number changes in glial tumors is of importance in the context of improved/refined diagnostic, prognostic procedures and therapeutic decision-making. In order to detect recurrent genomic copy number changes that might play a role in glioma pathogenesis and/or progressi

  8. What’s in the genome of a filamentous fungus? Analysis of the Neurospora genome sequence

    Science.gov (United States)

    Mannhaupt, Gertrud; Montrone, Corinna; Haase, Dirk; Mewes, H. Werner; Aign, Verena; Hoheisel, Jörg D.; Fartmann, Berthold; Nyakatura, Gerald; Kempken, Frank; Maier, Josef; Schulte, Ulrich

    2003-01-01

    The German Neurospora Genome Project has assembled sequences from ordered cosmid and BAC clones of linkage groups II and V of the genome of Neurospora crassa in 13 and 12 contigs, respectively. Including additional sequences located on other linkage groups, a total of 12 Mb were subjected to a manual gene extraction and annotation process. The genome comprises a small number of repetitive elements, a low degree of segmental duplications and very few paralogous genes. The analysis of the 3218 identified open reading frames provides a first overview of the protein equipment of a filamentous fungus. Significantly, N. crassa possesses a large variety of metabolic enzymes, including a substantial number of enzymes involved in the degradation of complex substrates as well as secondary metabolism. While several of these enzymes are specific for filamentous fungi, many are shared exclusively with prokaryotes. PMID:12655011

  9. Savant Genome Browser 2: visualization and analysis for population-scale genomics.

    Science.gov (United States)

    Fiume, Marc; Smith, Eric J M; Brook, Andrew; Strbenac, Dario; Turner, Brian; Mezlini, Aziz M; Robinson, Mark D; Wodak, Shoshana J; Brudno, Michael

    2012-07-01

    High-throughput sequencing (HTS) technologies are providing an unprecedented capacity for data generation, and there is a corresponding need for efficient data exploration and analysis capabilities. Although most existing tools for HTS data analysis are developed for either automated (e.g. genotyping) or visualization (e.g. genome browsing) purposes, such tools are most powerful when combined. For example, integration of visualization and computation allows users to iteratively refine their analyses by updating computational parameters within the visual framework in real-time. Here we introduce the second version of the Savant Genome Browser, a standalone program for visual and computational analysis of HTS data. Savant substantially improves upon its predecessor and existing tools by introducing innovative visualization modes and navigation interfaces for several genomic datatypes, and synergizing visual and automated analyses in a way that is powerful yet easy even for non-expert users. We also present a number of plugins that were developed by the Savant Community, which demonstrate the power of integrating visual and automated analyses using Savant. The Savant Genome Browser is freely available (open source) at www.savantbrowser.com.

  10. Systems Analysis department. Annual progress report 1997

    Energy Technology Data Exchange (ETDEWEB)

    Larsen, Hans; Olsson, Charlotte; Petersen, Kurt E.

    1998-03-01

    The report describes the work of the Systems Analysis Department at Risoe National Laboratory during 1997. The department is undertaking research within Energy Systems Analysis, Integrated Energy, Environment and Development Planning - UNEP Centre, Industrial Safety and Reliability and Man/Machine Interaction. The report includes lists of publications, lectures, committees and staff members. (au) 110 refs.

  11. Systems Analysis Department annual progress report 1998

    Energy Technology Data Exchange (ETDEWEB)

    Larsen, Hans; Olsson, Charlotte; Loevborg, Leif [eds.]

    1999-03-01

    The report describes the work of the Systems Analysis Department at Risoe National Laboratory during 1998. The department undertakes research within Energy Systems Analysis, Integrated Energy, Environment and Development Planning - UNEP Centre, Industrial Safety and Reliability, Man/Machine Interaction and Technology Scenarios. The report includes lists of publications, lectures, committees and staff members. (au) 111 refs.

  12. Systems Analysis Department. Annual Progress Report 1999

    Energy Technology Data Exchange (ETDEWEB)

    Larsen, Hans; Olsson, Charlotte; Loevborg, Leif [eds.]

    2000-03-01

    This report describes the work of the Systems Analysis Department at Risoe National Laboratory during 1999. The department is undertaking research within Energy Systems Analysis, Energy, Environment and Development Planning-UNEP Centre, Safety, Reliability and Human Factors, and Technology Scenarios. The report includes summary statistics and lists of publications, committees and staff members. (au)

  13. Genome analysis and comparative genomics of a Giardia intestinalis assemblage E isolate

    Directory of Open Access Journals (Sweden)

    Andersson Jan O

    2010-10-01

    Full Text Available Abstract Background Giardia intestinalis is a protozoan parasite that causes diarrhea in a wide range of mammalian species. To further understand the genetic diversity between the Giardia intestinalis species, we have performed genome sequencing and analysis of a wild-type Giardia intestinalis sample from the assemblage E group, isolated from a pig. Results We identified 5012 protein coding genes, the majority of which are conserved compared to the previously sequenced genomes of the WB and GS strains in terms of microsynteny and sequence identity. Despite this, there is an unexpectedly large number of chromosomal rearrangements and several smaller structural changes that are present in all chromosomes. Novel members of the VSP, NEK Kinase and HCMP gene families were identified, which may reveal possible mechanisms for host specificity and new avenues for antigenic variation. We used comparative genomics of the three diverse Giardia intestinalis isolates P15, GS and WB to define a core proteome for this species complex and to identify lineage-specific genes. Extensive analyses of polymorphisms in the core proteome of Giardia revealed differential rates of divergence among cellular processes. Conclusions Our results indicate that despite a well conserved core of genes there is significant genome variation between Giardia isolates, both in terms of gene content, gene polymorphisms, structural chromosomal variations and surface molecule repertoires. This study improves the annotation of the Giardia genomes and enables the identification of functionally important variation.

  14. Yeast as a touchstone in post-genomic research: strategies for integrative analysis in functional genomics.

    Science.gov (United States)

    Castrillo, Juan I; Oliver, Stephen G

    2004-01-31

    The new complexity arising from the genome sequencing projects requires new comprehensive post-genomic strategies: advanced studies in regulatory mechanisms, application of new high-throughput technologies at a genome-wide scale, at the different levels of cellular complexity (genome, transcriptome, proteome and metabolome), efficient analysis of the results, and application of new bioinformatic methods in an integrative or systems biology perspective. This can be accomplished in studies with model organisms under controlled conditions. In this review a perspective of the favourable characteristics of yeast as a touchstone model in post-genomic research is presented. The state of the art, latest advances in the field and bottlenecks, new strategies, new regulatory mechanisms, applications (patents) and high-throughput technologies, most of them being developed and validated in yeast, are presented. The optimal characteristics of yeast as a well-defined system for comprehensive studies under controlled conditions make it a perfect model to be used in integrative, "systems biology" studies to get new insights into the mechanisms of regulation (regulatory networks) responsible for specific phenotypes under particular environmental conditions, to be applied to more complex organisms (e.g. plants, human).

  15. Sequencing and Analysis of a Genomic Fragment Provide an Insight into the Dunaliella viridis Genomic Sequence

    Institute of Scientific and Technical Information of China (English)

    Xiao-Ming SUN; Yuan-Ping TANG; Xiang-Zong MENG; Wen-Wen ZHANG; Shan LI; Zhi-Rui DENG; Zheng-Kai XU; Ren-Tao SONG

    2006-01-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green algae. Its exceptional resistance to salt and various other stresses has made it an ideal model for stress tolerance studies. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large numbers of simple sequence repeats (microsatellites) were found, with a strong bias towards the (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (total 25,749 bp) was in agreement with these findings. These sequence features made it difficult to sequence Dunaliella genomic sequences. Further investigation should be made to reveal the biological significance of these unique sequence features.
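
    The two sequence statistics quoted in this record (GC content and the share of (AC)n microsatellites) are simple to compute. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the four-unit minimum repeat length, and the merging of the two phases of a motif (AC and CA) are illustrative choices that do not come from the abstract.

    ```python
    import re
    from collections import Counter


    def gc_content(seq: str) -> float:
        """Return the percentage of G and C among the A/C/G/T bases of seq."""
        seq = seq.upper()
        counted = sum(seq.count(base) for base in "ACGT")
        if counted == 0:
            return 0.0
        return 100.0 * (seq.count("G") + seq.count("C")) / counted


    def dinucleotide_repeats(seq: str, min_units: int = 4) -> Counter:
        """Count runs in which a 2-bp motif is repeated at least min_units times.

        min_units is a hypothetical threshold chosen for illustration.
        """
        seq = seq.upper()
        # A 2-bp group followed by at least (min_units - 1) further copies of it.
        pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_units - 1))
        counts = Counter()
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if motif[0] == motif[1]:
                continue  # homopolymer runs such as AAAAAAAA are not dinucleotide repeats
            # Merge the two phases of the same repeat (AC and CA) under one key.
            counts[min(motif, motif[1] + motif[0])] += 1
        return counts
    ```

    Dividing the count for one motif by the total number of runs gives a proportion comparable in spirit to the reported (AC)n bias, although the exact figure depends on the repeat definition used.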

  16. Chromosome region-specific libraries for human genome analysis

    Energy Technology Data Exchange (ETDEWEB)

    Kao, Fa-Ten.

    1991-01-01

    We have made important progress since the beginning of the current grant year. We have further developed the microdissection and PCR-assisted microcloning techniques using the linker-adaptor method. We have critically evaluated the microdissection libraries constructed by this microtechnology and proved that they are of high quality. We further demonstrated that these microdissection clones are useful in identifying corresponding YAC clones for a thousand-fold expansion of the genomic coverage and for contig construction. We are also improving the technique of cloning the dissected fragments in test tube by the TDT method. We are applying both of these PCR cloning techniques to human chromosomes 2 and 5 to construct region-specific libraries for physical mapping purposes of LLNL and LANL. Finally, we are exploring efficient procedures to use unique sequence microclones to isolate cDNA clones from defined chromosomal regions as valuable resources for identifying expressed gene sequences in the human genome. We believe that we are making important progress under the auspices of this DOE human genome program grant and we will continue to make significant contributions in the coming year. 4 refs., 4 figs.

  17. Hyperstructures, genome analysis and I-cells

    DEFF Research Database (Denmark)

    Amar, P.; Ballet, P.; Barlovatz-Meimon, G.

    2002-01-01

    New concepts may prove necessary to profit from the avalanche of sequence data on the genome, transcriptome, proteome and interactome and to relate this information to cell physiology. Here, we focus on the concept of large activity-based structures, or hyperstructures, in which a variety of types of molecules are brought together to perform a function. We review the evidence for the existence of hyperstructures responsible for the initiation of DNA replication, the sequestration of newly replicated origins of replication, cell division and for metabolism. The processes responsible for hyperstructure... familiar to biologists. Finally, we speculate on how a variety of in silico approaches involving cellular automata and multi-agent systems could be combined to develop new concepts in the form of an Integrated cell (I-cell) which would undergo selection for growth and survival in a world of artificial...

  18. Topological Data Analysis Generates High-Resolution, Genome-wide Maps of Human Recombination.

    Science.gov (United States)

    Camara, Pablo G; Rosenbloom, Daniel I S; Emmett, Kevin J; Levine, Arnold J; Rabadan, Raul

    2016-07-01

    Meiotic recombination is a fundamental evolutionary process driving diversity in eukaryotes. In mammals, recombination is known to occur preferentially at specific genomic regions. Using topological data analysis (TDA), a branch of applied topology that extracts global features from large data sets, we developed an efficient method for mapping recombination at fine scales. When compared to standard linkage-based methods, TDA can deal with a larger number of SNPs and genomes without incurring prohibitive computational costs. We applied TDA to 1,000 Genomes Project data and constructed high-resolution whole-genome recombination maps of seven human populations. Our analysis shows that recombination is generally under-represented within transcription start sites. However, the binding sites of specific transcription factors are enriched for sites of recombination. These include transcription factors that regulate the expression of meiosis- and gametogenesis-specific genes, cell cycle progression, and differentiation blockage. Additionally, our analysis identifies an enrichment for sites of recombination at repeat-derived loci matched by piwi-interacting RNAs.

  19. Progress on Transferring Elite Genes from Non-AA Genome Wild Rice into Oryza sativa through Interspecific Hybridization

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Progress in research on transferring elite genes from non-AA genome wild rice into Oryza sativa through interspecific hybridization falls into three areas: breeding monosomic alien addition lines (MAALs), constructing introgression lines (ILs), and analyzing the heredity of the characters and mapping the related genes. There are serious reproductive barriers, mainly incrossability and hybrid sterility, in the interspecific hybridization of O. sativa with non-AA genome wild rice. These are the 'bottleneck' for transferring elite genes from wild rice to O. sativa. Combining traditional crossing methods with biotechniques is a reliable way to overcome the reproductive barriers and to improve the utilization efficiency of non-AA genome wild rice.

  20. Progress in spatial analysis methods and applications

    CERN Document Server

    Páez, Antonio; Buliung, Ron N; Dall'erba, Sandy

    2010-01-01

    This book brings together developments in spatial analysis techniques, including spatial statistics, econometrics, and spatial visualization, and applications to fields such as regional studies, transportation and land use, population and health.

  1. Digital microarray analysis for digital artifact genomics

    Science.gov (United States)

    Jaenisch, Holger; Handley, James; Williams, Deborah

    2013-06-01

    We implement a Spatial Voting (SV) based analogy of microarray analysis for digital gene marker identification in malware code sections. We examine a famous set of malware formally analyzed by Mandiant and code named Advanced Persistent Threat (APT1). APT1 is a Chinese organization formed with specific intent to infiltrate and exploit US resources. Mandiant provided a detailed behavior and string analysis report for the 288 malware samples available. We performed an independent analysis using a new alternative to the traditional dynamic analysis and static analysis that we call Spatial Analysis (SA). We perform unsupervised SA on the APT1 originating malware code sections and report our findings. We also show the results of SA performed on some members of the families associated by Mandiant. We conclude that SV based SA is a practical fast alternative to dynamic analysis and static analysis.

  2. Advanced technologies for genomic analysis in farm animals and its application for QTL mapping.

    Science.gov (United States)

    Hu, Xiaoxiang; Gao, Yu; Feng, Chungang; Liu, Qiuyue; Wang, Xiaobo; Du, Zhuo; Wang, Qingsong; Li, Ning

    2009-06-01

    Rapid progress in farm animal breeding has been made in the last few decades. Advanced technologies for genomic analysis in molecular genetics have led to the identification of genes or markers associated with genes that affect economic traits. Molecular markers, large-insert libraries and RH panels have been used to build the genetic linkage maps, physical maps and comparative maps in different farm animals. Moreover, EST sequencing, genome sequencing and SNP maps are helping us to understand how genomes function in various organisms, and further areas will be studied by DNA microarray technologies and proteomics methods. Because most economically important traits in farm animals are controlled by multiple genes and the environment, the main goal of genome research in farm animals is to map and characterize genes determining QTL. There are two main strategies to identify trait loci: candidate gene association tests and genome scan approaches. In recent years, some new concepts, such as RNAi, miRNA and eQTL, have been introduced into farm animal research, especially for QTL mapping and finding QTN. Several genes that influence important traits have already been identified or are close to being identified, and some of them have been applied in farm animal breeding programs by marker-assisted selection.

  3. Genomic compositions and phylogenetic analysis of Shigella boydii subgroup

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Comparative Genomic Hybridization (CGH) microarray analysis was used to compare the genomic compositions of all eighteen Shigella boydii serotype representative strains. The results indicated that the genomic "backbone" of this subgroup contained 2552 ORFs homologous to nonpathogenic E. coli K12. Compared with the genome of K12, 199 ORFs were found to be absent in all S. boydii serotype representatives, mainly outer membrane protein genes and O-antigen biosynthesis genes. The specific ORFs of the S. boydii subgroup consisted largely of bacteriophage genes and function unknown (FUN) genes. Some iron metabolism, transport and type II secretion system related genes were found in most representative strains. According to the CGH phylogenetic analysis, the eighteen S. boydii serotype representatives were divided into four groups, in which the serotype C13 strain was remarkably distinguished from the other serotype strains. This grouping result corresponded to the distribution of some metabolism related genes. Furthermore, the analysis of genome backbone genes, specific genes, and the phylogenetic trees allowed us to discover the evolution laws of S. boydii and to find important clues for pathogenesis research, vaccination and therapeutic medicine development.

  4. Comparative genomics of Mycoplasma: analysis of conserved essential genes and diversity of the pan-genome.

    Directory of Open Access Journals (Sweden)

    Wei Liu

    Full Text Available Mycoplasma, the smallest self-replicating organism with a minimal metabolism and little genomic redundancy, is expected to be a close approximation to the minimal set of genes needed to sustain bacterial life. This study employs comparative evolutionary analysis of twenty Mycoplasma genomes to gain an improved understanding of essential genes. By analyzing the core genome of mycoplasmas, we revealed the conserved essential gene set for mycoplasma survival. Further analysis showed that the core genome set has many characteristics in common with experimentally identified essential genes. Several key genes, which are related to DNA replication and repair and can be disrupted in transposon mutagenesis studies, may be critical for bacterial survival, especially over long periods of natural selection. Phylogenomic reconstructions based on 3,355 homologous groups allowed robust estimation of phylogenetic relatedness among mycoplasma strains. To obtain deeper insight into the relative roles of molecular evolution in pathogen adaptation to their hosts, we also analyzed the positive selection pressures on particular sites and lineages. There appears to be an approximate correlation between the divergence of species and the level of positive selection detected in corresponding lineages.
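
    The core genome and pan-genome named in this record are commonly defined as the intersection and the union of the homologous groups found in each genome. The sketch below illustrates that definition only. It assumes homologous groups have already been assigned, and the function name and input layout are hypothetical rather than taken from the study.

    ```python
    def core_and_pan_genome(genomes):
        """Return (core, pan) for a mapping of genome name -> homologous-group IDs.

        core: groups present in every genome (intersection).
        pan:  groups present in at least one genome (union).
        """
        group_sets = [set(groups) for groups in genomes.values()]
        if not group_sets:
            return set(), set()
        core = set.intersection(*group_sets)
        pan = set.union(*group_sets)
        return core, pan
    ```

    The accessory (dispensable) genome then follows as `pan - core`.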

  5. Genome-wide identification of the regulatory targets of a transcription factor using biochemical characterization and computational genomic analysis

    Directory of Open Access Journals (Sweden)

    Jolly Emmitt R

    2005-11-01

    Full Text Available Abstract Background A major challenge in computational genomics is the development of methodologies that allow accurate genome-wide prediction of the regulatory targets of a transcription factor. We present a method for target identification that combines experimental characterization of binding requirements with computational genomic analysis. Results Our method identified potential target genes of the transcription factor Ndt80, a key transcriptional regulator involved in yeast sporulation, using the combined information of binding affinity, positional distribution, and conservation of the binding sites across multiple species. We have also developed a mathematical approach to compute the false positive rate and the total number of targets in the genome based on the multiple selection criteria. Conclusion We have shown that combining biochemical characterization and computational genomic analysis leads to accurate identification of the genome-wide targets of a transcription factor. The method can be extended to other transcription factors and can complement other genomic approaches to transcriptional regulation.

  6. Stacks: an analysis tool set for population genomics.

    Science.gov (United States)

    Catchen, Julian; Hohenlohe, Paul A; Bassham, Susan; Amores, Angel; Cresko, William A

    2013-06-01

    Massively parallel short-read sequencing technologies, coupled with powerful software platforms, are enabling investigators to analyse tens of thousands of genetic markers. This wealth of data is rapidly expanding and allowing biological questions to be addressed with unprecedented scope and precision. The sizes of the data sets are now posing significant data processing and analysis challenges. Here we describe an extension of the Stacks software package to efficiently use genotype-by-sequencing data for studies of populations of organisms. Stacks now produces core population genomic summary statistics and SNP-by-SNP statistical tests. These statistics can be analysed across a reference genome using a smoothed sliding window. Stacks also now provides several output formats for several commonly used downstream analysis packages. The expanded population genomics functions in Stacks will make it a useful tool to harness the newest generation of massively parallel genotyping data for ecological and evolutionary genetics.
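
    A smoothed sliding window over per-SNP statistics can be illustrated with a Gaussian kernel average along a chromosome. This is a minimal sketch under stated assumptions and not the Stacks implementation: the function name, the default `sigma`, and the three-sigma cutoff are illustrative choices.

    ```python
    import math


    def smoothed_window(positions, values, centers, sigma=150_000.0):
        """Gaussian-weighted mean of per-SNP values around each center (in bp).

        positions and values are parallel sequences for one chromosome.
        SNPs farther than 3 * sigma from a center are ignored; a center with
        no SNP in range yields NaN.
        """
        smoothed = []
        for center in centers:
            weighted_sum = 0.0
            weight_total = 0.0
            for pos, val in zip(positions, values):
                dist = pos - center
                if abs(dist) > 3 * sigma:
                    continue
                weight = math.exp(-(dist * dist) / (2 * sigma * sigma))
                weighted_sum += weight * val
                weight_total += weight
            smoothed.append(weighted_sum / weight_total if weight_total > 0 else float("nan"))
        return smoothed
    ```

    Evaluating the function on an evenly spaced grid of centers gives a smooth profile of a statistic such as FST or nucleotide diversity along the reference genome.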

  7. [Research progress in developing reporter systems for the enrichment of positive cells with targeted genome modification].

    Science.gov (United States)

    Bai, Yichun; Xu, Kun; Wei, Zehui; Ma, Zheng; Zhang, Zhiying

    2016-01-01

    Targeted genome editing technology plays an important role in studies of gene function, gene therapy and transgenic breeding. Moreover, the efficiency of targeted genome editing is increased dramatically with the application of recently developed artificial nucleases such as ZFNs, TALENs and CRISPR/Cas9. However, obtaining positive cells with targeted genome modification is restricted to some extent by the transfection efficiency of nuclease expression plasmids, nuclease expression and activity, and repair efficiency after genome editing. Thus, the enrichment and screening of positive cells with targeted genome modification remains a problem that needs to be solved. Surrogate reporter systems could be used to reflect the efficiency of nucleases indirectly and enrich genetically modified positive cells effectively, which may increase the efficiency of the enrichment and screening of positive cells with targeted genome modification. In this review, we mainly summarize the principles and applications of reporter systems based on NHEJ and SSA repair mechanisms, which may provide references for related studies in the future.

  8. Systems Analysis Department. Annual progress report 1996

    Energy Technology Data Exchange (ETDEWEB)

    Larsen, H.; Olsson, C.; Petersen, K.E. [eds.]

    1997-03-01

    The report describes the work of the Systems Analysis Department at Risoe National Laboratory during 1996. The department is undertaking research within Simulation and Optimisation of Energy Systems, Energy and Environment in Developing Countries - UNEP Centre, Integrated Environmental and Risk Management and Man/Machine Interaction. The report includes lists of publications, lectures, committees and staff members. (au) 131 refs.

  9. Genome-wide Studies of Mycolic Acid Bacteria: Computational Identification and Analysis of a Minimal Genome

    KAUST Repository

    Kamanu, Frederick Kinyua

    2012-12-01

    The mycolic acid bacteria are a distinct suprageneric group of asporogenous Gram-positive, high GC-content bacteria, distinguished by the presence of mycolic acids in their cell envelope. They exhibit great diversity in their cell and morphology; although primarily non-pathogens, this group contains three major pathogens: Mycobacterium leprae, Mycobacterium tuberculosis complex, and Corynebacterium diphtheriae. Although the mycolic acid bacteria are a clearly defined group of bacteria, the taxonomic relationships between its constituent genera and species are less well defined. Two approaches were tested for their suitability in describing the taxonomy of the group. First, a Multilocus Sequence Typing (MLST) experiment was assessed and found to be superior to monophyletic (16S small ribosomal subunit) in delineating a total of 52 mycolic acid bacterial species. Phylogenetic inference was performed using the neighbor-joining method. To further refine phylogenetic analysis and to take advantage of the widespread availability of bacterial genome data, a computational framework that simulates DNA-DNA hybridisation was developed and validated using multiscale bootstrap resampling. The tool classifies microbial genomes based on whole genome DNA, and was deployed as a web application using PHP and JavaScript. It is accessible online at http://cbrc.kaust.edu.sa/dna_hybridization/. A third study applied computational and statistical methods to the identification and analysis of a putative minimal mycolic acid bacterial genome so as to better understand (1) the genomic requirements to encode a mycolic acid bacterial cell and (2) the role and type of genes and genetic elements that lead to the massive increase in genome size in environmental mycolic acid bacteria. Using a reciprocal comparison approach, a total of 690 orthologous gene clusters forming a putative minimal genome were identified across 24 mycolic acid bacterial species. In order to identify new potential drug

  10. Functional genomic analysis of C. elegans molting.

    Directory of Open Access Journals (Sweden)

    Alison R Frand

    2005-10-01

    Full Text Available Although the molting cycle is a hallmark of insects and nematodes, neither the endocrine control of molting via size, stage, and nutritional inputs nor the enzymatic mechanism for synthesis and release of the exoskeleton is well understood. Here, we identify endocrine and enzymatic regulators of molting in C. elegans through a genome-wide RNA-interference screen. Products of the 159 genes discovered include annotated transcription factors, secreted peptides, transmembrane proteins, and extracellular matrix enzymes essential for molting. Fusions between several genes and green fluorescent protein show a pulse of expression before each molt in epithelial cells that synthesize the exoskeleton, indicating that the corresponding proteins are made in the correct time and place to regulate molting. We show further that inactivation of particular genes abrogates expression of the green fluorescent protein reporter genes, revealing regulatory networks that might couple the expression of genes essential for molting to endocrine cues. Many molting genes are conserved in parasitic nematodes responsible for human disease, and thus represent attractive targets for pesticide and pharmaceutical development.

  11. Dyneins across eukaryotes: a comparative genomic analysis.

    Science.gov (United States)

    Wickstead, Bill; Gull, Keith

    2007-12-01

    Dyneins are large minus-end-directed microtubule motors. Each dynein contains at least one dynein heavy chain (DHC) and a variable number of intermediate chains (IC), light intermediate chains (LIC) and light chains (LC). Here, we used genome sequence data from 24 diverse eukaryotes to assess the distribution of DHCs, ICs, LICs and LCs across Eukaryota. Phylogenetic inference identified nine DHC families (two cytoplasmic and seven axonemal) and six IC families (one cytoplasmic). We confirm that dyneins have been lost from higher plants and show that this is most likely because of a single loss of cytoplasmic dynein 1 from the ancestor of Rhodophyta and Viridiplantae, followed by lineage-specific losses of other families. Independent losses in Entamoeba mean that at least three extant eukaryotic lineages are entirely devoid of dyneins. Cytoplasmic dynein 2 is associated with intraflagellar transport (IFT), but in two chromalveolate organisms, we find an IFT footprint without the retrograde motor. The distribution of one family of outer-arm dyneins accounts for 2-headed or 3-headed outer-arm ultrastructures observed in different organisms. One diatom species builds motile axonemes without any inner-arm dyneins (IAD), and the unexpected conservation of IAD I1 in non-flagellate algae and LC8 (DYNLL1/2) in all lineages reveals a surprising fluidity to dynein function.

  12. Dynamics and vibrations progress in nonlinear analysis

    CERN Document Server

    Kachapi, Seyed Habibollah Hashemi

    2014-01-01

    Dynamical and vibratory systems are basically an application of mathematics and applied sciences to the solution of real world problems. Before being able to solve real world problems, it is necessary to carefully study dynamical and vibratory systems and solve all available problems in case of linear and nonlinear equations using analytical and numerical methods. It is of great importance to study nonlinearity in dynamics and vibration; because almost all applied processes act nonlinearly, and on the other hand, nonlinear analysis of complex systems is one of the most important and complicated tasks, especially in engineering and applied sciences problems. There are probably a handful of books on nonlinear dynamics and vibrations analysis. Some of these books are written at a fundamental level that may not meet ambitious engineering program requirements. Others are specialized in certain fields of oscillatory systems, including modeling and simulations. In this book, we attempt to strike a balance between th...

  13. Primer to analysis of genomic data using R

    CERN Document Server

    Gondro, Cedric

    2015-01-01

    Through this book, researchers and students will learn to use R for analysis of large-scale genomic data and how to create routines to automate analytical steps. The philosophy behind the book is to start with real world raw datasets and perform all the analytical steps needed to reach final results. Though theory plays an important role, this is a practical book for advanced undergraduate and graduate classes in bioinformatics, genomics and statistical genetics or for use in lab sessions. This book is also designed to be used by students in computer science and statistics who want to learn the practical aspects of genomic analysis without delving into algorithmic details. The datasets used throughout the book may be downloaded from the publisher’s website.  Chapters show how to handle and manage high-throughput genomic data, create automated workflows and speed up analyses in R. A wide range of R packages useful for working with genomic data are illustrated with practical examples. In recent years R has b...

  14. Castor bean organelle genome sequencing and worldwide genetic diversity analysis.

    Directory of Open Access Journals (Sweden)

    Maximo Rivarola

    Full Text Available Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes extracting selected reads from the available whole genome shotgun reads. Using the chloroplast reference genome we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade.

  15. Castor bean organelle genome sequencing and worldwide genetic diversity analysis.

    Science.gov (United States)

    Rivarola, Maximo; Foster, Jeffrey T; Chan, Agnes P; Williams, Amber L; Rice, Danny W; Liu, Xinyue; Melake-Berhan, Admasu; Huot Creasy, Heather; Puiu, Daniela; Rosovitz, M J; Khouri, Hoda M; Beckstrom-Sternberg, Stephen M; Allan, Gerard J; Keim, Paul; Ravel, Jacques; Rabinowicz, Pablo D

    2011-01-01

    Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes extracting selected reads from the available whole genome shotgun reads. Using the chloroplast reference genome we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade.

  16. Castor Bean Organelle Genome Sequencing and Worldwide Genetic Diversity Analysis

    Science.gov (United States)

    Chan, Agnes P.; Williams, Amber L.; Rice, Danny W.; Liu, Xinyue; Melake-Berhan, Admasu; Huot Creasy, Heather; Puiu, Daniela; Rosovitz, M. J.; Khouri, Hoda M.; Beckstrom-Sternberg, Stephen M.; Allan, Gerard J.; Keim, Paul; Ravel, Jacques; Rabinowicz, Pablo D.

    2011-01-01

    Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes extracting selected reads from the available whole genome shotgun reads. Using the chloroplast reference genome we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade. PMID:21750729

  17. Sequencing and analysis of the giant panda genome

    Institute of Scientific and Technical Information of China (English)

    YANG HuanMing

    2010-01-01

    The giant panda (Ailuropoda melanoleuca) is loved all over the world and is considered a symbol of China, as illustrated by its being one of the mascots for the Beijing 2008 Olympic Games. It is also one of the world's most endangered animals and a flagship species for conservation. Using next-generation sequencing technology (Illumina Genome Analyzer) and our in-house assembly software, we have generated the first map of the giant panda genome sequence. This map will provide an unparalleled amount of information to aid in understanding the genetic and biological nature of this unique species and will contribute significantly to disease control and conservation efforts for this endangered species. In March 2008, the giant panda genome sequencing and analysis project was started at the Beijing Genomics Institute (BGI) in Shenzhen with collaborators from the Kunming Institute of Zoology and the Chengdu Research Base of Giant Panda Breeding. On 21 Jan. 2010, this collaboration resulted in the publication, as a cover story in the journal Nature, of the sequencing and analysis of the giant panda genome.

  18. Comparative analysis of methods for genome-wide nucleosome cartography.

    Science.gov (United States)

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
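    The final step described above, turning aligned fragments into a genome-wide occupancy map, can be illustrated with a minimal sketch. This is not the NUCwave algorithm; it is a plain per-base coverage count, and the function name, the half-open coordinate convention and the toy fragments are all assumptions made for illustration.

```python
def occupancy(fragments, region_length):
    """Per-base fragment coverage over a region, as a crude occupancy proxy.

    fragments: iterable of (start, end) pairs, half-open [start, end),
               in coordinates relative to the region (hypothetical input).
    """
    diff = [0] * (region_length + 1)  # difference array
    for start, end in fragments:
        start = max(0, start)
        end = min(region_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    coverage, running = [], 0
    for delta in diff[:region_length]:
        running += delta
        coverage.append(running)
    return coverage


# Toy fragments, invented for illustration only.
toy_fragments = [(0, 5), (3, 8), (4, 10)]
toy_coverage = occupancy(toy_fragments, 10)
print(toy_coverage)
```

    Real pipelines usually smooth or normalise such a profile before calling nucleosome positions; the sketch stops at raw counts.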

  19. Progress on the CWU READI Analysis Center

    Science.gov (United States)

    Melbourne, T. I.; Szeliga, W. M.; Santillan, V. M.; Scrivner, C.

    2015-12-01

    Real-time GPS position streams are desirable for a variety of seismic monitoring and hazard mitigation applications. We report on progress in our development of a comprehensive real-time GPS-based seismic monitoring system for the Cascadia subduction zone. This system is based on 1 Hz point position estimates computed in the ITRF08 reference frame. Convergence from phase and range observables to point position estimates is accelerated using a Kalman filter based, on-line stream editor that produces independent estimations of carrier phase integer biases and other parameters. Positions are then estimated using a short-arc approach and algorithms from JPL's GIPSY-OASIS software with satellite clock and orbit products from the International GNSS Service (IGS). The resulting positions show typical RMS scatter of 2.5 cm in the horizontal and 5 cm in the vertical with latencies below 2 seconds. To facilitate the use of these point position streams for applications such as seismic monitoring, we broadcast real-time positions and covariances using custom-built aggregation-distribution software based on RabbitMQ messaging platform. This software is capable of buffering 24-hour streams for hundreds of stations and providing them through a REST-ful web interface. To demonstrate the power of this approach, we have developed a Java-based front-end that provides a real-time visual display of time-series, displacement vector fields, and map-view, contoured, peak ground displacement. This Java-based front-end is available for download through the PANGA website. We are currently analyzing 80 PBO and PANGA stations along the Cascadia margin and gearing up to process all 400+ real-time stations that are operating in the Pacific Northwest, many of which are currently telemetered in real-time to CWU. These will serve as milestones towards our over-arching goal of extending our processing to include all of the available real-time streams from the Pacific rim. In addition, we have

  20. In vitro analysis of integrated global high-resolution DNA methylation profiling with genomic imbalance and gene expression in osteosarcoma.

    Directory of Open Access Journals (Sweden)

    Bekim Sadikovic

    Full Text Available Genetic and epigenetic changes contribute to deregulation of gene expression and development of human cancer. Changes in DNA methylation are key epigenetic factors regulating gene expression and genomic stability. Recent progress in microarray technologies has resulted in the development of high-resolution platforms for profiling genetic, epigenetic and gene expression changes. Osteosarcoma (OS) is a pediatric bone tumor with a characteristically high level of numerical and structural chromosomal changes. Furthermore, little is known about DNA methylation changes in OS. Our objective was to develop an integrative approach for analysis of high-resolution epigenomic, genomic, and gene expression profiles in order to identify functional epi/genomic differences between OS cell lines and normal human osteoblasts. A combination of Affymetrix Promoter Tiling Arrays for DNA methylation, the Agilent array-CGH platform for genomic imbalance and the Affymetrix Gene 1.0 platform for gene expression analysis was used. As a result, an integrative high-resolution approach for interrogation of genome-wide tumour-specific changes in DNA methylation was developed. This approach was used to provide the first genomic DNA methylation maps, and to identify and validate genes with aberrant DNA methylation in OS cell lines. This first integrative analysis of global cancer-related changes in DNA methylation, genomic imbalance, and gene expression has provided comprehensive evidence of the cumulative roles of epigenetic and genetic mechanisms in deregulation of gene expression networks.

  1. Genome-wide gene expression analysis of anguillid herpesvirus 1

    NARCIS (Netherlands)

    Beurden, van S.J.; Peeters, B.P.H.; Rottier, P.J.M.; Davison, A.A.; Engelsma, M.Y.

    2013-01-01

    Background Whereas temporal gene expression in mammalian herpesviruses has been studied extensively, little is known about gene expression in fish herpesviruses. Here we report a genome-wide transcription analysis of a fish herpesvirus, anguillid herpesvirus 1, in cell culture, studied during the

  2. Integrated translational genomics for analysis of complex traits in sorghum

    Science.gov (United States)

    We will report on the integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of identifying genes controlling important agronomic traits and tran...

  3. Genome-Wide Association Analysis in Primary Sclerosing Cholangitis

    NARCIS (Netherlands)

    T.H. Karlsen; A. Franke; E. Melum; A. Kaser; J.R. Hov; T. Balschun; B.A. Lie; A. Bergquist; C. Schramm; T.J. Weismüller; D. Gotthardt; C. Rust; E.E.R. Philipp; T. Fritz; L. Henckaerts; R. Weersma; P. Stokkers; C.Y. Ponsioen; C. Wijmenga; M. Sterneck; M. Nothnagel; J. Hampe; A. Teufel; H. Runz; P. Rosenstiel; A. Stiehl; S. Vermeire; U. Beuers; M. Manns; E. Schrumpf; K.M. Boberg; S. Schreiber

    2010-01-01

    BACKGROUND & AIMS: We aimed to characterize the genetic susceptibility to primary sclerosing cholangitis (PSC) by means of a genome-wide association analysis of single nucleotide polymorphism (SNP) markers. METHODS: A total of 443,816 SNPs on the Affymetrix SNP Array 5.0 (Affymetrix, Santa Clara, CA

  4. Comparative analysis of whole genome structure of Streptococcus suis using whole genome PCR scanning

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China in 2005. The outbreak is atypical for the apparent large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium has changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScanning), comparative sequencing of several known virulence factor genes and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes are identical in the four strains, and MLST results showed that the four isolates studied and the reference strain all belonged to the ST1 complex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.

  6. Genome bioinformatic analysis of nonsynonymous SNPs

    Directory of Open Access Journals (Sweden)

    Todd John A

    2007-08-01

    Full Text Available Abstract Background Genome-wide association studies of common diseases for common, low penetrance causal variants are underway. A proportion of these will alter protein sequences, the most common of which is the non-synonymous single nucleotide polymorphism (nsSNP). It would be an advantage if the functional effects of an nsSNP on protein structure and function could be predicted, both for the final identification process of a causal variant in a disease-associated chromosome region, and in further functional analyses of the nsSNP and its disease-associated protein. Results In the present report we have compared and contrasted structure- and sequence-based methods of prediction to over 5500 genes carrying nearly 24,000 nsSNPs, by employing an automatic comparative modelling procedure to build models for the genes. The nsSNP information came from two sources, the OMIM database which are rare (minor allele frequency, MAF, 0.05, have no known link to a disease. For over 40% of the nsSNPs, structure-based methods predicted which of these sequence changes are likely to either disrupt the structure of the protein or interfere with the function or interactions of the protein. For the remaining 60%, we generated sequence-based predictions. Conclusion We show that, in general, the prediction tools are able to distinguish disease-causing mutations from those mutations which are thought to have a neutral effect. We give examples of mutations in genes that are predicted to be deleterious and may have a role in disease. Contrary to previous reports, we also show that rare mutations are consistently predicted to be deleterious as often as commonly occurring nsSNPs.

  7. Sequencing and annotated analysis of an Estonian human genome.

    Science.gov (United States)

    Lilleoja, Rutt; Sarapik, Aili; Reimann, Ene; Reemann, Paula; Jaakma, Ülle; Vasar, Eero; Kõks, Sulev

    2012-02-01

    In the present study we describe the sequencing and annotated analysis of the genome of an Estonian individual. Using SOLiD technology we generated 2,449,441,916 50-bp reads. Bioscope version 1.3 was used for mapping and pairing of reads to the NCBI human genome reference (build 36, hg18). Bioscope also enables annotation of the results of variant (tertiary) analysis. On average 75.5% of reads were mapped, giving a total coverage of 107.72 Gb and a mean fold coverage of 34.6. We found 3,482,975 SNPs, of which 352,492 were novel. 21,222 SNPs were in coding regions: 10,649 were synonymous SNPs, 10,360 were nonsynonymous missense SNPs, 155 were nonsynonymous nonsense SNPs and 58 were nonsynonymous frameshifts. We identified 219 CNVs with a total base pair coverage of 37,326,300 bp and 87,451 large insertion/deletion polymorphisms covering 10,152,256 bp of the genome. In addition, we found 285,864 small insertion/deletion polymorphisms, of which 133,969 were novel. Finally, we identified 53 inversions, 19 overlapped genes and 2 overlapped exons. Interestingly, we found a region in chromosome 6 to be enriched with coding SNPs and CNVs. This study confirms previous findings that our genomes are more complex and variable than previously thought. Therefore, sequencing of personal genomes followed by annotation would improve the analysis of heritability of phenotypes and our understanding of the functions of the genome.
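    The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and the "implied genome size" is simply total coverage divided by mean fold coverage, which is an assumption about how the two numbers relate rather than something the abstract states.

```python
# Coding-region SNP classes as reported in the abstract.
coding_snps = {
    "synonymous": 10_649,
    "missense": 10_360,
    "nonsense": 155,
    "frameshift": 58,
}
total_coding = sum(coding_snps.values())

# Coverage figures as reported in the abstract.
total_coverage_gb = 107.72
mean_fold_coverage = 34.6
# Assumption: fold coverage = total mapped bases / reference length.
implied_genome_gb = total_coverage_gb / mean_fold_coverage

# Share of all SNPs that were reported as novel.
novel_snp_fraction = 352_492 / 3_482_975

print(total_coding)
print(round(implied_genome_gb, 2))
print(round(novel_snp_fraction, 3))
```

    The four coding classes sum to the 21,222 coding SNPs quoted, and the implied reference length comes out at roughly 3.1 Gb, consistent with a human genome build.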

  8. Sequencing and Analysis of Neanderthal Genomic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith,Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo,Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2006-06-13

    Recovery and analysis of multiple Neanderthal autosomal sequences using a metagenomic approach reveals that modern humans and Neanderthals split ~400,000 years ago, without significant evidence of subsequent admixture.

  9. Analysis of segmental duplications, mouse genome synteny and recurrent cancer-associated amplicons in human chromosome 6p21-p12.

    Science.gov (United States)

    Martin, J W; Yoshimoto, M; Ludkovski, O; Thorner, P S; Zielenska, M; Squire, J A; Nuin, P A S

    2010-06-01

    It has been proposed that regions of microhomology in the human genome could facilitate genomic rearrangements, copy number transitions, and rapid genomic change during tumor progression. To investigate this idea, this study examines the role of repetitive sequence elements, and corresponding syntenic mouse genomic features, in targeting cancer-associated genomic instability of specific regions of the human genome. Automated database-mining algorithms designed to search for frequent copy number transitions and genomic breakpoints were applied to 2 publicly-available online databases and revealed that 6p21-p12 is one of the regions of the human genome most frequently involved in tumor-specific alterations. In these analyses, 6p21-p12 exhibited the highest frequency of genomic amplification in osteosarcomas. Analysis of repetitive elements in regions of homology between human chromosome 6p and the syntenic regions of the mouse genome revealed a strong association between the location of segmental duplications greater than 5 kilobase-pairs and the position of discontinuities at the end of the syntenic region. The presence of clusters of segmental duplications flanking these syntenic regions also correlated with a high frequency of amplification and genomic alteration. Collectively, the experimental findings, in silico analyses, and comparative genomic studies presented here suggest that segmental duplications may facilitate cancer-associated copy number transitions and rearrangements at chromosome 6p21-p12. This process may involve homology-dependent DNA recombination and/or repair, which may also contribute towards the overall plasticity of the human genome.

  10. Whole-genome sequence-based analysis of thyroid function

    OpenAIRE

    Taylor, Peter N; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J.; Traglia, Michela; Brown, Suzanne J.; Mullin, Benjamin H; Shihab, Hashem A.; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R.; Beilby, John P.; Charoen, Pimphen

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 1...

  11. Large-scale genomic analysis of ovarian carcinomas.

    Science.gov (United States)

    Gorringe, Kylie L; Campbell, Ian G

    2009-04-01

    Epithelial ovarian cancers are typified by frequent genomic aberrations that have been difficult to unravel. Recently, high-resolution array technologies have provided the first glimpse of the remarkable complexity of these aberrations with some ovarian cancers containing hundreds of copy number breakpoints, micro-deletions and amplifications. Many of these alterations contain cancer-related genes suggesting that the majority is disease-associated and not just the product of random genomic instability. Future developments such as next-generation sequencing and integrated analysis of data from multiple array platforms on large numbers of samples are poised to revolutionize our understanding of this complex disease.

  12. Organic analysis progress report FY 1997

    Energy Technology Data Exchange (ETDEWEB)

    Clauss, S.A.; Grant, K.E.; Hoopes, V.; Mong, G.M.; Steele, R.; Bellofatto, D.; Sharma, A.

    1998-04-01

    The Organic Analysis and Methods Development Task is being conducted by Pacific Northwest National Laboratory (PNNL) as part of the Organic Tank Waste Safety Project. The objective of the task is to apply developed analytical methods to identify and/or quantify the amount of particular organic species in tank wastes. In addition, this task provides analytical support for the Gas Generation Studies Task, Waste Aging, and Solubility Studies. This report presents the results from analyses of tank waste samples archived at PNNL and received from the Project Hanford Management Contractor (PHMC), which included samples associated with both the Flammable Gas and Organic Tank Waste Safety Programs. The data are discussed in Section 2.0. In addition, the results of analytical support for analyzing (1) simulated wastes for Waste Aging, (2) tank waste samples for Gas Generation, and (3) simulated wastes associated with solubility studies are discussed in Sections 3.0, 4.0, and 5.0, respectively. The latter part of FY 1997 was devoted to documenting the analytical procedures, including derivation gas chromatography/mass spectrometry (GC/MS) and GC/FID for quantitation, ion-pair chromatography (IPC), IC, and the cation exchange procedure for reducing the radioactivity of samples. The documentation of analytical procedures is included here and discussed in Section 6.0; Section 7.0 discusses other analytical procedures. The references are listed in Section 8.0 and future plans are discussed in Section 9.0. Appendix A is a preprint of a manuscript accepted for publication. Appendix B contains the cc mail messages and chain-of-custody forms for the samples received for analyses. Appendix C contains the test plan for analysis of tank waste samples.

  13. Analysis of recent segmental duplications in the bovine genome

    Directory of Open Access Journals (Sweden)

    Li Congjun

    2009-12-01

    Full Text Available Abstract Background Duplicated sequences are an important source of gene innovation and structural variation within mammalian genomes. We performed the first systematic and genome-wide analysis of segmental duplications in the modern domesticated cattle (Bos taurus). Using two distinct computational analyses, we estimated that 3.1% (94.4 Mb) of the bovine genome consists of recently duplicated sequences (≥ 1 kb in length, ≥ 90% sequence identity). Similar to other mammalian draft assemblies, almost half (47% of 94.4 Mb) of these sequences have not been assigned to cattle chromosomes. Results In this study, we provide the first experimental validation of large duplications and briefly compare their distribution on two independent bovine genome assemblies using fluorescent in situ hybridization (FISH). Our analyses suggest that most (75-90%) segmental duplications are organized into local tandem duplication clusters. Along with rodents and carnivores, these results now confidently establish tandem duplications as the most likely mammalian archetypical organization, in contrast to humans and great ape species, which show a preponderance of interspersed duplications. A cross-species survey of duplicated genes and gene families indicated that duplication, positive selection and gene conversion have shaped primates, rodents, carnivores and ruminants to different degrees for their speciation and adaptation. We identified that bovine segmental duplications corresponding to genes are significantly enriched for specific biological functions such as immunity, digestion, lactation and reproduction. Conclusion Our results suggest that in most mammalian lineages segmental duplications are organized in a tandem configuration. Segmental duplications remain problematic for genome assembly and we highlight genic regions that require higher quality sequence characterization. This study provides insights into mammalian genome evolution and generates a valuable
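    The thresholds quoted in this record (at least 1 kb long, at least 90% sequence identity) amount to a simple filter over pairwise alignments, and the percentages can be turned back into megabases. The following is a minimal sketch under stated assumptions: the `Alignment` record, the function name and the candidate alignments are hypothetical, and the authors' actual detection pipeline is not described here.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Hypothetical record for one pairwise self-alignment hit."""
    length_bp: int
    identity: float  # fraction between 0 and 1


def is_recent_duplication(aln, min_length_bp=1_000, min_identity=0.90):
    # Thresholds taken from the abstract: >= 1 kb and >= 90% identity.
    return aln.length_bp >= min_length_bp and aln.identity >= min_identity


# Invented candidates, for illustration only.
candidates = [
    Alignment(5_200, 0.97),   # passes both thresholds
    Alignment(800, 0.99),     # too short
    Alignment(12_000, 0.85),  # identity too low
    Alignment(1_000, 0.90),   # exactly on both thresholds, kept
]
kept = [a for a in candidates if is_recent_duplication(a)]

# Arithmetic on the reported figures.
duplicated_mb = 94.4
unplaced_mb = 0.47 * duplicated_mb       # share not assigned to chromosomes
implied_genome_mb = duplicated_mb / 0.031  # 94.4 Mb is 3.1% of the genome

print(len(kept), round(unplaced_mb, 1), round(implied_genome_mb))
```

    On the reported numbers, roughly 44 Mb of duplicated sequence is unplaced, and the implied assembly size is a little over 3,000 Mb.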

  14. Genome Assembly and Computational Analysis Pipelines for Bacterial Pathogens

    KAUST Repository

    Rangkuti, Farania Gama Ardhina

    2011-06-01

    Pathogens lie behind the deadliest pandemics in history. To date, the AIDS pandemic has resulted in more than 25 million fatal cases, while tuberculosis and malaria annually claim more than 2 million lives. Comparative genomic analyses are needed to gain insights into the molecular mechanisms of pathogens, but the abundance of biological data dictates that such studies cannot be performed without the assistance of computational approaches. This explains the significant need for computational pipelines for genome assembly and analyses. The aim of this research is to develop such pipelines. This work utilizes various bioinformatics approaches to analyze the high-throughput genomic sequence data that has been obtained from several strains of bacterial pathogens. A pipeline has been compiled for quality control for sequencing and assembly, and several protocols have been developed to detect contaminations. Visualizations of genomic data have been generated in various formats, in addition to alignment, homology detection and sequence variant detection. We have also implemented a metaheuristic algorithm that significantly improves bacterial genome assemblies compared to other known methods. Experiments on Mycobacterium tuberculosis H37Rv data showed that our method resulted in improvement of N50 value of up to 9697% while consistently maintaining high accuracy, covering around 98% of the published reference genome. Other improvement efforts were also implemented, consisting of iterative local assemblies and iterative correction of contiguated bases. Our result expedites the genomic analysis of virulent genes up to single base pair resolution. It is also applicable to virtually every pathogenic microorganism, propelling further research in the control of and protection from pathogen-associated diseases.
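    The N50 statistic mentioned above is the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the assembly length. A minimal sketch follows; the function names and the contig lengths are hypothetical, and the before/after values are chosen only to show how a percentage improvement of the size quoted could arise, not taken from the thesis.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def percent_improvement(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100 * (after - before) / before


# Invented contig lengths, for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Hypothetical N50 values before and after reassembly.
example_gain = percent_improvement(1_000, 97_970)

print(toy_n50, example_gain)
```

    Because N50 rewards contiguity only, it is usually reported alongside an accuracy measure, as the abstract does with reference coverage.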

  15. Numerical Analysis of Structural Progressive Collapse to Blast Loads

    Institute of Scientific and Technical Information of China (English)

    HAO Hong; WU Chengqing; LI Zhongxian; ABDULLAH A K

    2006-01-01

    After the progressive collapse of the Ronan Point apartment building in the UK in 1968, intensive research effort was spent on developing guidelines for the design of new structures or the strengthening of existing structures to prevent progressive collapse. However, only very few building design codes provide some rather general guidance, and no detailed design requirement is given. Progressive collapse of the Alfred P. Murrah Federal Building in Oklahoma City and the World Trade Centre (WTC) again sparked tremendous research interest in progressive collapse of structures. Recently, the US Department of Defence (DoD) and the US General Services Administration (GSA) issued guidelines for structure progressive collapse analysis. These two guidelines are the most commonly used, but their accuracy is not known. This paper presents numerical analysis of progressive collapse of an example frame structure subjected to blast loads. The DoD and GSA procedures are also used to analyse the same example structure. Numerical results are compared and discussed. The accuracy and the applicability of the two design guidelines are evaluated.

  16. Genome analysis of the platypus reveals unique signatures of evolution.

    Science.gov (United States)

    Warren, Wesley C; Hillier, LaDeana W; Marshall Graves, Jennifer A; Birney, Ewan; Ponting, Chris P; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P; Miethke, Pat; Waters, Paul D; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S; López-Otín, Carlos; Ordóñez, Gonzalo R; Eichler, Evan E; Chen, Lin; Cheng, Ze; Deakin, Janine E; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T; Wakefield, Matthew J; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A; Smit, Arian F A; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A; Walker, Jerilyn A; Konkel, Miriam K; Harris, Robert S; Whittington, Camilla M; Wong, Emily S W; Gemmell, Neil J; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M; Sharp, Julie A; Nicholas, Kevin R; Ray, David A; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H; Taylor, James; Jones, Russell C; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N; Pohl, Craig S; Smith, Scott M; Hou, Shunfeng; Nefedov, Mikhail; de Jong, Pieter J; Renfree, Marilyn B; Mardis, Elaine R; Wilson, Richard K

    2008-05-08

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.

  17. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor; Hibbett, David

    2011-03-14

    Fungi of the phylum Basidiomycota make up some 37 percent of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a 'core proteome' based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.

  18. Genome analysis of the platypus reveals unique signatures of evolution

    Science.gov (United States)

    Warren, Wesley C.; Hillier, LaDeana W.; Marshall Graves, Jennifer A.; Birney, Ewan; Ponting, Chris P.; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T.; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P.; Miethke, Pat; Waters, Paul D.; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S.; López-Otín, Carlos; Ordóñez, Gonzalo R.; Eichler, Evan E.; Chen, Lin; Cheng, Ze; Deakin, Janine E.; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T.; Wakefield, Matthew J.; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A.; Smit, Arian F. A.; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A.; Walker, Jerilyn A.; Konkel, Miriam K.; Harris, Robert S.; Whittington, Camilla M.; Wong, Emily S. W.; Gemmell, Neil J.; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M.; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P.; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J.; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M.; Sharp, Julie A.; Nicholas, Kevin R.; Ray, David A.; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H.; Taylor, James; Jones, Russell C.; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N.; Pohl, Craig S.; Smith, Scott M.; Hou, Shunfeng; Renfree, Marilyn B.; Mardis, Elaine R.; Wilson, Richard K.

    2009-01-01

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation. PMID:18464734

  19. Progress in neutron activation analysis for uranium

    Institute of Scientific and Technical Information of China (English)

    杜鸿善; 李贵群; 董桂芝; 李俊兰; K.H.Chiu; C.M.Wai

    1996-01-01

    A new type of extractant, sym-dibenzo-16-crown-5-oxyhydroxamic acid (HL), is introduced. The extractions of UO₂²⁺, Na⁺, K⁺, Sr²⁺, Ba²⁺ and Br⁻ were studied with HL in chloroform. The results obtained show that UO₂²⁺ can be quantitatively extracted at pH values above 5, whereas the extractions of K⁺, Na⁺, Sr²⁺, Ba²⁺ and Br⁻ are negligible in the pH range of 2 - 7. The dependences of the distribution ratio of U(VI) on the concentration of HL and on pH are both linear, and they have the same slope of 2. This suggests that U(VI) appears to form a 1:2 complex with the ligand. Uranium(VI) can be selectively separated and concentrated from interfering elements such as Na, K, Sr and Br by solvent extraction with HL under specific conditions. The recovery of uranium is nearly 100% and the radionuclear purity of uranium is greater than 99.99%. Therefore, neutron activation analysis has greatly improved the sensitivity and accuracy for the detection of trace uranium from seawater.
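
    The slope-of-2 argument in this abstract can be made explicit with a standard slope analysis. The derivation below is only a sketch: it assumes a simple cation-exchange equilibrium in which two ligand molecules each release one proton, which the abstract implies but does not state, and the symbols K_ex and D are introduced here for illustration.

```latex
\begin{align}
\mathrm{UO_2^{2+}}_{(aq)} + 2\,\mathrm{HL}_{(org)}
  &\rightleftharpoons \mathrm{UO_2L_2}_{(org)} + 2\,\mathrm{H^+}_{(aq)} \\
K_{ex} &= \frac{[\mathrm{UO_2L_2}]_{org}\,[\mathrm{H^+}]^{2}}
               {[\mathrm{UO_2^{2+}}]_{aq}\,[\mathrm{HL}]_{org}^{2}} \\
D = \frac{[\mathrm{UO_2L_2}]_{org}}{[\mathrm{UO_2^{2+}}]_{aq}}
  &= K_{ex}\,\frac{[\mathrm{HL}]_{org}^{2}}{[\mathrm{H^+}]^{2}} \\
\log D &= \log K_{ex} + 2\log[\mathrm{HL}]_{org} + 2\,\mathrm{pH}
\end{align}
```

    Under this assumed equilibrium, a plot of log D against log[HL] at fixed pH, or against pH at fixed ligand concentration, would have a slope of 2 in both cases, which is consistent with the 1:2 complex the authors infer.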

  20. Sequence analysis of the genome of carnation (Dianthus caryophyllus L.).

    Science.gov (United States)

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-06-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. 'Francesco' was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568,887,315 bp, consisting of 45,088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16,644 bp and 60,737 bp, respectively, and the longest scaffold was 1,287,144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNAs and miRNAs, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43,266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ~98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp.
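
    Several records in this list report N50 values for contigs and scaffolds. As a reminder of what that statistic measures, the following minimal Python sketch computes N50 as the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name and the toy lengths are illustrative assumptions; they are not taken from the carnation assembly or from any tool the authors used.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; N50 is the length of
    the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one sequence length")


# Toy scaffold lengths (hypothetical, for illustration only).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

    Because N50 depends only on the length distribution, it says nothing about correctness on its own, which is why such reports usually pair it with a coverage or accuracy figure.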

  1. A comprehensive analysis of bilaterian mitochondrial genomes and phylogeny.

    Science.gov (United States)

    Bernt, Matthias; Bleidorn, Christoph; Braband, Anke; Dambach, Johannes; Donath, Alexander; Fritzsch, Guido; Golombek, Anja; Hadrys, Heike; Jühling, Frank; Meusemann, Karen; Middendorf, Martin; Misof, Bernhard; Perseke, Marleen; Podsiadlowski, Lars; von Reumont, Björn; Schierwater, Bernd; Schlegel, Martin; Schrödl, Michael; Simon, Sabrina; Stadler, Peter F; Stöger, Isabella; Struck, Torsten H

    2013-11-01

    About 2800 mitochondrial genomes of Metazoa are present in NCBI RefSeq today, two thirds belonging to vertebrates. Metazoan phylogeny was recently challenged by large scale EST approaches (phylogenomics), stabilizing classical nodes while simultaneously supporting new sister group hypotheses. The use of mitochondrial data in deep phylogeny analyses was often criticized because of high substitution rates on nucleotides, large differences in amino acid substitution rate between taxa, and biases in nucleotide frequencies. Nevertheless, mitochondrial genome data might still be promising as it allows for a larger taxon sampling, while presenting a smaller amount of sequence information. We present the most comprehensive analysis of bilaterian relationships based on mitochondrial genome data. The analyzed data set comprises more than 650 mitochondrial genomes that have been chosen to represent a profound sample of the phylogenetic as well as sequence diversity. The results are based on high quality amino acid alignments obtained from a complete reannotation of the mitogenomic sequences from NCBI RefSeq database. However, the results failed to give support for many otherwise undisputed high-ranking taxa, like Mollusca, Hexapoda, Arthropoda, and suffer from extreme long branches of Nematoda, Platyhelminthes, and some other taxa. In order to identify the sources of misleading phylogenetic signals, we discuss several problems associated with mitochondrial genome data sets, e.g. the nucleotide and amino acid landscapes and a strong correlation of gene rearrangements with long branches.

  2. The Chlamydia psittaci genome: a comparative analysis of intracellular pathogens.

    Directory of Open Access Journals (Sweden)

    Anja Voigt

    Full Text Available BACKGROUND: Chlamydiaceae are a family of obligate intracellular pathogens causing a wide range of diseases in animals and humans, and facing unique evolutionary constraints not encountered by free-living prokaryotes. To investigate genomic aspects of infection, virulence and host preference we have sequenced Chlamydia psittaci, the pathogenic agent of ornithosis. RESULTS: A comparison of the genome of the avian Chlamydia psittaci isolate 6BC with the genomes of other chlamydial species, C. trachomatis, C. muridarum, C. pneumoniae, C. abortus, C. felis and C. caviae, revealed a high level of sequence conservation and synteny across taxa, with the major exception of the human pathogen C. trachomatis. Important differences manifest in the polymorphic membrane protein family specific for the Chlamydiae and in the highly variable chlamydial plasticity zone. We identified a number of psittaci-specific polymorphic membrane proteins of the G family that may be related to differences in host range and/or virulence as compared to closely related Chlamydiaceae. We calculated non-synonymous to synonymous substitution rate ratios for pairs of orthologous genes to identify putative targets of adaptive evolution and predicted type III secreted effector proteins. CONCLUSIONS: This study is the first detailed analysis of the Chlamydia psittaci genome sequence. It provides insights into the genome architecture of C. psittaci and proposes a number of novel candidate genes, mostly of as yet unknown function, that may be important for pathogen-host interactions.

  3. Viral genome analysis and knowledge management.

    Science.gov (United States)

    Kuiken, Carla; Yoon, Hyejin; Abfalterer, Werner; Gaschen, Brian; Lo, Chienchi; Korber, Bette

    2013-01-01

    One of the challenges of genetic data analysis is to combine information from sources that are distributed around the world and accessible through a wide array of different methods and interfaces. The HIV database and its footsteps, the hepatitis C virus (HCV) and hemorrhagic fever virus (HFV) databases, have made it their mission to make different data types easily available to their users. This involves a large amount of behind-the-scenes processing, including quality control and analysis of the sequences and their annotation. Gene and protein sequences are distilled from the sequences that are stored in GenBank; to this end, both submitter annotation and script-generated sequences are used. Alignments of both nucleotide and amino acid sequences are generated, manually curated, distilled into an alignment model, and regenerated in an iterative cycle that results in ever better new alignments. Annotation of epidemiological and clinical information is parsed, checked, and added to the database. User interfaces are updated, and new interfaces are added based upon user requests. Vital for its success, the database staff are heavy users of the system, which enables them to fix bugs and find opportunities for improvement. In this chapter we describe some of the infrastructure that keeps these heavily used analysis platforms alive and vital after nearly 25 years of use. The database/analysis platforms described in this chapter can be accessed at http://hiv.lanl.gov http://hcv.lanl.gov http://hfv.lanl.gov.

  4. Sequence analysis reveals mosaic genome of Aichi virus

    Directory of Open Access Journals (Sweden)

    Han Xiaohong

    2011-08-01

    Full Text Available Abstract Aichi virus is a positive-sense, single-stranded RNA virus that has been shown to be associated with diarrhea in children. In the present study, phylogenetic and recombination analyses based on the complete Aichi virus genomes available in GenBank reveal a mosaic genome sequence [GenBank: FJ890523], of which the nt 261-852 region (nt positions based on the aligned sequence file) shows a close relationship with AB010145/Japan, with 97.9% sequence identity, while the other genomic regions show a close relationship with AY747174/German, with 90.1% sequence identity. Our results will provide valuable hints for future research on Aichi virus diversity. Aichi virus is a member of the Kobuvirus genus of the Picornaviridae family [1,2] and is a positive-sense, single-stranded RNA virus. Its presence in fecal specimens of children suffering from diarrhea has been demonstrated in several Asian countries [3-6], in Brazil and Germany [7], in France [8] and in Tunisia [9]. Some reports showed a high level of seroprevalence in adults [7,10], suggesting widespread exposure to Aichi virus during childhood. The genome of Aichi virus contains 8,280 nucleotides and a poly(A) tail. The single large open reading frame (nt 713-8014 according to the strain AB010145) encodes a polyprotein of 2,432 amino acids that is cleaved into the typical picornavirus structural proteins VP0, VP3, VP1, and nonstructural proteins 2A, 2B, 2C, 3A, 3B, 3C and 3D [2,11]. Based on the phylogenetic analysis of 519-bp sequences at the 3C-3D (3CD) junction, Aichi viruses can be divided into two genotypes, A and B, with approximately 90% sequence homology [12]. Although only six complete genomes of Aichi virus are deposited in GenBank at present, mosaic genomes can be found in strains from different countries.
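
    The mosaic-genome claim rests on comparing sequence identity to different reference strains region by region. The Python sketch below shows that kind of comparison on toy data. It assumes the sequences are already aligned to equal length, uses 1-based inclusive alignment coordinates, and skips columns where either sequence has a gap, which is only one of several conventions; the function name and the toy sequences are hypothetical and are not the authors' recombination-analysis method.

```python
def percent_identity(seq_a, seq_b, start=1, end=None):
    """Percent identity of two aligned sequences over columns start..end.

    Coordinates are 1-based and inclusive. Columns in which either
    sequence has a gap ('-') are skipped (an assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    end = len(seq_a) if end is None else end
    matches = compared = 0
    region_a = seq_a[start - 1:end].upper()
    region_b = seq_b[start - 1:end].upper()
    for a, b in zip(region_a, region_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no comparable columns in the requested region")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical): the query resembles parent_a in
# columns 1-4 and parent_b in columns 7-10, as a mosaic genome would.
query = "ACGTACGTAC"
parent_a = "ACGTACGTTT"
parent_b = "TCGAACGTAC"
for label, parent in (("parent_a", parent_a), ("parent_b", parent_b)):
    print(label,
          percent_identity(query, parent, 1, 4),
          percent_identity(query, parent, 7, 10))
```

    A region whose closest relative differs from that of the rest of the genome is a hint of recombination, not proof; dedicated recombination tests are still needed to rule out rate variation or alignment artifacts.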

  5. Risk factors for progressive ischemic stroke A retrospective analysis

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    BACKGROUND: Progressive ischemic stroke has a higher fatality rate and disability rate than common cerebral infarction; thus it is important to investigate the early predictive factors related to the occurrence of progressive ischemic stroke, the potential pathological mechanism, and the risk factors amenable to early intervention, in order to prevent the occurrence of progressive ischemic stroke and improve its outcome. OBJECTIVE: To analyze the possible related risk factors in patients with progressive ischemic stroke, so as to provide a reference for the prevention and treatment of progressive ischemic stroke. DESIGN: A retrospective analysis. SETTING: Department of Neurology, General Hospital of Beijing Coal Mining Group. PARTICIPANTS: A total of 280 patients with progressive ischemic stroke were selected from the Department of Neurology, General Hospital of Beijing Coal Mining Group from March 2002 to June 2006, including 192 males and 88 females, with a mean age of (62±7) years. They all met the diagnostic standards for cerebral infarction set by the Fourth National Academic Meeting for Cerebrovascular Disease in 1995, were confirmed by CT or MRI, and were admitted within 24 hours after attack; the neurological defect progressed gradually or aggravated in gradients within 72 hours after attack, and the aggravation of neurological defect was defined as the neurological deficit score decreased by more than 2 points. Meanwhile, 200 inpatients with non-progressive ischemic stroke (135 males and 65 females) were selected as the control group. METHODS: After admission, a univariate analysis of variance was conducted using the factors of blood pressure, history of diabetes mellitus, fever, leukocytosis, levels of blood lipids, fibrinogen, blood glucose and plasma homocysteine, cerebral arterial stenosis, and CT symptoms of early infarction, and the significant factors were entered into the multivariate non-conditional logistic regression analysis. MAIN OUTCOME MEASURES

  6. Genome-wide microarray expression and genomic alterations by array-CGH analysis in neuroblastoma stem-like cells.

    Directory of Open Access Journals (Sweden)

    Raquel Ordóñez

    Full Text Available Neuroblastoma has a very diverse clinical behaviour: from spontaneous regression to a very aggressive malignant progression and resistance to chemotherapy. This heterogeneous clinical behaviour might be due to the existence of Cancer Stem Cells (CSC), a subpopulation within the tumor with stem-like cell properties: a significant proliferation capacity, a unique self-renewal capacity, and therefore, a higher ability to form new tumors. We enriched the CSC-like cell population content of two commercial neuroblastoma cell lines by the use of conditioned cell culture media for neurospheres, and compared genomic gains and losses and genome expression by array-CGH and microarray analysis, respectively (in CSC-like versus standard tumor cell culture). Although the array-CGH did not show significant differences between standard and CSC-like cells in either of the analyzed cell lines, the microarray expression analysis highlighted some of the most relevant biological processes and molecular functions that might be responsible for the CSC-like phenotype. Some of the signalling pathways detected seem to be involved in self-renewal of normal tissues (Wnt, Notch, Hh and TGF-β) and contribute to the CSC phenotype. We focused on the aberrant activation of the TGF-β and Hh signalling pathways, confirming the inhibition of repressors of the TGF-β pathway, such as SMAD6 and SMAD7, by RT-qPCR. The analysis of the Sonic Hedgehog pathway showed overexpression of PTCH1, GLI1 and SMO. We found overexpression of CD133 and CD15 in SIMA neurospheres, confirming that this cell line was particularly enriched in stem-like cells. This work shows a cross-talk among different pathways in neuroblastoma and its importance in CSC-like cells.

  7. Genome-wide Analysis of Gene Regulation

    DEFF Research Database (Denmark)

    Chen, Yun

    IP-seq and small RNA-seq, we delineated the landscape of the promoters with bidirectional transcription that yields steady-state RNA in only one direction (Paper III). A subsequent motif analysis enabled us to uncover specific DNA signals – early polyA sites – that make RNA on the reverse strand sensitive...... they regulated or if the sites had global elevated usage rates by multiple TFs. Using RNA-seq, 5’end-seq in combination with depletion of 5’exonuclease as well as nonsense-mediated decay (NMD) factors, we systematically analyzed NMD substrates as well as their degradation intermediates in human cells (Paper V......). Gene enrichment analysis on the detected NMD substrates revealed an unappreciated NMD-based regulatory mechanism of the genes hosting multiple intronic snoRNAs, which can facilitate differential expression of individual snoRNAs from a single host gene locus. Finally, supported by RNA-seq and small RNA-seq...

  8. Sequencing and Analysis of Neanderthal Genomic DNA

    OpenAIRE

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo, Svante; Pritchard, Jonathan K; Rubin, Edward M.

    2006-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library a...

  9. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Full Text Available Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genomes.

  10. Genome size determination in peronosporales (Oomycota) by Feulgen image analysis.

    Science.gov (United States)

    Voglmayr, H; Greilhuber, J

    1998-12-01

    Genome size was determined, by nuclear Feulgen staining and image analysis, in 46 accessions of 31 species of Peronosporales (Oomycota), including important plant pathogens such as Bremia lactucae, Plasmopara viticola, Pseudoperonospora cubensis, and Pseudoperonospora humuli. The 1C DNA contents ranged from 0.046 pg (45.6 Mb) to 0.163 pg (159.9 Mb). This is 0.041- to 0.144-fold that of Glycine max (soybean, 1C = 1.134 pg), which was used as an internal standard for genome size determination. The linearity of the Feulgen absorbance photometry method over this range was demonstrated by calibration of Aspergillus species (1C = 31-38 Mb) against Glycine, which revealed differences of less than 6% compared to the published CHEF data. The low coefficients of variation (usually between 5 and 10%), repeatability of the results, and compatibility with CHEF data prove the resolution power of Feulgen image analysis. The applicability and limitations of Feulgen photometry are discussed in relation to other methods of genome size determination (CHEF gel electrophoresis, reassociation kinetics, genomic reconstruction) that have been previously applied to Oomycota. Copyright 1998 Academic Press.
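
    The figures in this abstract combine two simple conversions: DNA mass in picograms to sequence length in megabases, and a sample value expressed as a fraction of the internal standard. The Python sketch below reproduces that arithmetic. It assumes the widely used factor of about 978 Mb per picogram of double-stranded DNA; the abstract does not say which factor the authors applied, so the megabase values it prints differ slightly from the published ones. The function names are illustrative.

```python
# Assumed conversion factor: 1 pg of double-stranded DNA is roughly 978 Mb.
MB_PER_PG = 978.0

# 1C value of the internal standard quoted in the abstract (Glycine max).
GLYCINE_MAX_1C_PG = 1.134


def pg_to_mb(picograms):
    """Convert a 1C DNA content in picograms to megabases."""
    return picograms * MB_PER_PG


def relative_to_standard(sample_pg, standard_pg=GLYCINE_MAX_1C_PG):
    """Express a sample's 1C content as a fraction of the standard's."""
    return sample_pg / standard_pg


# The smallest and largest 1C values reported in the abstract.
for sample_pg in (0.046, 0.163):
    print(sample_pg,
          round(pg_to_mb(sample_pg), 1),
          round(relative_to_standard(sample_pg), 3))
```

    The fold values relative to soybean round to the 0.041 and 0.144 quoted in the abstract, while the small gap in the megabase figures is what one would expect if the published numbers came from unrounded picogram measurements or a slightly different conversion factor.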

  11. Natural selection on functional modules, a genome-wide analysis.

    Science.gov (United States)

    Serra, François; Arbiza, Leonardo; Dopazo, Joaquín; Dopazo, Hernán

    2011-03-01

    Classically, the functional consequences of natural selection over genomes have been analyzed as the compound effects of individual genes. The current paradigm for large-scale analysis of adaptation is based on the observed significant deviations of rates of individual genes from neutral evolutionary expectation. This approach, which assumed independence among genes, has not been able to identify biological functions significantly enriched in positively selected genes in individual species. Alternatively, pooling related species has enhanced the search for signatures of selection. However, grouping signatures does not allow testing for adaptive differences between species. Here we introduce the Gene-Set Selection Analysis (GSSA), a new genome-wide approach to test for evidences of natural selection on functional modules. GSSA is able to detect lineage specific evolutionary rate changes in a notable number of functional modules. For example, in nine mammal and Drosophilae genomes GSSA identifies hundreds of functional modules with significant associations to high and low rates of evolution. Many of the detected functional modules with high evolutionary rates have been previously identified as biological functions under positive selection. Notably, GSSA identifies conserved functional modules with many positively selected genes, which questions whether they are exclusively selected for fitting genomes to environmental changes. Our results agree with previous studies suggesting that adaptation requires positive selection, but not every mutation under positive selection contributes to the adaptive dynamical process of the evolution of species.

  12. Natural selection on functional modules, a genome-wide analysis.

    Directory of Open Access Journals (Sweden)

    François Serra

    2011-03-01

    Full Text Available Classically, the functional consequences of natural selection over genomes have been analyzed as the compound effects of individual genes. The current paradigm for large-scale analysis of adaptation is based on the observed significant deviations of rates of individual genes from neutral evolutionary expectation. This approach, which assumed independence among genes, has not been able to identify biological functions significantly enriched in positively selected genes in individual species. Alternatively, pooling related species has enhanced the search for signatures of selection. However, grouping signatures does not allow testing for adaptive differences between species. Here we introduce the Gene-Set Selection Analysis (GSSA), a new genome-wide approach to test for evidences of natural selection on functional modules. GSSA is able to detect lineage specific evolutionary rate changes in a notable number of functional modules. For example, in nine mammal and Drosophilae genomes GSSA identifies hundreds of functional modules with significant associations to high and low rates of evolution. Many of the detected functional modules with high evolutionary rates have been previously identified as biological functions under positive selection. Notably, GSSA identifies conserved functional modules with many positively selected genes, which questions whether they are exclusively selected for fitting genomes to environmental changes. Our results agree with previous studies suggesting that adaptation requires positive selection, but not every mutation under positive selection contributes to the adaptive dynamical process of the evolution of species.

  13. Genome-Wide Detection and Analysis of Multifunctional Genes

    Science.gov (United States)

    Pritykin, Yuri; Ghersi, Dario; Singh, Mona

    2015-01-01

    Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655
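
    The core idea of this record, selecting genes annotated with at least two distinct biological processes, can be illustrated with a small Python sketch. It is a simplification under stated assumptions: annotations are given as a plain mapping from gene to process names, and "distinct" is approximated by a caller-supplied set of process pairs that should be treated as related. The function name, the data layout and the toy annotations are all hypothetical; the authors' actual criterion for distinctness is not described in the abstract.

```python
from itertools import combinations


def multifunctional_genes(annotations, related_pairs=()):
    """Return genes annotated with at least two unrelated processes.

    annotations   -- mapping of gene name to an iterable of process names
    related_pairs -- pairs of process names to treat as the same function
    """
    related = {frozenset(pair) for pair in related_pairs}
    selected = set()
    for gene, processes in annotations.items():
        for first, second in combinations(sorted(set(processes)), 2):
            if frozenset((first, second)) not in related:
                selected.add(gene)
                break
    return selected


# Toy annotations (hypothetical, for illustration only).
toy_annotations = {
    "geneA": {"cell cycle", "immune response"},
    "geneB": {"mitosis", "cell cycle"},
    "geneC": {"translation"},
}
toy_related = [("mitosis", "cell cycle")]
print(sorted(multifunctional_genes(toy_annotations, toy_related)))
```

    Filtering out related process pairs matters because hierarchical annotation schemes routinely assign a gene both a term and its broader parent, which would otherwise inflate the count of multifunctional genes.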

  14. New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits

    Directory of Open Access Journals (Sweden)

    Feltus Frank A

    2011-07-01

    Full Text Available Abstract Background Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). As in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. Results A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. Conclusions The construction of the first switchgrass BAC library and comparative analysis of
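
    The "approximately 10 times" coverage figure can be checked from the numbers quoted in the abstract. The sketch below assumes coverage is computed as (clones with inserts × average insert size) / effective genome size; whether the authors discounted the 5% of empty clones is not stated, so that step is an assumption, and the function name is illustrative.

```python
def library_coverage(n_clones, insert_fraction, mean_insert_kb, genome_gb):
    """Fold coverage of a genome by a clone library.

    n_clones        -- total clones in the library
    insert_fraction -- fraction of clones that carry an insert
    mean_insert_kb  -- average insert size in kilobases
    genome_gb       -- genome size in gigabases
    """
    total_insert_kb = n_clones * insert_fraction * mean_insert_kb
    genome_kb = genome_gb * 1_000_000  # 1 Gb = 1,000,000 kb
    return total_insert_kb / genome_kb


# Values quoted in the abstract: 147,456 clones, 95% with inserts,
# 120 kb average insert, ~1.6 Gb effective genome.
coverage = library_coverage(147_456, 0.95, 120, 1.6)
print(f"{coverage:.1f}x")
```

    Under these assumptions the result is about 10.5-fold, consistent with the stated figure; against the full 3.2 Gb genome size the same library would give roughly half that.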

  15. Reactive oxygen species, DNA damage, and error-prone repair: a model for genomic instability with progression in myeloid leukemia?

    Science.gov (United States)

    Rassool, Feyruz V; Gaymes, Terry J; Omidvar, Nader; Brady, Nicola; Beurlet, Stephanie; Pla, Marika; Reboul, Murielle; Lea, Nicholas; Chomienne, Christine; Thomas, Nicholas S B; Mufti, Ghulam J; Padua, Rose Ann

    2007-09-15

    Myelodysplastic syndromes (MDS) comprise a heterogeneous group of disorders characterized by ineffective hematopoiesis, with an increased propensity to develop acute myelogenous leukemia (AML). The molecular basis for MDS progression is unknown, but a key element in MDS disease progression is loss of chromosomal material (genomic instability). Using our two-step mouse model for myeloid leukemic disease progression involving overexpression of human mutant NRAS and BCL2 genes, we show that there is a stepwise increase in the frequency of DNA damage leading to an increased frequency of error-prone repair of double-strand breaks (DSB) by nonhomologous end-joining. There is a concomitant increase in reactive oxygen species (ROS) in these transgenic mice with disease progression. Importantly, RAC1, an essential component of the ROS-producing NADPH oxidase, is downstream of RAS, and we show that ROS production in NRAS/BCL2 mice is in part dependent on RAC1 activity. DNA damage and error-prone repair can be decreased or reversed in vivo by N-acetyl cysteine antioxidant treatment. Our data link gene abnormalities to constitutive DNA damage and increased DSB repair errors in vivo and provide a mechanism for an increase in the error rate of DNA repair with MDS disease progression. These data suggest treatment strategies that target RAS/RAC pathways and ROS production in human MDS/AML.

  16. Comparative Genome Analysis of Basidiomycete Fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Morin, Emmanuelle; Nagy, Laszlo; Manning, Gerard; Baker, Scott; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Hibbett, David; Martin, Francis; Grigoriev, Igor

    2012-03-19

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, symbionts, and plant and animal pathogens. To better understand the diversity of phenotypes in basidiomycetes, we performed a comparative analysis of 35 basidiomycete fungi spanning the diversity of the phylum. Phylogenetic patterns of lignocellulose degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay. Patterns of secondary metabolic enzymes give additional insight into the broad array of phenotypes found in the basidiomycetes. We suggest that the profile of an organism in lignocellulose-targeting genes can be used to predict its nutritional mode, and predict Dacryopinax sp. as a brown rot; Botryobasidium botryosum and Jaapia argillacea as white rots.

  17. Integrative Genomic Analysis of Complex traits

    DEFF Research Database (Denmark)

    Ehsani, Ali Reza

    In the last decade rapid development in biotechnologies has made it possible to extract extensive information about practically all levels of biological organization. An ever-increasing number of studies are reporting multilayered datasets on the entire DNA sequence, transcription, protein...... expression, and metabolite abundance of more and more populations in a multitude of environments. However, a solid model for including all of this complex information in one analysis, to disentangle genetic variation and the underlying genetic architecture of complex traits and diseases, has not yet been...... proposed. This thesis introduced a novel way to integrate such huge data sets in an efficient and informative procedure to dissect the complexity of obesity related traits (e.g. body weight, body fat, feed intake, etc) and map the flow from DNA through RNA ending with individual phenotypes....

  18. Genomic analysis of mouse retinal development.

    Directory of Open Access Journals (Sweden)

    Seth Blackshaw

    2004-09-01

    Full Text Available The vertebrate retina is comprised of seven major cell types that are generated in overlapping but well-defined intervals. To identify genes that might regulate retinal development, gene expression in the developing retina was profiled at multiple time points using serial analysis of gene expression (SAGE). The expression patterns of 1,051 genes that showed developmentally dynamic expression by SAGE were investigated using in situ hybridization. A molecular atlas of gene expression in the developing and mature retina was thereby constructed, along with a taxonomic classification of developmental gene expression patterns. Genes were identified that label both temporal and spatial subsets of mitotic progenitor cells. For each developing and mature major retinal cell type, genes selectively expressed in that cell type were identified. The gene expression profiles of retinal Müller glia and mitotic progenitor cells were found to be highly similar, suggesting that Müller glia might serve to produce multiple retinal cell types under the right conditions. In addition, multiple transcripts that were evolutionarily conserved that did not appear to encode open reading frames of more than 100 amino acids in length ("noncoding RNAs") were found to be dynamically and specifically expressed in developing and mature retinal cell types. Finally, many photoreceptor-enriched genes that mapped to chromosomal intervals containing retinal disease genes were identified. These data serve as a starting point for functional investigations of the roles of these genes in retinal development and physiology.

  19. The use of whole genome amplification to study chromosomal changes in prostate cancer: insights into genome-wide signature of preneoplasia associated with cancer progression

    Directory of Open Access Journals (Sweden)

    Squire Jeremy A

    2006-03-01

    Full Text Available Abstract Background Prostate cancer (CaP) is a disease with multifactorial etiology that includes both genetic and environmental components. The knowledge of the genetic basis of CaP has increased over the past years, mainly in the pathways that underlie tumourigenesis, progression and drug resistance. The vast majority of cases of CaP are adenocarcinomas that likely develop through a pre-malignant lesion and high-grade prostatic intraepithelial neoplasia (HPIN). Histologically, CaP is a heterogeneous disease consisting of multiple, discrete foci of invasive carcinoma and HPIN that are commonly interspersed with benign glands and stroma. This admixture with benign tissue can complicate genomic analyses in CaP. Specifically, when DNA is bulk-extracted the genetic information obtained represents an average for all of the cells within the sample. Results To minimize this problem, we obtained DNA from individual foci of HPIN and CaP by laser capture microdissection (LCM). The small quantities of DNA thus obtained were then amplified by means of multiple-displacement amplification (MDA), for use in genomic DNA array comparative genomic hybridisation (gaCGH). Recurrent chromosome copy number abnormalities (CNAs) were observed in both HPIN and CaP. In HPIN, chromosomal imbalances involving chromosome 8 were common, whilst in CaP additional chromosomal changes involving chromosomes 6, 10, 13 and 16 were also frequently observed. Conclusion An overall increase in chromosomal changes was seen in CaP compared to HPIN, suggesting a universal breakdown in chromosomal stability. The accumulation of CNAs, which occurs during this process, is non-random and may indicate chromosomal regions important in tumourigenesis. It is therefore likely that the alterations in copy number are part of a programmed cycle of events that promote tumour development, progression and survival. The combination of LCM, MDA and gaCGH is ideally suited for the identification of CNAs from

  20. SIDEKICK: Genomic data driven analysis and decision-making framework

    Directory of Open Access Journals (Sweden)

    Yoon Kihoon

    2010-12-01

    Full Text Available Abstract Background Scientists striving to unlock mysteries within complex biological systems face myriad barriers in effectively integrating available information to enhance their understanding. While experimental techniques and available data sources are rapidly evolving, useful information is dispersed across a variety of sources, and sources of the same information often do not use the same format or nomenclature. To harness these expanding resources, scientists need tools that bridge nomenclature differences and allow them to integrate, organize, and evaluate the quality of information without extensive computation. Results Sidekick, a genomic data driven analysis and decision making framework, is a web-based tool that provides a user-friendly intuitive solution to the problem of information inaccessibility. Sidekick enables scientists without training in computation and data management to pursue answers to research questions like "What are the mechanisms for disease X" or "Does the set of genes associated with disease X also influence other diseases." Sidekick enables the process of combining heterogeneous data, finding and maintaining the most up-to-date data, evaluating data sources, quantifying confidence in results based on evidence, and managing the multi-step research tasks needed to answer these questions. We demonstrate Sidekick's effectiveness by showing how to accomplish a complex published analysis in a fraction of the original time with no computational effort using Sidekick. Conclusions Sidekick is an easy-to-use web-based tool that organizes and facilitates complex genomic research, allowing scientists to explore genomic relationships and formulate hypotheses without computational effort. Possible analysis steps include gene list discovery, gene-pair list discovery, various enrichments for both types of lists, and convenient list manipulation. Further, Sidekick's ability to characterize pairs of genes offers new ways to

  1. Ensemble analysis of adaptive compressed genome sequencing strategies

    Science.gov (United States)

    2014-01-01

    Background Acquiring genomes at single-cell resolution has many applications such as in the study of microbiota. However, deep sequencing and assembly of all of the millions of cells in a sample is prohibitively costly. A property that can come to the rescue is that deep sequencing of every cell should not be necessary to capture all distinct genomes, as the majority of cells are biological replicates. Biologically important samples are often sparse in that sense. In this paper, we propose an adaptive compressed method, also known as distilled sensing, to capture all distinct genomes in a sparse microbial community with reduced sequencing effort. As opposed to group testing in which the number of distinct events is often constant and sparsity is equivalent to rarity of an event, sparsity in our case means scarcity of distinct events in comparison to the data size. Previously, we introduced the problem and proposed a distilled sensing solution based on the breadth first search strategy. We simulated the whole process which constrained our ability to study the behavior of the algorithm for the entire ensemble due to its computational intensity. Results In this paper, we modify our previous breadth first search strategy and introduce the depth first search strategy. Instead of simulating the entire process, which is intractable for a large number of experiments, we provide a dynamic programming algorithm to analyze the behavior of the method for the entire ensemble. The ensemble analysis algorithm recursively calculates the probability of capturing every distinct genome and also the expected total sequenced nucleotides for a given population profile. Our results suggest that the expected total sequenced nucleotides grows in proportion to the logarithm of the number of cells and linearly with the number of distinct genomes. The probability of missing a genome depends on its abundance and the ratio of its size over the maximum genome size in the sample. 
The modified resource

  2. Pan-Genome Analysis of Brazilian Lineage A Amoebal Mimiviruses

    Directory of Open Access Journals (Sweden)

    Felipe L. Assis

    2015-06-01

    Full Text Available Since the recent discovery of Samba virus, the first representative of the family Mimiviridae from Brazil, prospecting for mimiviruses has been conducted in different environmental conditions in Brazil. Recently, using Acanthamoeba sp., we isolated three new mimiviruses, all of lineage A of amoebal mimiviruses: Kroon virus from urban lake water; Amazonia virus from the Brazilian Amazon river; and Oyster virus from farmed oysters. The aims of this work were to sequence and analyze the genome of these new Brazilian mimiviruses (mimi-BR) and update the analysis of the Samba virus genome. The genomes of Samba virus, Amazonia virus and Oyster virus were 97%–99% similar, whereas Kroon virus had a low similarity (90%–91%) with other mimi-BR. A total of 3877 proteins encoded by mimi-BR were grouped into 974 orthologous clusters. In addition, we identified three new ORFans in the Kroon virus genome. Additional work is needed to expand our knowledge of the diversity of mimiviruses from Brazil, including if and why among amoebal mimiviruses those of lineage A predominate in the Brazilian environment.

  3. Analysis of the core genome and pangenome of Pseudomonas putida.

    Science.gov (United States)

    Udaondo, Zulema; Molina, Lázaro; Segura, Ana; Duque, Estrella; Ramos, Juan L

    2016-10-01

    Pseudomonas putida are strict aerobes that proliferate in a range of temperate niches and are of interest for environmental applications due to their capacity to degrade pollutants and ability to promote plant growth. Furthermore solvent-tolerant strains are useful for biosynthesis of added-value chemicals. We present a comprehensive comparative analysis of nine strains and the first characterization of the Pseudomonas putida pangenome. The core genome of P. putida comprises approximately 3386 genes. The most abundant genes within the core genome are those that encode nutrient transporters. Other conserved genes include those for central carbon metabolism through the Entner-Doudoroff pathway, the pentose phosphate cycle, arginine and proline metabolism, and pathways for degradation of aromatic chemicals. Genes that encode transporters, enzymes and regulators for amino acid metabolism (synthesis and degradation) are all part of the core genome, as well as various electron transporters, which enable aerobic metabolism under different oxygen regimes. Within the core genome are 30 genes for flagella biosynthesis and 12 key genes for biofilm formation. Pseudomonas putida strains share 85% of the coding regions with Pseudomonas aeruginosa; however, in P. putida, virulence factors such as exotoxins and type III secretion systems are absent.

  4. Benchmarking undedicated cloud computing providers for analysis of genomic datasets.

    Directory of Open Access Journals (Sweden)

    Seyhan Yazar

    Full Text Available A major bottleneck in biological discovery is now emerging at the computational level. Cloud computing offers a dynamic means whereby small and medium-sized laboratories can rapidly adjust their computational capacity. We benchmarked two established cloud computing services, Amazon Web Services Elastic MapReduce (EMR) on Amazon EC2 instances and Google Compute Engine (GCE), using publicly available genomic datasets (E.coli CC102 strain and a Han Chinese male genome) and a standard bioinformatic pipeline on a Hadoop-based platform. Wall-clock time for complete assembly differed by 52.9% (95% CI: 27.5-78.2) for E.coli and 53.5% (95% CI: 34.4-72.6) for human genome, with GCE being more efficient than EMR. The cost of running this experiment on EMR and GCE differed significantly, with the costs on EMR being 257.3% (95% CI: 211.5-303.1) and 173.9% (95% CI: 134.6-213.1) more expensive for E.coli and human assemblies respectively. Thus, GCE was found to outperform EMR both in terms of cost and wall-clock time. Our findings confirm that cloud computing is an efficient and potentially cost-effective alternative for analysis of large genomic datasets. In addition to releasing our cost-effectiveness comparison, we present available ready-to-use scripts for establishing Hadoop instances with Ganglia monitoring on EC2 or GCE.

  5. Sequencing and comparative analysis of the gorilla MHC genomic sequence.

    Science.gov (United States)

    Wilming, Laurens G; Hart, Elizabeth A; Coggill, Penny C; Horton, Roger; Gilbert, James G R; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L

    2013-01-01

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC.

  6. Benchmarking undedicated cloud computing providers for analysis of genomic datasets.

    Science.gov (United States)

    Yazar, Seyhan; Gooden, George E C; Mackey, David A; Hewitt, Alex W

    2014-01-01

    A major bottleneck in biological discovery is now emerging at the computational level. Cloud computing offers a dynamic means whereby small and medium-sized laboratories can rapidly adjust their computational capacity. We benchmarked two established cloud computing services, Amazon Web Services Elastic MapReduce (EMR) on Amazon EC2 instances and Google Compute Engine (GCE), using publicly available genomic datasets (E.coli CC102 strain and a Han Chinese male genome) and a standard bioinformatic pipeline on a Hadoop-based platform. Wall-clock time for complete assembly differed by 52.9% (95% CI: 27.5-78.2) for E.coli and 53.5% (95% CI: 34.4-72.6) for human genome, with GCE being more efficient than EMR. The cost of running this experiment on EMR and GCE differed significantly, with the costs on EMR being 257.3% (95% CI: 211.5-303.1) and 173.9% (95% CI: 134.6-213.1) more expensive for E.coli and human assemblies respectively. Thus, GCE was found to outperform EMR both in terms of cost and wall-clock time. Our findings confirm that cloud computing is an efficient and potentially cost-effective alternative for analysis of large genomic datasets. In addition to releasing our cost-effectiveness comparison, we present available ready-to-use scripts for establishing Hadoop instances with Ganglia monitoring on EC2 or GCE.

  7. A GeneTrek analysis of the maize genome.

    Science.gov (United States)

    Liu, Renyi; Vitte, Clémentine; Ma, Jianxin; Mahama, A Assibi; Dhliwayo, Thanda; Lee, Michael; Bennetzen, Jeffrey L

    2007-07-10

    Analysis of the sequences of 74 randomly selected BACs demonstrated that the maize nuclear genome contains approximately 37,000 candidate genes with homologues in other plant species. An additional approximately 5,500 predicted genes are severely truncated and probably pseudogenes. The distribution of genes is uneven, with approximately 30% of BACs containing no genes. BAC gene density varies from 0 to 7.9 per 100 kb, whereas most gene islands contain only one gene. The average number of genes per gene island is 1.7. Only 72% of these genes show collinearity with the rice genome. Particular LTR retrotransposon families (e.g., Gyma) are enriched on gene-free BACs, most of which do not come from pericentromeres or other large heterochromatic regions. Gene-containing BACs are relatively enriched in different families of LTR retrotransposons (e.g., Ji). Two major bursts of LTR retrotransposon activity in the last 2 million years are responsible for the large size of the maize genome, but only the more recent of these is well represented in gene-containing BACs, suggesting that LTR retrotransposons are more efficiently removed in these domains. The results demonstrate that sample sequencing and careful annotation of a few randomly selected BACs can provide a robust description of a complex plant genome.

  8. The sequence and analysis of a Chinese pig genome

    Directory of Open Access Journals (Sweden)

    Fang Xiaodong

    2012-11-01

    Full Text Available Abstract Background The pig is an economically important food source, amounting to approximately 40% of all meat consumed worldwide. Pigs also serve as an important model organism because of their similarity to humans at the anatomical, physiological and genetic level, making them very useful for studying a variety of human diseases. A pig strain of particular interest is the miniature pig, specifically the Wuzhishan pig (WZSP, as it has been extensively inbred. Its high level of homozygosity offers increased ease for selective breeding for specific traits and a more straightforward understanding of the genetic changes that underlie its biological characteristics. WZSP also serves as a promising means for applications in surgery, tissue engineering, and xenotransplantation. Here, we report the sequencing and analysis of an inbreeding WZSP genome. Results Our results reveal some unique genomic features, including a relatively high level of homozygosity in the diploid genome, an unusual distribution of heterozygosity, an over-representation of tRNA-derived transposable elements, a small amount of porcine endogenous retrovirus, and a lack of type C retroviruses. In addition, we carried out systematic research on gene evolution, together with a detailed investigation of the counterparts of human drug target genes. Conclusion Our results provide the opportunity to more clearly define the genomic character of pig, which could enhance our ability to create more useful pig models.

  9. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome.

    Directory of Open Access Journals (Sweden)

    Byrappa Venkatesh

    2007-04-01

    Full Text Available Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4x coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element-like and long interspersed element-like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark contains four putative Hox clusters, indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

  10. Survey Sequencing and Comparative Analysis of the Elephant Shark (Callorhinchus milii) Genome

    Science.gov (United States)

    Venkatesh, Byrappa; Kirkness, Ewen F; Loh, Yong-Hwee; Halpern, Aaron L; Lee, Alison P; Johnson, Justin; Dandona, Nidhi; Viswanathan, Lakshmi D; Tay, Alice; Venter, J. Craig; Strausberg, Robert L; Brenner, Sydney

    2007-01-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element–like and long interspersed element–like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark contains four putative Hox clusters, indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes. PMID:17407382

  11. CRISPR/Cas9 for genome editing: progress, implications and challenges.

    Science.gov (United States)

    Zhang, Feng; Wen, Yan; Guo, Xiong

    2014-09-15

    The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) protein 9 system provides a robust and multiplexable genome editing tool, enabling researchers to precisely manipulate specific genomic elements, and facilitating the elucidation of target gene function in biology and diseases. CRISPR/Cas9 comprises a nonspecific Cas9 nuclease and a set of programmable sequence-specific CRISPR RNAs (crRNAs), which can guide Cas9 to cleave DNA and generate double-strand breaks at target sites. The subsequent cellular DNA repair process leads to desired insertions, deletions or substitutions at target sites. The specificity of CRISPR/Cas9-mediated DNA cleavage requires target sequences matching the crRNA and a protospacer adjacent motif located downstream of the target sequences. Here, we review the molecular mechanism, applications and challenges of CRISPR/Cas9-mediated genome editing and the clinical therapeutic potential of CRISPR/Cas9 in the future.
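
    The targeting rule described above (a target sequence followed immediately downstream by a protospacer adjacent motif) can be sketched as a simple sequence scan. The abstract does not give the motif or the target length; the sketch assumes the NGG motif and 20-nucleotide target commonly associated with Streptococcus pyogenes Cas9, scans the forward strand only, and uses an illustrative function name.

```python
import re


def find_target_sites(dna, spacer_len=20):
    """List forward-strand candidate sites as (start, protospacer, pam).

    Assumptions (not from the abstract): the PAM is NGG and the
    protospacer is spacer_len nucleotides long.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Toy sequence: a 20-nt stretch followed by the PAM "TGG".
sites = find_target_sites("ACGT" * 5 + "TGG")
for start, protospacer, pam in sites:
    print(start, protospacer, pam)
```

    A practical design tool would also scan the reverse complement and score off-target matches; both are left out here to keep the example short.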

  12. Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics

    Directory of Open Access Journals (Sweden)

    Brunham Robert C

    2004-07-01

    Full Text Available Abstract Background We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
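
    For readers unfamiliar with the reliability-theory vocabulary, the standard two-parameter Weibull functions are sketched below. This is the textbook form with shape parameter β (beta) and scale parameter η (eta); how the authors map protein fold occurrence onto the variable x is not stated in the abstract, so the example values are purely illustrative.

```python
import math


def weibull_pdf(x, beta, eta):
    """Weibull probability density for x > 0 (shape beta, scale eta)."""
    if x <= 0:
        raise ValueError("x must be positive")
    z = x / eta
    return (beta / eta) * z ** (beta - 1) * math.exp(-(z ** beta))


def weibull_reliability(x, beta, eta):
    """Reliability (survival) function R(x) = exp(-(x / eta) ** beta)."""
    return math.exp(-((x / eta) ** beta))


# Illustrative values only: at x == eta, R(x) equals exp(-1) for any beta.
for beta in (0.5, 1.0, 2.0):
    print(beta, weibull_reliability(10.0, beta, 10.0))
```

    With β = 1 the distribution reduces to the exponential; β below or above 1 corresponds to a decreasing or increasing hazard rate, which is the sense in which β is read as a reliability parameter.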

  13. Genome-wide analysis of multiethnic cohorts identifies new loci influencing intraocular pressure and susceptibility to glaucoma

    OpenAIRE

    Hysi, Pirro G.; Cheng, Ching-Yu; Springelkamp, Henriët; MacGregor, Stuart; Bailey, Jessica N. Cooke; Wojciechowski, Robert; Vitart, Veronique; Nag, Abhishek; Hewitt, Alex W.; Höhn, René; Venturini, Cristina; Mirshahi, Alireza; Wishal D Ramdas; Thorleifsson, Gudmar; Vithana, Eranga

    2014-01-01

    Elevated intraocular pressure (IOP) is an important risk factor in developing glaucoma and IOP variability may herald glaucomatous development or progression. We report the results of a genome-wide association study meta-analysis of 18 population cohorts from the International Glaucoma Genetics Consortium (IGGC), comprising 35,296 multiethnic participants for IOP. We confirm genetic association of known loci for IOP and primary open angle glaucoma (POAG) and identify four new IOP loci located...

  14. Genome-wide analysis of multi-ancestry cohorts identifies new loci influencing intraocular pressure and susceptibility to glaucoma

    OpenAIRE

    Hysi, Pirro G.; Cheng, Ching-Yu; Springelkamp, Henriët; MacGregor, Stuart; Bailey, Jessica N. Cooke; Wojciechowski, Robert; Vitart, Veronique; Nag, Abhishek; Hewitt, Alex W.; Höhn, René; Venturini, Cristina; Mirshahi, Alireza; Wishal D Ramdas; Thorleifsson, Gudmar; Vithana, Eranga

    2014-01-01

    Elevated intraocular pressure (IOP) is an important risk factor in developing glaucoma, and variability in IOP might herald glaucomatous development or progression. We report the results of a genome-wide association study meta-analysis of 18 population cohorts from the International Glaucoma Genetics Consortium (IGGC), comprising 35,296 multi-ancestry participants for IOP. We confirm genetic association of known loci for IOP and primary open-angle glaucoma (POAG) and identify four new IOP-ass...

  15. Comparative analysis of Acinetobacters: three genomes for three lifestyles.

    Directory of Open Access Journals (Sweden)

    David Vallenet

    Full Text Available Acinetobacter baumannii is the source of numerous nosocomial infections in humans and therefore deserves close attention as multidrug or even pandrug resistant strains are increasingly being identified worldwide. Here we report the comparison of two newly sequenced genomes of A. baumannii. The human isolate A. baumannii AYE is multidrug resistant whereas strain SDF, which was isolated from body lice, is antibiotic susceptible. As reference for comparison in this analysis, the genome of the soil-living bacterium A. baylyi strain ADP1 was used. The most interesting dissimilarities we observed were that (i) whereas strain AYE and A. baylyi genomes harbored very few Insertion Sequence elements which could promote expression of downstream genes, strain SDF sequence contains several hundred of them that have played a crucial role in its genome reduction (gene disruptions and simple DNA loss); (ii) strain SDF has low catabolic capacities compared to strain AYE. Interestingly, the latter has even higher catabolic capacities than A. baylyi which has already been reported as a very nutritionally versatile organism. This metabolic performance could explain the persistence of A. baumannii nosocomial strains in environments where nutrients are scarce; (iii) several processes known to play a key role during host infection (biofilm formation, iron uptake, quorum sensing, virulence factors) were either different or absent, the best example of which is iron uptake. Indeed, strain AYE and A. baylyi use siderophore-based systems to scavenge iron from the environment whereas strain SDF uses an alternate system similar to the Haem Acquisition System (HAS). Taken together, all these observations suggest that the genome contents of the 3 Acinetobacters compared are partly shaped by life in distinct ecological niches: human (and more largely hospital environment), louse, soil.

  16. Genome sequencing and analysis of BCG vaccine strains.

    Directory of Open Access Journals (Sweden)

    Wen Zhang

    Full Text Available BACKGROUND: Although the Bacillus Calmette-Guérin (BCG) vaccine against tuberculosis (TB) has been available for more than 75 years, one third of the world's population is still infected with Mycobacterium tuberculosis and approximately 2 million people die of TB every year. To reduce this immense TB burden, a clearer understanding of the functional genes underlying the action of BCG and the development of new vaccines are urgently needed. METHODS AND FINDINGS: Comparative genomic analysis of 19 M. tuberculosis complex strains showed that BCG strains underwent repeated human manipulation, had higher region of deletion rates than those of natural M. tuberculosis strains, and lost several essential components such as T-cell epitopes. A total of 188 BCG strain T-cell epitopes were lost to various degrees. The non-virulent BCG Tokyo strain, which has the largest number of T-cell epitopes (359), lost 124. Here we propose that BCG strain protection variability results from different epitopes. This study is the first to present BCG as a model organism for genetics research. BCG strains have a very well-documented history and now detailed genome information. Genome comparison revealed the selection process of BCG strains under human manipulation (1908-1966). CONCLUSIONS: Our results revealed the cause of BCG vaccine strain protection variability at the genome level and supported the hypothesis that the restoration of lost BCG Tokyo epitopes is a useful future vaccine development strategy. Furthermore, these detailed BCG vaccine genome investigation results will be useful in microbial genetics, microbial engineering and other research fields.

  17. Clinical pertinence metric enables hypothesis-independent genome-phenome analysis for neurologic diagnosis.

    Science.gov (United States)

    Segal, Michael M; Abdellateef, Mostafa; El-Hattab, Ayman W; Hilbush, Brian S; De La Vega, Francisco M; Tromp, Gerard; Williams, Marc S; Betensky, Rebecca A; Gleeson, Joseph

    2015-06-01

    We describe an "integrated genome-phenome analysis" that combines both genomic sequence data and clinical information for genomic diagnosis. It is novel in that it uses robust diagnostic decision support and combines the clinical differential diagnosis and the genomic variants using a "pertinence" metric. This allows the analysis to be hypothesis-independent, not requiring assumptions about mode of inheritance, number of genes involved, or which clinical findings are most relevant. Using 20 genomic trios with neurologic disease, we find that pertinence scores averaging 99.9% identify the causative variant under conditions in which a genomic trio is analyzed and family-aware variant calling is done. The analysis takes seconds, and pertinence scores can be improved by clinicians adding more findings. The core conclusion is that automated genome-phenome analysis can be accurate, rapid, and efficient. We also conclude that an automated process offers a methodology for quality improvement of many components of genomic analysis.

  18. YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia.

    Science.gov (United States)

    Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh Vr; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh

    2015-01-16

    Yersinia is a genus of Gram-negative bacteria that includes serious pathogens such as Yersinia pestis, which causes plague, Yersinia pseudotuberculosis and Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequenced. However, there is currently no specialized platform to hold the rapidly growing Yersinia genomic data and to provide analysis tools, particularly for comparative analyses, which are required to provide improved insights into their biology, evolution and pathogenicity. To facilitate ongoing and future research on Yersinia, especially on those species generally considered non-pathogenic, a well-defined repository and analysis platform is needed to hold the Yersinia genomic data and analysis tools for the Yersinia research community. Hence, we have developed YersiniaBase, a robust and user-friendly Yersinia resource and analysis platform for the analysis of Yersinia genomic data. YersiniaBase has a total of twelve species and 232 genome sequences, of which the majority are Yersinia pestis. In order to smooth the process of searching genomic data in a large database, we implemented an Asynchronous JavaScript and XML (AJAX)-based real-time searching system in YersiniaBase. Besides incorporating existing tools, which include a JavaScript-based genome browser (JBrowse) and the Basic Local Alignment Search Tool (BLAST), YersiniaBase also has in-house developed tools: (1) Pairwise Genome Comparison tool (PGC) for comparing two user-selected genomes; (2) Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomics analysis of Yersinia genomes; (3) YersiniaTree for constructing phylogenetic trees of Yersinia. We ran analyses based on the tools and genomic data in YersiniaBase and the

  19. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Directory of Open Access Journals (Sweden)

    Wang Jintu

    2011-04-01

    Full Text Available Abstract Background Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on the zebrafish genome, which demonstrated the high similarity between zebrafish and common carp genomes, indicating the feasibility of comparative mapping between zebrafish and common carp once a physical map of common carp is available. Conclusion BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3
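
    The summary figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the abstract. The implied genome size is a back-of-envelope estimate derived from the stated 2.5% coverage, not a value reported by the authors, and the clean-read fraction assumes exactly one read was attempted per BAC end.

```python
# Figures quoted in the abstract.
n_bac_clones = 40_224        # BAC clones sequenced on both ends
n_clean_bes = 65_720         # clean BAC end sequences after processing
total_bp = 42_522_168        # total length of the clean BES
genome_fraction = 0.025      # "2.5% of common carp genome"

# Mean read length implied by the totals (the abstract reports 647 bp).
mean_read_len = total_bp / n_clean_bes

# Genome size implied by the coverage statement (an estimate, see lead-in).
implied_genome_bp = total_bp / genome_fraction

# Share of attempted end reads that survived cleaning, assuming 2 reads per clone.
clean_fraction = n_clean_bes / (2 * n_bac_clones)

print(f"mean read length: {mean_read_len:.1f} bp")
print(f"implied genome size: {implied_genome_bp / 1e9:.2f} Gb")
print(f"clean read fraction: {clean_fraction:.3f}")
```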

  20. Human genome libraries. Final progress report, February 1, 1994--August 31, 1997

    Energy Technology Data Exchange (ETDEWEB)

    Kao, Fa-Ten

    1998-01-01

    The goal of this program is to use a novel technology of chromosome microdissection and microcloning to construct chromosome region-specific libraries as resources for various human genome program studies. Region specific libraries have been constructed for the entire human chromosomes 2 and 18.

  1. Genomic analysis of extra-intestinal pathogenic Escherichia coli urosepsis.

    Science.gov (United States)

    McNally, A; Alhashash, F; Collins, M; Alqasim, A; Paszckiewicz, K; Weston, V; Diggle, M

    2013-08-01

    Urosepsis is a bacteraemia infection caused by an organism previously causing an infection in the urinary tract of a patient, a diagnosis which has been classically confirmed by culture of the same species of bacteria from both blood and urine samples. Given the new insights afforded by sequencing technologies into the complicated population structures of infectious agents affecting humans, we sought to investigate urosepsis by comparing the genome sequences of blood and urine isolates of Escherichia coli from five patients with urosepsis. The results confirm the classical urosepsis hypothesis in four of the five cases, but also show the complex nature of extra-intestinal E. coli infection in the fifth case, where three distinct strains caused two distinct infections. Additionally, we show there is little to no variation in the bacterial genome as it progressed from urine to blood, and also present a minimal set of virulence genes required for bacteraemia in E. coli based on gene association. These suggest that most E. coli have the genetic propensity to cause bacteraemia.

  2. SmashCell: A software framework for the analysis of single-cell amplified genome sequences

    DEFF Research Database (Denmark)

    Harrington, Eoghan D; Arumugam, Manimozhiyan; Raes, Jeroen;

    2010-01-01

    SUMMARY: Recent advances in single-cell manipulation technology, whole genome amplification and high-throughput sequencing have now made it possible to sequence the genome of an individual cell. The bioinformatic analysis of these genomes however is far more complicated than the analysis of those...

  3. St2-80: a new FISH marker for St genome and genome analysis in Triticeae.

    Science.gov (United States)

    Wang, Long; Shi, Qinghua; Su, Handong; Wang, Yi; Sha, Lina; Fan, Xing; Kang, Houyang; Zhang, Haiqin; Zhou, Yonghong

    2017-07-01

    The St genome is one of the most fundamental genomes in Triticeae. Repetitive sequences are widely used to distinguish different genomes or species. The primary objectives of this study were to (i) screen for a new sequence that could easily distinguish the chromosomes of the St genome from those of other genomes by fluorescence in situ hybridization (FISH) and (ii) investigate the genome constitution of some species whose constitution remains uncertain and controversial. We used degenerate oligonucleotide-primed PCR (DOP-PCR), dot blot, and FISH to screen for a new marker of the St genome and to test the efficiency of this marker in the detection of St chromosomes at different ploidy levels. Signals produced by a new FISH marker (denoted St2-80) were present on the entire arm of chromosomes of the St genome, except in the centromeric region. By contrast, St2-80 signals were present only in the terminal region of chromosomes of the E, H, P, and Y genomes. No signal was detected in the A and B genomes, and only weak signals were detected in the terminal region of chromosomes of the D genome. St2-80 signals were obvious and stable in chromosomes of different genomes, whether diploid or polyploid. Therefore, St2-80 is a potentially useful FISH marker that can be used to distinguish the St genome from other genomes in Triticeae.

  4. Whole genome microarray analysis, from neonatal blood cards

    Directory of Open Access Journals (Sweden)

    Hogan Michael E

    2009-07-01

    Full Text Available Abstract Background Neonatal blood, obtained from a heel stick and stored dry on paper cards, has been the standard for birth defects screening for 50 years. Such dried blood samples are used, primarily, for analysis of small-molecule analytes. More recently, the DNA complement of such dried blood cards has been used for targeted genetic testing, such as for single nucleotide polymorphism in cystic fibrosis. Expansion of such testing to include polygenic traits, and perhaps whole genome scanning, has been discussed as a formal possibility. However, until now the amount of DNA that might be obtained from such dried blood cards has been limiting, due to inefficient DNA recovery technology. Results A new technology is employed for efficient DNA release from a standard neonatal blood card. Using standard Guthrie cards, stored an average of ten years post-collection, about 1/40th of the air-dried neonatal blood specimen (two 3 mm punches) was processed to obtain DNA that was sufficient in mass and quality for direct use in microarray-based whole genome scanning. Using that same DNA release technology, it is also shown that approximately 1/250th of the original purified DNA (about 1 ng) could be subjected to whole genome amplification, thus yielding an additional microgram of amplified DNA product. That amplified DNA product was then used in microarray analysis and yielded statistical concordance of 99% or greater to the primary, unamplified DNA sample. Conclusion Together, these data suggest that DNA obtained from less than 10% of a standard neonatal blood specimen, stored dry for several years on a Guthrie card, can support a program of genome-wide neonatal genetic testing.

  5. Comparative genomic in situ hybridization analysis on the ...

    African Journals Online (AJOL)

    AJL

    2012-04-10

    Apr 10, 2012 ... different parents/ancestors/genomes in hybrid plants to be distinguished ... sequences in common between the two species. Therefore, cGISH ... genomic organization and genome evolution in plants. (Zoller et al., 2001).

  6. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    DEFF Research Database (Denmark)

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger;

    2016-01-01

    . psychrophilum could hold at least 3373 genes, while the core genome contained 1743 genes. On average, 67 new genes were detected for every new genome added to the analysis, indicating that F. psychrophilum possesses an open pan genome. The putative virulence factors were equally distributed among isolates......, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found which corresponded to the previously described prophage 6H, and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which...... to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences were complemented with selected phenotypic characteristics of the strains. The pan genome analysis showed that F...
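
    The pan-genome and core-genome counts quoted in this record follow from simple set operations on per-isolate gene content. The sketch below shows the idea on toy data: the isolate names and gene labels are hypothetical, and real analyses first cluster genes into orthologous groups, a step this sketch takes as given.

```python
def pan_and_core(genomes):
    """Return (pan, core) gene sets for a dict of isolate -> iterable of genes.

    The pan genome is the union of all gene sets; the core genome is their
    intersection.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


def new_genes_per_genome(genomes):
    """Count genes not seen before as each genome is added, in dict order."""
    seen, counts = set(), []
    for genes in genomes.values():
        fresh = set(genes) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


# Toy gene-content table (hypothetical isolates and gene labels).
toy = {
    "isolate_A": ["a", "b", "c"],
    "isolate_B": ["a", "b", "d"],
    "isolate_C": ["a", "c", "e"],
}
pan, core = pan_and_core(toy)
print(len(pan), len(core), new_genes_per_genome(toy))
```

    A pan genome is usually called open when the count of new genes per added genome stays above zero as more isolates are included, which is the pattern the abstract reports.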

  7. Progress and knowledge gaps in Culicoides genetics, genomics and population modelling: 2003 to 2014.

    Science.gov (United States)

    Carpenter, Simon

    2016-09-30

    In the 10 years since the last international meeting on Bluetongue virus (BTV) and related Orbiviruses in Sicily, there have been huge advances in explorations of the genetics and genomics of Culicoides, culminating in the imminent release of the first full genome de novo assembly for the genus. In parallel, mathematical models used to predict Culicoides adult distribution, seasonality, and dispersal have also increased in sophistication, reflecting advances in available computational power and expertise. While these advances have focused upon the outbreaks of BTV in Europe, there is an opportunity to extend these techniques to other regions as part of global studies of the genus. This review takes a selective approach to examining the past decade of research in these areas and provides a personal viewpoint of future directions of research that may prove productive.

  8. Progress in unraveling the genetic etiology of Parkinson disease in a genomic era.

    Science.gov (United States)

    Verstraeten, Aline; Theuns, Jessie; Van Broeckhoven, Christine

    2015-03-01

    Parkinson disease (PD) and Parkinson-plus syndromes are genetically heterogeneous neurological diseases. Initial studies into the genetic causes of PD relied on classical molecular genetic approaches in well-documented case families. More recently, these approaches have been combined with exome sequencing and together have identified 15 causal genes. Additionally, genome-wide association studies (GWASs) have discovered over 25 genetic risk factors. Elucidation of the genetic architecture of sporadic and familial parkinsonism, however, has lagged behind that of simple Mendelian conditions, suggesting the existence of features confounding genetic data interpretation. Here we discuss the successes and potential pitfalls of gene discovery in PD and related disorders in the post-genomic era. With an estimated 30% of trait variance currently unexplained, tackling current limitations will further expedite gene discovery and lead to increased application of these genetic insights in molecular diagnostics using gene panel and exome sequencing strategies. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. [Progress of genome engineering technology via clustered regularly interspaced short palindromic repeats--a review].

    Science.gov (United States)

    Li, Hao; Qiu, Shaofu; Song, Hongbin

    2013-10-04

    In their survival competition with phages, bacteria and archaea gradually evolved an acquired immune system, clustered regularly interspaced short palindromic repeats (CRISPR), which uses transcribed crRNA and CRISPR-associated (Cas) proteins to specifically silence or cleave foreign double-stranded DNA. In recent years, strong interest has arisen in this primitive prokaryotic immune system, and much in-depth research is under way. Recently, researchers successfully repurposed CRISPR as an RNA-guided platform for sequence-specific control of gene expression, which provides a simple approach for selectively perturbing gene expression on a genome-wide scale. It will undoubtedly bring genome engineering into a more convenient and accurate new era.

  10. Genomic Diversity and the Microenvironment as Drivers of Progression in DCIS

    Science.gov (United States)

    2015-10-01

    microenvironment, mammographic biomarkers 3. ACCOMPLISHMENTS What were the major goals of the project? Aim 1. Determine whether genetic diversity...of genetic diversity, microenvironmental diversity, and/or mammographic biomarkers can be used to predict which DCIS tumors are most likely to...series of pilot experiments to determine the best resource (Washington University) that we will use to perform the genomic sequencing of our tumors. We

  11. Fundamentals of ROC analysis and its recent progress

    Energy Technology Data Exchange (ETDEWEB)

    Fujita, Hiroshi (Gifu Univ. (Japan). Department of Electronics and Computer Engineering); Shimura, Kazuo; Shiraishi, Junji; Nishihara, Sadamitsu; Higashida, Yoshiharu; Yamashita, Kazuya

    1993-09-01

    This professional committee report describes the 1992 activities of the Task Group for Receiver Operating Characteristic (ROC) Analysis in Digital Radiography (DR). It begins with the basis of ROC analysis, then explains ROC curve fitting and tests for statistically significant differences between ROC curves. Next, it explains localization receiver operating characteristic (LROC) and free-response receiver operating characteristic (FROC) analysis, the development from FROC to alternative free-response receiver operating characteristic (AFROC) analysis, and the 'continuous confidence method' (provisional designation) as recent progress in ROC analysis, including brief experimental results. Finally, it reports on the current state of ROC analysis in DR systems and the results of an investigation of ROC analysis of CAD. (author).
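
    As a reminder of the basis of ROC analysis that the report starts from, the sketch below builds an empirical ROC curve by sweeping a decision threshold over observer confidence scores and integrates it with the trapezoidal rule. It illustrates the empirical (non-parametric) curve only; the curve-fitting and significance-testing methods the report covers are not reproduced here, and the scores and labels are made-up toy values.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points from sweeping a threshold over the scores.

    labels: 1 for actually positive cases, 0 for actually negative cases.
    Cases with tied scores are moved across the threshold together.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy confidence ratings (hypothetical): three positive and three negative cases.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
area = auc_trapezoid(points)
print(points)
print(area)
```

    The area equals the fraction of positive-negative pairs that the scores rank correctly, here 8 of 9 pairs.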

  12. Progress toward the analysis of complex propulsion installation flow phenomenon

    Science.gov (United States)

    Kern, P. R. A.; Hopcroft, R. G.

    1983-01-01

    A trend toward replacement of parametric model testing with parametric analysis for the design of aircraft is driven by the rapidly escalating cost of wind tunnel testing, the increasing availability of large fast computers, and powerful numerical flow algorithms. In connection with the complex flow phenomena characteristic of propulsion installations, it is now necessary to employ both parametric analysis and testing for design procedures. Powerful flow analysis techniques are available to predict local flow phenomena. However, the employment of these techniques is very expensive. It is, therefore, necessary to link these analyses with less powerful and less expensive procedures for an accurate analysis of propulsion installation flowfields. However, the interfacing and coupling processes needed are not available. The present investigation is concerned with progress made regarding the development of suitable linking methods. Attention is given to methods of analysis for predicting the flow around a nacelle coupled to a highly swept wing.

  13. Integrative Genomics with Mediation Analysis in a Survival Context

    Directory of Open Access Journals (Sweden)

    Szilárd Nemes

    2013-01-01

    Full Text Available DNA copy number aberrations (DCNA) and subsequent altered gene expression profiles may have a major impact on tumor initiation, on development, and eventually on recurrence and cancer-specific mortality. However, most methods employed in integrative genomic analysis of the two biological levels, DNA and RNA, do not consider survival time. In the present note, we propose the adoption of a survival analysis-based framework for the integrative analysis of DCNA and mRNA levels to reveal their implication on patient clinical outcome with the prerequisite that the effect of DCNA on survival is mediated by mRNA levels. The specific aim of the paper is to offer a feasible framework to test the DCNA-mRNA-survival pathway. We provide statistical inference algorithms for mediation based on asymptotic results. Furthermore, we illustrate the applicability of the method in an integrative genomic analysis setting by using a breast cancer data set consisting of 141 invasive breast tumors. In addition, we provide implementation in R.

  14. Integrated genomic analysis of survival outliers in glioblastoma.

    Science.gov (United States)

    Peng, Sen; Dhruv, Harshil; Armstrong, Brock; Salhia, Bodour; Legendre, Christophe; Kiefer, Jeffrey; Parks, Julianna; Virk, Selene; Sloan, Andrew E; Ostrom, Quinn T; Barnholtz-Sloan, Jill S; Tran, Nhan L; Berens, Michael E

    2017-06-01

    To elucidate molecular features associated with disproportionate survival of glioblastoma (GB) patients, we conducted deep genomic comparative analysis of a cohort of patients receiving standard therapy (surgery plus concurrent radiation and temozolomide); "GB outliers" were identified: long-term survivor of 33 months (LTS; n = 8) versus short-term survivor of 7 months (STS; n = 10). We implemented exome, RNA, whole genome sequencing, and DNA methylation for collection of deep genomic data from STS and LTS GB patients. LTS GB showed frequent chromosomal gains in 4q12 (platelet derived growth factor receptor alpha and KIT) and 12q14.1 (cyclin-dependent kinase 4), and deletion in 19q13.33 (BAX, branched chain amino-acid transaminase 2, and cluster of differentiation 33). STS GB showed frequent deletion in 9p11.2 (forkhead box D4-like 2 and aquaporin 7 pseudogene 3) and 22q11.21 (Hypermethylated In Cancer 2). LTS GB showed 2-fold more frequent copy number deletions compared with STS GB. Gene expression differences showed the STS cohort with altered transcriptional regulators: activation of signal transducer and activator of transcription (STAT)5a/b, nuclear factor-kappaB (NF-κB), and interferon-gamma (IFNG), and inhibition of mitogen-activated protein kinase (MAPK1), extracellular signal-regulated kinase (ERK)1/2, and estrogen receptor (ESR)1. Expression-based biological concepts prominent in the STS cohort include metabolic processes, anaphase-promoting complex degradation, and immune processes associated with major histocompatibility complex class I antigen presentation; the LTS cohort features genes related to development, morphogenesis, and the mammalian target of rapamycin signaling pathway. Whole genome methylation analyses showed that a methylation signature of 89 probes distinctly separates LTS from STS GB tumors. We posit that genomic instability is associated with longer survival of GB (possibly with vulnerability to standard therapy); conversely, genomic

  15. Multiuser detection and independent component analysis-Progress and perspective

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The latest progress in multiuser detection and independent component analysis (ICA) is reviewed systematically. Then two novel classes of multiuser detection methods based on ICA algorithms and feedforward neural networks are proposed. Theoretical analysis and computer simulation show that ICA algorithms are effective for detecting multiuser signals in code-division multiple-access (CDMA) systems. The performance of these methods is not entirely identical across channels, but all of them are robust, efficient, fast and suitable for real-time implementation.

  16. Recent Progresses in Nanobiosensing for Food Safety Analysis

    Directory of Open Access Journals (Sweden)

    Tao Yang

    2016-07-01

    Full Text Available With increasing adulteration, food safety analysis has become an important research field. Nanomaterials-based biosensing holds great potential in designing highly sensitive and selective detection strategies necessary for food safety analysis. This review summarizes various function types of nanomaterials, the methods of functionalization of nanomaterials, and recent (2014–present) progress in the design and development of nanobiosensing for the detection of food contaminants including pathogens, toxins, pesticides, antibiotics, metal contaminants, and other analytes, which are sub-classified according to various recognition methods of each analyte. The existing shortcomings and future perspectives of the rapidly growing field of nanobiosensing addressing food safety issues are also discussed briefly.

  17. [Research progresses of anabolic steroids analysis in doping control].

    Science.gov (United States)

    Long, Yuanyuan; Wang, Dingzhong; Li, Ke'an; Liu, Feng

    2008-07-01

    Anabolic steroids, a class of physiologically active substances, are widely abused to improve athletic performance in human sports. They have been forbidden in sports by the International Olympic Committee since 1983. Since then, many researchers have focused on establishing reliable detection methods. In this paper, we review research progress since 2002 on different analytical methods for anabolic steroids, such as gas chromatography-mass spectrometry, liquid chromatography-mass spectrometry, immunoassay, electrochemical analysis and mass spectrometry. The prospects for the development of anabolic steroid analysis are also discussed.

  18. Analysis of the complete Fischoederius elongatus (Paramphistomidae, Trematoda) mitochondrial genome.

    Science.gov (United States)

    Yang, Xin; Zhao, Yunyang; Wang, Lixia; Feng, Hanli; Tan, Li; Lei, Weiqiang; Zhao, Pengfei; Hu, Min; Fang, Rui

    2015-05-20

    Fischoederius elongatus is an important paramphistome trematode of ruminants. Animals infected with F. elongatus often show no obvious symptoms, so the infection is easily overlooked. However, it can cause severe economic losses to the breeding industry. Knowledge of the mitochondrial genome of F. elongatus can be used for phylogenetic and epidemiological studies. The complete mt genome sequence of F. elongatus is 14,120 bp in length and contains 12 protein-coding genes, 22 tRNA genes, two rRNA genes and two non-coding regions (LNR and SNR). The gene arrangement of F. elongatus is the same as that of other trematodes, such as Fasciola hepatica and Paramphistomum cervi. Phylogenetic analyses of the concatenated amino acid sequences of the 12 protein-coding genes by maximum-likelihood and neighbor-joining methods showed that F. elongatus was closely related to P. cervi. The complete mt genome sequence of F. elongatus should provide information for phylogenetic and epidemiological studies of F. elongatus and the family Paramphistomidae.

  19. Comparative Genomic Analysis of Meningitis- and Bacteremia-Causing Pneumococci Identifies a Common Core Genome.

    Science.gov (United States)

    Kulohoma, Benard W; Cornick, Jennifer E; Chaguza, Chrispin; Yalcin, Feyruz; Harris, Simon R; Gray, Katherine J; Kiran, Anmol M; Molyneux, Elizabeth; French, Neil; Parkhill, Julian; Faragher, Brian E; Everett, Dean B; Bentley, Stephen D; Heyderman, Robert S

    2015-10-01

    Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites.
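    The core/accessory distinction used in this abstract can be illustrated with a minimal sketch. It assumes a gene presence/absence table held as Python sets; the isolate names and gene lists are toy values chosen for illustration, and the code does not reproduce the authors' double-exponential decay model, only the set arithmetic that underlies a core-genome count.

```python
def core_and_accessory(presence):
    """Split genes into core (present in every isolate) and accessory.

    presence: dict mapping isolate name -> set of gene identifiers
    (a hypothetical input format, not taken from the study).
    """
    gene_sets = list(presence.values())
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)   # genes shared by all isolates
    pan = set.union(*gene_sets)           # every gene seen at least once
    return core, pan - core


def core_size_curve(presence, order):
    """Core-genome size after adding isolates one at a time in `order`."""
    sizes = []
    running = None
    for name in order:
        genes = presence[name]
        running = set(genes) if running is None else running & genes
        sizes.append(len(running))
    return sizes


# Toy data only: three invented isolates with a handful of gene labels.
toy_isolates = {
    "blood_1": {"ply", "lytA", "pspA", "cbpA"},
    "blood_2": {"ply", "lytA", "pspA"},
    "csf_1": {"ply", "lytA", "nanA"},
}
core, accessory = core_and_accessory(toy_isolates)
curve = core_size_curve(toy_isolates, ["blood_1", "blood_2", "csf_1"])
```

    The shrinking values in `curve` are the kind of data to which a decay model could then be fitted to estimate the asymptotic core size.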

  20. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    OpenAIRE

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri

    2005-01-01

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis subsp. israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-...

  1. Evolutionary insights from suffix array-based genome sequence analysis

    Indian Academy of Sciences (India)

    Anindya Poddar; Nagasuma Chandra; Madhavi Ganapathiraju; K Sekar; Judith Klein-Seetharaman; Raj Reddy; N Balakrishnan

    2007-08-01

    Gene and protein sequence analyses, central components of studies in modern biology, are easily amenable to string matching and pattern recognition algorithms. The growing need to analyse whole genome sequences more efficiently and thoroughly has led to the emergence of new computational methods. Suffix trees and suffix arrays are data structures that are well known in many other areas and are highly suited to sequence analysis too. Here we report an improvement to the design of suffix array construction. The enhancement in versatility and scalability enabled by this approach is demonstrated through the use of real-life examples. The scalability of the algorithm to whole genomes renders it suitable for addressing many biologically interesting problems. One example is the evolutionary insight gained by analysing unigrams, bi-grams and higher n-grams, indicating that the genetic code has a direct influence on the overall composition of the genome. Further, different proteomes have been analysed for their coverage of the possible peptide space, which indicates that as much as a quarter of the total space at the tetra-peptide level is left un-sampled in prokaryotic organisms, although almost all tri-peptides can be seen in one protein or another in a proteome. In addition, distinct patterns begin to emerge for the counts of particular tetra and higher peptides, indicative of a ‘meaning’ for tetra and higher n-grams. The toolkit has also been used to demonstrate the usefulness of identifying repeats in whole proteomes efficiently. As an example, 16 members of one COG, coded by the genome of Mycobacterium tuberculosis H37Rv, have been found to contain a repeating sequence of 300 amino acids.
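    For readers unfamiliar with the data structure, the following is a minimal sketch of a suffix array, of repeat detection via adjacent suffixes, and of n-gram counting. It uses a naive sort-based construction that is only practical for short strings; it is not the improved construction reported in the abstract, and all function names are illustrative.

```python
from collections import Counter


def suffix_array(text):
    """Return start positions of all suffixes of `text` in sorted order.

    Naive construction: sorting whole suffixes is simple but far too slow
    for whole genomes, which is why specialised algorithms exist.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def longest_repeat(text):
    """Longest substring occurring at least twice (overlaps allowed).

    Repeated substrings appear as common prefixes of neighbouring
    suffixes in the suffix array, so only adjacent pairs are compared.
    """
    sa = suffix_array(text)
    best = ""
    for a, b in zip(sa, sa[1:]):
        k = 0
        while a + k < len(text) and b + k < len(text) and text[a + k] == text[b + k]:
            k += 1
        if k > len(best):
            best = text[a:a + k]
    return best


def ngram_counts(text, n):
    """Count every overlapping n-gram (n-mer) in `text`."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))
```

    For example, `suffix_array("banana")` orders the suffixes as `a`, `ana`, `anana`, `banana`, `na`, `nana`, and `longest_repeat("banana")` finds `ana`. Applied to a protein sequence, `ngram_counts(seq, 4)` gives the tetra-peptide counts from which coverage of the peptide space could be estimated.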

  2. Statistical analysis of simple repeats in the human genome

    Science.gov (United States)

    Piazza, F.; Liò, P.

    2005-03-01

    The human genome contains repetitive DNA at different levels of sequence length, number and dispersion. Highly repetitive DNA is particularly rich in homo- and di-nucleotide repeats, while middle repetitive DNA is rich in families of interspersed, mobile elements hundreds of base pairs (bp) long, including the Alu families. A link between homo- and di-polymeric tracts and mobile elements has been recently highlighted. In particular, the mobility of Alu repeats, which form 10% of the human genome, has been correlated with the length of poly(A) tracts located at one end of the Alu. These tracts have a rigid and non-bendable structure and have an inhibitory effect on nucleosomes, which normally compact the DNA. We performed a statistical analysis of the genome-wide distribution of lengths and inter-tract separations of poly(X) and poly(XY) tracts in the human genome. Our study shows that in humans the length distributions of these sequences reflect the dynamics of their expansion and DNA replication. By means of general tools from linguistics, we show that the latter play the role of highly-significant content-bearing terms in the DNA text. Furthermore, we find that such tracts are positioned in a non-random fashion, with an apparent periodicity of 150 bases. This allows us to extend the link between repetitive, highly mobile elements such as Alus and low-complexity words in human DNA. More precisely, we show that Alus are sources of poly(X) tracts, which in turn affect in a subtle way the combination and diversification of gene expression and the fixation of multigene families.
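    The two quantities measured in this study, tract lengths and inter-tract separations, can be sketched for poly(X) tracts with a regular expression. This is an illustrative sketch only: the minimum tract length of 4 is an arbitrary assumption rather than the threshold used by the authors, and poly(XY) tracts would need an analogous two-base pattern.

```python
import re


def poly_x_tracts(seq, min_len=4):
    """Find homopolymer (poly(X)) tracts of at least `min_len` bases.

    Returns a list of (start, length, base) tuples in sequence order.
    `min_len` is an assumed parameter, not a value from the study.
    """
    # One base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.end() - m.start(), m.group(1))
            for m in pattern.finditer(seq)]


def separations(tracts):
    """Number of bases between the end of each tract and the next start."""
    return [nxt[0] - (cur[0] + cur[1]) for cur, nxt in zip(tracts, tracts[1:])]


# Invented toy sequence, for illustration only.
toy_seq = "GCAAAAAGTCTTTTGCAAAAC"
toy_tracts = poly_x_tracts(toy_seq)
toy_gaps = separations(toy_tracts)
```

    Histograms of the lengths in `toy_tracts` and of the values in `toy_gaps`, computed genome-wide, are the kind of distributions such an analysis would examine.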

  3. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources.

    Directory of Open Access Journals (Sweden)

    Cassidy L Klima

    Full Text Available Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype, with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes, and performed comparative genomic analysis to identify genetic features that may contribute to pathogenesis. Possible virulence-associated genes were identified within 14 distinct prophages, including a periplasmic chaperone, a lipoprotein, a peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2 to 8 per genome but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain, with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage, suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome, targeting a UDP-N-acetylglucosamine 2-epimerase gene and a glycosyl transferase group 1 gene, are present in S1 and S6 strains only, indicating that these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype-related pathogenesis and specific targets for vaccine design

  4. Comparative analysis of whole-genome sequences of Streptococcus suis

    Institute of Scientific and Technical Information of China (English)

    LI Pengli; WEI Wu; LI Yixue; MA Yuanyuan; DING Guohui; LI Xiaoping; WANG Xiaojing; ZHANG Liwen; SUN Jingchun; WANG Yong; TU Kang; WANG Ningning; HAO Pei; WANG Chuan; CAO Zhiwei; SHI Tieliu

    2006-01-01

    The recent outbreak of Streptococcus suis in some districts of Sichuan Province, China, caused over 30 deaths and over 200 infections in humans. To study the mechanism of pathogenicity and to prevent the bacterium from spreading and infecting humans and swine, we annotated and analyzed the genomes of two strains, Streptococcus suis P1/7 and 89-1591. The P1/7 genome is 2.007 Mb long and has 1969 ORFs. In contrast, the partial genome sequence of 89-1591 is 1.98 Mb long and exists as 177 contigs with 1918 ORFs. Analysis shows that the average CDS lengths in the two genomes are very close and that the two strains share 1306 homologous ORFs. Most of the toxicity factors of the two strains are homologous, but there are still some significant differences between them. For example, of the 11 genes (cps2A-cps2K) encoding the capsule in P1/7, four (cps2A, 2B, 2I, 2J) are not detected in strain 89-1591. Likewise, the genes encoding EF and haemolysin in P1/7 are not found in strain 89-1591. In addition, the genes related to DNA replication, repair and recombination differ significantly between the strains, and there are also differences among the surface proteins. These characteristics indicate that the two strains have evolved their own specific functions to adapt to different environments and that their pathogenesis differs. We have accumulated comprehensive genomic information for future systematic studies of S. suis. Our results are helpful for disease prevention, vaccine development and drug design for S. suis.

  5. Genome-wide Analysis Identifies Novel Loci Associated with Ovarian Cancer Outcomes

    DEFF Research Database (Denmark)

    Johnatty, Sharon E; Tyrer, Jonathan P; Kar, Siddhartha;

    2015-01-01

    PURPOSE: Chemotherapy resistance remains a major challenge in the treatment of ovarian cancer. We hypothesize that germline polymorphisms might be associated with clinical outcome. EXPERIMENTAL DESIGN: We analyzed approximately 2.8 million genotyped and imputed SNPs from the iCOGS experiment...... for progression-free survival (PFS) and overall survival (OS) in 2,901 European epithelial ovarian cancer (EOC) patients who underwent first-line treatment of cytoreductive surgery and chemotherapy regardless of regimen, and in a subset of 1,098 patients treated with ≥ 4 cycles of paclitaxel and carboplatin...... at standard doses. We evaluated the top SNPs in 4,434 EOC patients, including patients from The Cancer Genome Atlas. In addition, we conducted pathway analysis of all intragenic SNPs and tested their association with PFS and OS using gene set enrichment analysis. RESULTS: Five SNPs were significantly...

  6. Dual Roles of RNF2 in Melanoma Progression | Office of Cancer Genomics

    Science.gov (United States)

    Epigenetic regulators have emerged as critical factors governing the biology of cancer. Here, in the context of melanoma, we show that RNF2 is prognostic, exhibiting progression-correlated expression in human melanocytic neoplasms. Through a series of complementary gain-of-function and loss-of-function studies in mouse and human systems, we establish that RNF2 is oncogenic and prometastatic.

  7. Genomic Diversity and the Microenvironment as Drivers of Progression in DCIS

    Science.gov (United States)

    2016-10-01

    communication. This multi-disciplinary progress puts our group into an ideal position to fully implement the aims of the project and reach our year 3 and 4...were upstaged to invasive disease at definitive surgery. The other half of 99 testing subjects have been set aside for aim 3b work. For the first

  8. Genomically amplified Akt3 activates DNA repair pathway and promotes glioma progression.

    Science.gov (United States)

    Turner, Kristen M; Sun, Youting; Ji, Ping; Granberg, Kirsi J; Bernard, Brady; Hu, Limei; Cogdell, David E; Zhou, Xinhui; Yli-Harja, Olli; Nykter, Matti; Shmulevich, Ilya; Yung, W K Alfred; Fuller, Gregory N; Zhang, Wei

    2015-03-17

    Akt is a robust oncogene that plays key roles in the development and progression of many cancers, including glioma. We evaluated the differential propensities of the Akt isoforms toward progression in the well-characterized RCAS/Ntv-a mouse model of PDGFB-driven low grade glioma. A constitutively active myristoylated form of Akt1 did not induce high-grade glioma (HGG). In stark contrast, Akt2 and Akt3 showed strong progression potential with 78% and 97% of tumors diagnosed as HGG, respectively. We further revealed that significant variations in polarity and hydropathy values among the Akt isoforms in both the pleckstrin homology domain (P domain) and regulatory domain (R domain) were critical in mediating glioma progression. Gene expression profiles from representative Akt-derived tumors indicated dominant and distinct roles for Akt3, consisting primarily of DNA repair pathways. TCGA data from human GBM closely reflected the DNA repair function, as Akt3 was significantly correlated with a 76-gene signature DNA repair panel. Consistently, compared with Akt1 and Akt2 overexpression models, Akt3-expressing human GBM cells had enhanced activation of DNA repair proteins, leading to increased DNA repair and subsequent resistance to radiation and temozolomide. Given the wide range of Akt3-amplified cancers, Akt3 may represent a key resistance factor.

  9. Recombination analysis based on the complete genome of bocavirus

    Directory of Open Access Journals (Sweden)

    Chen Shengxia

    2011-04-01

    Full Text Available Abstract Bocaviruses include bovine parvovirus, minute virus of canine, porcine bocavirus, gorilla bocavirus, and human bocaviruses 1-4 (HBoVs). Although recent reports have shown that recombination occurs in bocaviruses, no systematic study has investigated it. The present study performed phylogenetic and recombination analyses of bocaviruses over the complete genomes available in GenBank. The results confirmed that recombination exists among bocaviruses, including likely inter-genotype recombination between HBoV1 and HBoV4 and intra-genotype recombination among HBoV2 variants. Moreover, this is the first report to reveal recombination between minute viruses of canine.

  10. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    DEFF Research Database (Denmark)

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger;

    2016-01-01

    to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences was complemented with selected phenotypic characteristics of the strains. The pan genome analysis showed that F......, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found which corresponded to the previously described prophage 6H, and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which...

  11. Genomic analysis of primordial dwarfism reveals novel disease genes.

    Science.gov (United States)

    Shaheen, Ranad; Faqeih, Eissa; Ansari, Shinu; Abdel-Salam, Ghada; Al-Hassnan, Zuhair N; Al-Shidi, Tarfa; Alomar, Rana; Sogaty, Sameera; Alkuraya, Fowzan S

    2014-02-01

    Primordial dwarfism (PD) is a disease in which severely impaired fetal growth persists throughout postnatal development and results in stunted adult size. The condition is highly heterogeneous clinically, but the use of certain phenotypic aspects such as head circumference and facial appearance has proven helpful in defining clinical subgroups. In this study, we present the results of clinical and genomic characterization of 16 new patients in whom a broad definition of PD was used (e.g., 3M syndrome was included). We report a novel PD syndrome with distinct facies in two unrelated patients, each with a different homozygous truncating mutation in CRIPT. Our analysis also reveals, in addition to mutations in known PD disease genes, the first instance of biallelic truncating BRCA2 mutation causing PD with normal bone marrow analysis. In addition, we have identified a novel locus for Seckel syndrome based on a consanguineous multiplex family and identified a homozygous truncating mutation in DNA2 as the likely cause. An additional novel PD disease candidate gene XRCC4 was identified by autozygome/exome analysis, and the knockout mouse phenotype is highly compatible with PD. Thus, we add a number of novel genes to the growing list of PD-linked genes, including one which we show to be linked to a novel PD syndrome with a distinct facial appearance. PD is extremely heterogeneous genetically and clinically, and genomic tools are often required to reach a molecular diagnosis.

  12. Genome sequence and analysis of the tuber crop potato

    DEFF Research Database (Denmark)

    Xu, X.; Pan, S.; Cheng, S.

    2011-01-01

    and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade...

  13. Project manager insights: An analysis of career progression

    Directory of Open Access Journals (Sweden)

    James W Marion

    2014-08-01

    Full Text Available The project manager is key to the success of any project, but the path to becoming a successful project manager is ill-defined. In this study, the authors analyzed interview summaries from 87 project managers who answered questions about entry into the field, career progression, and advice for the new project manager, seeking to better understand the career progression of practicing project managers. Qualitative analysis techniques were used to identify recurring themes from the interview summaries. The themes and the resulting conceptual framework provide evidence that supports the development of a successful project manager career path. Further, the results suggest that individual project management competencies in soft skills are a key enabler of project execution.

  14. BioMet Toolbox: genome-wide analysis of metabolism

    DEFF Research Database (Denmark)

    Cvijovic, M.; Olivares Hernandez, Roberto; Agren, R.

    2010-01-01

    models. Systematic analysis of biological processes by means of modelling and simulations has made the identification of metabolic networks and prediction of metabolic capabilities under different conditions possible. For facilitating such systemic analysis, we have developed the BioMet Toolbox, a web......-based resource for stoichiometric analysis and for integration of transcriptome and interactome data, thereby exploiting the capabilities of genome-scale metabolic models. The BioMet Toolbox provides an effective user-friendly way to perform linear programming simulations towards maximized or minimized growth...... rates, substrate uptake rates and metabolic production rates by detecting relevant fluxes, simulate single and double gene deletions or detect metabolites around which major transcriptional changes are concentrated. These tools can be used for high-throughput in silico screening and allow fully...

  15. Comparative analysis of genomic signal processing for microarray data clustering.

    Science.gov (United States)

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that applies advanced digital signal processing methodologies to enhance genetic data analysis. It has many promising applications in bioinformatics and next-generation healthcare systems, in particular in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods (linear predictive coding, wavelet decomposition, and fractal dimension) are studied to provide a comparative evaluation of their clustering performance on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well-known statistical methods.

  16. The Human Genome Project and Mental Retardation: An Educational Program. Final Progress Report

    Energy Technology Data Exchange (ETDEWEB)

    Davis, Sharon

    1999-05-03

    The Arc, a national organization on mental retardation, conducted an educational program for members, many of whom have a family member with a genetic condition causing mental retardation. The project informed members about the Human Genome scientific efforts, conducted training regarding ethical, legal and social implications (ELSI) and involved members in issue discussions. Short reports and fact sheets on genetic and ELSI topics were disseminated to 2,200 of the Arc's leaders across the country and to other interested individuals. Materials produced by the project can be found on the Arc's web site, TheArc.org.

  17. Current progress in the biology of members of the Sporothrix schenckii complex following the genomic era.

    Science.gov (United States)

    Mora-Montes, Héctor M; Dantas, Alessandra da Silva; Trujillo-Esquivel, Elías; de Souza Baptista, Andrea R; Lopes-Bezerra, Leila M

    2015-09-01

    Sporotrichosis has been attributed for more than a century to one single etiological agent, Sporothrix schenckii. Only eight years ago, it was described that the disease is in fact caused by several pathogenic cryptic species. The present review focuses on recent advances in understanding the biology and virulence of epidemiologically relevant pathogenic species of the S. schenckii complex. The main subjects covered are the new clinical and epidemiological aspects, including diagnostic and therapeutic challenges, the development of molecular tools, the genome database and the perspectives for the study of virulence of emerging Sporothrix species.

  18. Genome-wide identification of specific oligonucleotides using artificial neural network and computational genomic analysis

    Directory of Open Access Journals (Sweden)

    Chen Jiun-Ching

    2007-05-01

    Full Text Available Abstract Background Genome-wide identification of specific oligonucleotides (oligos) is a computationally intensive task and is a requirement for designing microarray probes, primers, and siRNAs. An artificial neural network (ANN) is a machine learning technique that can effectively process complex and noisy data. Here, ANNs are applied to process the unique subsequence distribution for prediction of specific oligos. Results We present a novel and efficient algorithm, named the integration of ANN and BLAST (IAB) algorithm, to identify specific oligos. We establish the unique marker database for human and rat gene index databases using the hash table algorithm. We then create the input vectors, via the unique marker database, to train and test the ANN. The trained ANN predicted the specific oligos with high efficiency, and these oligos were subsequently verified by BLAST. To improve the prediction performance, ANN over-fitting was avoided by early stopping at the best observed error, and k-fold validation was also applied. The IAB algorithm was about 5.2, 7.1, and 6.7 times faster than the BLAST search without ANN for experimental results of 70-mer, 50-mer, and 25-mer specific oligos, respectively. In addition, the results of polymerase chain reactions showed that the primers predicted by the IAB algorithm could specifically amplify the corresponding genes. The IAB algorithm has been integrated into a previously published comprehensive web server to support microarray analysis and genome-wide iterative enrichment analysis, through which users can identify a group of desired genes and then discover the specific oligos of these genes. Conclusion The IAB algorithm has been developed to construct SpecificDB, a web server that provides a specific and valid oligo database of the probe, siRNA, and primer design for the human genome. We also demonstrate the ability of the IAB algorithm to predict specific oligos through
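    The hash-table step mentioned in this abstract can be illustrated with a small sketch: subsequences (k-mers) that occur in exactly one gene are collected as that gene's unique markers. This is an assumption-laden illustration, not the IAB algorithm itself; it omits the ANN and BLAST stages entirely, and the function name, input format and toy sequences are hypothetical.

```python
from collections import defaultdict


def unique_markers(genes, k):
    """Map each gene to the k-mers that appear in no other gene.

    genes: dict mapping gene name -> nucleotide sequence (assumed format).
    A Python dict serves as the hash table from k-mer to owning genes.
    """
    owners = defaultdict(set)
    for name, seq in genes.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)

    markers = defaultdict(list)
    for kmer, names in owners.items():
        if len(names) == 1:               # seen in exactly one gene
            markers[next(iter(names))].append(kmer)
    return {name: sorted(kmers) for name, kmers in markers.items()}


# Invented toy sequences, for illustration only.
toy_genes = {"geneA": "ACGTAC", "geneB": "CGTACC"}
toy_markers = unique_markers(toy_genes, 4)
```

    In this toy case only `ACGT` is unique to `geneA` and only `TACC` is unique to `geneB`; the shared k-mers `CGTA` and `GTAC` are discarded. Candidate oligos built from such markers would still need verification against the full database.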

  19. Severe accident analysis using dynamic accident progression event trees

    Science.gov (United States)

    Hakobyan, Aram P.

    At present, the development and analysis of Accident Progression Event Trees (APETs) are performed in a manner that is computationally time-consuming, difficult to reproduce and potentially phenomenologically inconsistent. One of the principal deficiencies lies in the static nature of conventional APETs. In conventional event tree techniques, the sequence of events is pre-determined in a fixed order based on expert judgment. The main objective of this PhD dissertation was to develop a software tool (ADAPT) for automated APET generation using the concept of dynamic event trees. As the name implies, in dynamic event trees the order and timing of events are determined by the progression of the accident. The tool determines the branching times from a severe accident analysis code based on user-specified criteria for branching. It assigns user-specified probabilities to every branch, tracks the total branch probability, and truncates branches based on the given pruning/truncation rules to avoid an unmanageable number of scenarios. The functions of a dynamic APET include prediction of the conditions, timing, and location of containment failure or bypass leading to the release of radioactive material, and calculation of the probabilities of those failures. Thus, scenarios that can potentially lead to early containment failure or bypass, such as through accident-induced failure of steam generator tubes, are of particular interest. The work also focuses on the treatment of uncertainties in severe accident phenomena such as creep rupture of major RCS components, hydrogen burn, containment failure, timing of power recovery, etc. Although the ADAPT methodology (Analysis of Dynamic Accident Progression Trees) could be applied to any severe accident analysis code, in this dissertation the approach is demonstrated by applying it to the MELCOR code [1]. A case study is presented involving station blackout with the loss of auxiliary feedwater system for a
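    The branching, probability tracking and truncation described in this abstract can be sketched generically as follows. This is not ADAPT and does not call MELCOR: the `toy_next_branch` function stands in for a simulator that decides, from the current accident state, which branching occurs next, and every event name and probability in it is invented for illustration.

```python
def expand(state, prob, next_branch, cutoff, done=None):
    """Depth-first expansion of a dynamic event tree.

    state:       tuple of outcomes so far (the scenario history)
    prob:        cumulative probability of reaching this state
    next_branch: callable returning [(outcome, probability), ...] for the
                 next branching given the state, or None when finished
    cutoff:      branches with cumulative probability below this are dropped
    """
    done = [] if done is None else done
    branch = next_branch(state)
    if branch is None:
        done.append((state, prob))        # scenario ran to completion
        return done
    for outcome, p in branch:
        child_prob = prob * p
        if child_prob < cutoff:
            continue                      # truncation rule: prune rare branch
        expand(state + (outcome,), child_prob, next_branch, cutoff, done)
    return done


def toy_next_branch(state):
    """Hypothetical stand-in for a simulator; all numbers are made up."""
    if not state:
        return [("power_recovered", 0.7), ("power_lost", 0.3)]
    if state == ("power_lost",):
        # This branching only arises on the power_lost path, so the tree
        # structure depends on how the scenario has progressed.
        return [("containment_intact", 0.9), ("containment_fails", 0.1)]
    return None


scenarios = expand((), 1.0, toy_next_branch, cutoff=0.05)
```

    With the cutoff of 0.05 the rare `containment_fails` path (cumulative probability 0.03) is pruned; setting the cutoff to 0 retains all three scenarios.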

  20. The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M.; Szeto, Ernest; Palaniappan, Krishna; Grechkin, Yuri; Chu, Ken; Chen, I-Min A.; Dubchak, Inna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2007-08-01

    The Integrated Microbial Genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov.

  1. Comparative Genomics and Transcriptomic Analysis of Mycobacterium Kansasii

    KAUST Repository

    Alzahid, Yara

    2014-04-01

    The mycobacteria are one of the most intensively studied bacterial taxa, as they cause two historically and globally well-known diseases: leprosy and tuberculosis. Mycobacteria not belonging to the tuberculosis or leprosy complexes are referred to as ‘environmental mycobacteria’ or ‘nontuberculous mycobacteria’ (NTM). Mycobacterium kansasii (M. kansasii) is one of the most frequent NTM pathogens, as it causes pulmonary disease in immunocompetent patients and pulmonary and disseminated disease in patients with various immunodeficiencies. Five subtypes of this bacterium have been documented by different molecular typing methods, showing that type I causes tuberculosis-like disease in healthy individuals and type II in immunocompromised individuals. The remaining types are said to be environmental and thus not to cause disease. The aim of this project was to conduct a comparative genomic study of M. kansasii types I-V and to investigate the gene expression levels of those types. Various comparative genomics analyses provided genomic evidence on why M. kansasii type I is considered pathogenic, focusing on three key elements involved in the virulence of mycobacteria: the ESX secretion system, phospholipase C (plcB) and the mammalian cell entry (Mce) operons. The results showed the lack of the espA operon in types II-V, which renders the ESX-1 operon dysfunctional, as espA is one of the key factors that control this secretion system. However, gene expression analysis showed this operon to be deleted in types II, III and IV. Furthermore, plcB was found to be truncated in types III and IV. Analysis of the Mce operons (1-4) shows that the mce-1 operon is duplicated, mce-2 is absent, and mce-3 and mce-4 are each present in one copy in M. kansasii types I-V. Gene expression profiles of types I-IV showed that the secreted proteins of ESX-1 were slightly upregulated in types II-IV when compared to type I and the secreted forms of ESX-5 were highly down

  2. Genomic changes defining the progression of human colorectal and cervical tumors

    OpenAIRE

    1996-01-01

    Defining changes during the carcinogenesis and progression of tumors is a major way to obtain a better understanding of the mechanisms of cancer development. We therefore investigated the carcinogenesis process in the colon-rectum and in the uterine cervix by different cytochemical, immunohistochemical and cytogenetic methods. Cell proliferation, assessed by immunohistochemical detection of the Ki-67 antigen (MIB 1 antibody), DNA ploidy, determined by image cytometry, e...

  3. Dating the age of admixture via wavelet transform analysis of genome-wide data

    NARCIS (Netherlands)

    I. Pugach (Irina); R. Matveyev (Rostislav); A. Wollstein (Andreas); M.H. Kayser (Manfred); M. Stoneking (Mark)

    2011-01-01

    We describe a PCA-based genome scan approach to analyze genome-wide admixture structure, and introduce wavelet transform analysis as a method for estimating the time of admixture. We test the wavelet transform method with simulations and apply it to genome-wide SNP data from eight admixe

  4. Technologies and techniques for analysis and use of genome information, 1997; Genome joho kaidoku riyo gijutsu no chosa kenkyu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1998-03-01

    The paper clarified the overall picture of cell functions by elucidating the functions and expression control mechanisms of the genes in genomes and the network of their interactions, and surveyed the applicability of the useful cell and protein functions thus obtained to industry. The survey was made from the viewpoints of both biology and information science. In particular, based on databases of DNA base sequences with known function, the following technologies were surveyed: technology to predict the function of DNA base sequences of unknown function; search and separation technology to acquire the genes to be functionally elucidated in a state suitable for expression; technology to obtain complete proteins by efficiently expressing those genes; and technology to analyze the function of the proteins obtained by gene expression. Further, an international symposium was held, titled 'Genome Research Opens a New World to Bioindustry (New Developments in Genome Informatics Technologies)'. With future progress in technology to decipher and use genome information, the construction of a new genome industry is anticipated. 165 refs., 44 figs., 10 tabs.

  5. IMG 4 version of the integrated microbial genomes comparative analysis system

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Chen, I-Min A. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Palaniappan, Krishna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Chu, Ken [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Szeto, Ernest [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Pillay, Manoj [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Ratner, Anna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Huang, Jinghua [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Woyke, Tanja [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Huntemann, Marcel [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Anderson, Iain [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Billis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Varghese, Neha [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Mavromatis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Pati, Amrita [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Ivanova, Natalia N. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Kyrpides, Nikos C. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program]

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  6. IMG 4 version of the integrated microbial genomes comparative analysis system.

    Science.gov (United States)

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  7. [The Mycobacterium leprae genome: from sequence analysis to therapeutic implications].

    Science.gov (United States)

    Honore, N

    2002-01-01

    The genome of Mycobacterium leprae, the causative agent of leprosy, was analyzed by rapid sequencing of cosmids and plasmids prepared from DNA isolated from one patient's strain. Results showed that the bacillus possesses a single circular chromosome that differs from other known mycobacterium chromosomes with regard to size (3.2 Mb) and G + C content (57.8%). Computer analysis demonstrated that only half of the sequence contains protein-coding genes. The other half contains pseudogenes and non-coding sequences. These findings indicate that M. leprae has undergone a major reductive evolution leaving a minimal set of functional genes for survival. Study of the coding region of the sequence provides evidence accounting for the particular pathogenic properties of M. leprae which is an obligate intracellular parasite. Disappearance of numerous enzymatic pathways in comparison with M. tuberculosis, an intracellular pathogen comparable to M. leprae, could explain the differences observed between the two organisms. Genomic analysis of the leprosy bacillus also provided insight into the molecular basis for resistance to various antibiotics and allowed identification of several potential targets for new drug treatments.

  8. Improving livestock for agriculture - technological progress from random transgenesis to precision genome editing heralds a new era.

    Science.gov (United States)

    Laible, Götz; Wei, Jingwei; Wagner, Stefan

    2015-01-01

    Humans have a long history in shaping the genetic makeup of livestock to optimize production and meet growing human demands for food and other animal products. Until recently, this has only been possible through traditional breeding and selection, which is a painstakingly slow process of accumulating incremental gains over a long period. The development of transgenic livestock technology offers a more direct approach with the possibility for making genetic improvements with greater impact and within a single generation. However, initially the technology was hampered by technical difficulties and limitations, which have now largely been overcome by progressive improvements over the past 30 years. Particularly, the advent of genome editing in combination with homologous recombination has added a new level of efficiency and precision that holds much promise for the genetic improvement of livestock using the increasing knowledge of the phenotypic impact of genetic sequence variants. So far not a single line of transgenic livestock has gained approval for commercialization. The step change to genome-edited livestock with precise sequence changes may accelerate the path to market, provided applications of this new technology for agriculture can deliver, in addition to economic incentives for producers, also compelling benefits for animals, consumers, and the environment.

  9. Listeria Genomics

    Science.gov (United States)

    Cabanes, Didier; Sousa, Sandra; Cossart, Pascale

    The opportunistic intracellular foodborne pathogen Listeria monocytogenes has become a paradigm for the study of host-pathogen interactions and bacterial adaptation to mammalian hosts. Analysis of L. monocytogenes infection has provided considerable insight into how bacteria invade cells, move intracellularly, and disseminate in tissues, as well as tools to address fundamental processes in cell biology. Moreover, the vast amount of knowledge that has been gathered through in-depth comparative genomic analyses and in vivo studies makes L. monocytogenes one of the most well-studied bacterial pathogens. This chapter provides an overview of progress in the exploration of genomic, transcriptomic, and proteomic data in Listeria spp. to understand genome evolution and diversity, as well as physiological aspects of metabolism used by bacteria when growing in diverse environments, in particular in infected hosts.

  10. Analysis of dinucleotide signatures in HIV-1 subtype B genomes

    Indian Academy of Sciences (India)

    Aridaman Pandit; Jyothirmayi Vadlamudi; Somdatta Sinha

    2013-12-01

    Dinucleotide usage is known to vary in the genomes of organisms. The dinucleotide usage profiles, or genome signatures, are similar for sequence samples taken from the same genome but differ between taxonomically distant species. This concept of genome signatures has been used to study several organisms, including viruses, to elucidate the signatures of evolutionary processes at the genome level. Genome signatures assume greater importance in the case of host–pathogen interactions, where molecular interactions between the two species take place continuously and can influence their genomic composition. In this study, whole genome sequences of HIV-1 subtype B, a retrovirus that caused the global AIDS pandemic, were analysed for variation in the genome signatures of the virus from 1983 to 2007. We show statistically significant temporal variations in some dinucleotide patterns, highlighting the selective evolution of the dinucleotide profiles of HIV-1 subtype B, possibly a consequence of host-specific selection.
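
    As an illustration of what a dinucleotide usage profile can look like in practice, the sketch below computes an odds-ratio signature, f_xy / (f_x * f_y), for all 16 dinucleotides. The abstract does not say which statistic the authors used; the odds-ratio form, the function names and the distance measure are assumptions made here for illustration only.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def dinucleotide_signature(seq):
    """Return an odds ratio f_xy / (f_x * f_y) for each of the 16 dinucleotides.

    Assumption: this odds-ratio form is one common way to express a genome
    signature; it is not necessarily the statistic used in the study above.
    """
    seq = seq.upper()
    # Count single bases, ignoring ambiguous symbols such as N
    mono = Counter(b for b in seq if b in BASES)
    # Count overlapping dinucleotides made only of unambiguous bases
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    signature = {}
    for x, y in product(BASES, repeat=2):
        if n_di == 0 or mono[x] == 0 or mono[y] == 0:
            signature[x + y] = 0.0  # undefined ratio reported as 0.0
            continue
        f_xy = di[x + y] / n_di
        f_x = mono[x] / n_mono
        f_y = mono[y] / n_mono
        signature[x + y] = f_xy / (f_x * f_y)
    return signature

def signature_distance(a, b):
    """Mean absolute difference between two signatures (hypothetical helper)."""
    return sum(abs(a[k] - b[k]) for k in a) / len(a)
```

    Comparing signatures of sequences sampled in different years with a helper such as `signature_distance` would be one simple way to look for temporal drift; any statistical testing would have to be added separately.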

  11. A genome-wide 20 K citrus microarray for gene expression analysis

    OpenAIRE

    Gadea Jose; Forment Javier; Santiago Julia; Marques M Carmen; Juarez Jose; Mauri Nuria; Martinez-Godoy M Angeles

    2008-01-01

    Abstract Background Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with high genome coverage. In recent years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results We have designed and constructed a publicly available genome-...

  12. A genome-wide 20 K citrus microarray for gene expression analysis

    OpenAIRE

    Martinez-Godoy, M Angeles; Mauri, Nuria; Juarez, Jose; Marques, M Carmen; Santiago, Julia; Forment, Javier; Gadea, Jose

    2008-01-01

    Background Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with high genome coverage. In recent years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results We have designed and constructed a publicly available genome-wide cDNA...

  13. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignments. The framework integrates the information derived from multiple sequence alignment and a phylogenetic tree (hypothetical tree of evolution) to derive new properties of sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations are described to manipulate a multiple sequence alignment and to derive mutation-related information from a phylogenetic tree by superimposing parsimony analysis. The framework has been applied to protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
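
    The idea of a constrained column can be sketched in a few lines. The example below is written in Python rather than the Prolog of the original toolkit, ignores the phylogenetic tree and parsimony step entirely, and uses a simplified residue grouping; the grouping and the function name are assumptions for illustration, not the paper's definitions.

```python
# Simplified, hypothetical residue property classes (an assumption for
# illustration; the paper's own property definitions are not given above).
PROPERTY_CLASSES = {
    "hydrophobic": set("AVLIMFW"),
    "polar": set("STNQYC"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "special": set("GP"),
}

def constrained_columns(alignment):
    """Return (column index, property) pairs for columns whose residues
    differ between sequences but all belong to one property class.

    Columns containing a gap ('-') and fully identical columns are skipped.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for col in range(length):
        residues = {row[col].upper() for row in alignment}
        if "-" in residues or len(residues) < 2:
            continue
        for name, members in PROPERTY_CLASSES.items():
            if residues <= members:  # every residue shares this property
                result.append((col, name))
                break
    return result
```

    For the toy alignment `["AKDG", "VRDS", "LKE-"]` the first three columns mutate within a single class each, while the last column is skipped because it contains a gap.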

  14. Analysis Of Segmental Duplications In The Pig Genome Based On Next-Generation Sequencing

    DEFF Research Database (Denmark)

    Fadista, João; Bendixen, Christian

    extensively studied in other organisms, its analysis in pig has been hampered by the lack of a complete pig genome assembly. By measuring the depth of coverage of Illumina whole-genome shotgun sequencing reads of the Tabasco animal aligned to the latest pig genome assembly (Sus scrofa 10 – based also...... on Tabasco), led us to the detection of a high-resolution map of segmental duplications in the pig genome. Comparing these segments with four other Duroc animals sequenced at our institute, supplied the resources needed to describe the first genome-wide and systematic analysis of segmental duplications...
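
    A depth-of-coverage screen of the kind mentioned above can be sketched as follows. The abstract gives no thresholds or window sizes, so the cutoff (mean plus a multiple of the standard deviation), the fixed window size and the function name are all assumptions; this is an illustrative outline, not the authors' pipeline.

```python
from statistics import mean, pstdev

def candidate_duplications(depths, window_size, n_sd=3.0):
    """Flag runs of windows with unusually high read depth.

    depths: mean read depth per consecutive, non-overlapping window.
    window_size: window length in base pairs.
    n_sd: hypothetical cutoff, in standard deviations above the mean.
    Returns a list of (start_bp, end_bp) half-open intervals.
    """
    if not depths:
        return []
    threshold = mean(depths) + n_sd * pstdev(depths)
    segments = []
    start = None
    for i, depth in enumerate(depths):
        if depth > threshold:
            if start is None:
                start = i  # open a new high-depth run
        elif start is not None:
            segments.append((start * window_size, i * window_size))
            start = None
    if start is not None:  # close a run that reaches the end
        segments.append((start * window_size, len(depths) * window_size))
    return segments
```

    Real analyses typically also correct depth for GC content and mask repeats before thresholding; those steps are omitted here.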

  15. Genome-Wide Analysis Reveals Coating of the Mitochondrial Genome by TFAM

    OpenAIRE

    Wang, Yun E.; Marinov, Georgi K.; Wold, Barbara J.; Chan, David C.

    2013-01-01

    Mitochondria contain a 16.6 kb circular genome encoding 13 proteins as well as mitochondrial tRNAs and rRNAs. Copies of the genome are organized into nucleoids containing both DNA and proteins, including the machinery required for mtDNA replication and transcription. The transcription factor TFAM is critical for initiation of transcription and replication of the genome, and is also thought to perform a packaging function. Although specific binding sites required for initiation of transcriptio...

  16. Transcriptome, methylome and genomic variations analysis of ectopic thyroid glands.

    Directory of Open Access Journals (Sweden)

    Rasha Abu-Khudir

    Full Text Available BACKGROUND: Congenital hypothyroidism from thyroid dysgenesis (CHTD) is predominantly a sporadic disease characterized by defects in the differentiation, migration or growth of thyroid tissue. Of these defects, incomplete migration resulting in ectopic thyroid tissue is the most common (up to 80%). Germinal mutations in the thyroid-related transcription factors NKX2.1, FOXE1, PAX-8, and NKX2.5 have been identified in only 3% of patients with sporadic CHTD. Moreover, a survey of monozygotic twins yielded a discordance rate of 92%, suggesting that somatic events, genetic or epigenetic, probably play an important role in the etiology of CHTD. METHODOLOGY/PRINCIPAL FINDINGS: To assess the role of somatic genetic or epigenetic processes in CHTD, we analyzed gene expression, genome-wide methylation, and structural genome variations in normal versus ectopic thyroid tissue. In total, 1011 genes were more than two-fold induced or repressed. The expression array results were validated by quantitative real-time RT-PCR for 100 genes. After correction for differences in thyroid activation state, 19 genes were exclusively associated with thyroid ectopy, among which genes involved in embryonic development (e.g. TXNIP) and in the Wnt pathway (e.g. SFRP2 and FRZB) were observed. None of the thyroid-related transcription factors (FOXE1, HHEX, NKX2.1, NKX2.5) showed decreased expression, whereas PAX8 expression was associated with thyroid activation state. Finally, the expression profile was independent of promoter and CpG island methylation and of structural genome variations. CONCLUSIONS/SIGNIFICANCE: This is the first integrative molecular analysis of ectopic thyroid tissue. Ectopic thyroids show differential gene expression compared to normal thyroids, although the molecular basis could not be defined. Replication of this pilot study on a larger cohort could lead to unraveling the elusive cause of defective thyroid migration during embryogenesis.
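
    The "more than two-fold induced or repressed" criterion can be made concrete with a short sketch. The input format (one linear-scale expression value per gene and condition) and the function name are assumptions for illustration; the study's normalization and statistics are not described in the abstract and are not reproduced here.

```python
import math

def differential_genes(normal, ectopic, min_fold=2.0):
    """Return {gene: log2 fold change} for genes changed more than min_fold.

    normal, ectopic: dicts of gene -> positive, linear-scale expression value
    (a hypothetical input format). Positive log2 values mean induction in
    ectopic tissue, negative values mean repression.
    """
    cutoff = math.log2(min_fold)
    changed = {}
    for gene in normal.keys() & ectopic.keys():
        log2_fc = math.log2(ectopic[gene] / normal[gene])
        if abs(log2_fc) > cutoff:
            changed[gene] = log2_fc
    return changed
```

    Working on the log2 scale treats induction and repression symmetrically: a 2.5-fold increase and a 2.5-fold decrease give values of equal magnitude and opposite sign.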

  17. NMD Microarray Analysis for Rapid Genome-Wide Screen of Mutated Genes in Cancer

    Directory of Open Access Journals (Sweden)

    Maija Wolf

    2005-01-01

    Full Text Available Gene mutations play a critical role in cancer development and progression, and their identification offers possibilities for accurate diagnostics and therapeutic targeting. Finding genes undergoing mutations is challenging and slow, even in the post-genomic era. A new approach was recently developed by Noensie and Dietz to prioritize and focus the search, making use of nonsense-mediated mRNA decay (NMD) inhibition and microarray analysis (NMD microarrays) in the identification of transcripts containing nonsense mutations. We combined NMD microarrays with array-based CGH (comparative genomic hybridization) in order to identify inactivation of tumor suppressor genes in cancer. Such a “mutatomics” screening of prostate cancer cell lines led to the identification of inactivating mutations in the EPHB2 gene. Up to 8% of metastatic uncultured prostate cancers also showed mutations of this gene, whose loss of function may confer loss of tissue architecture. NMD microarray analysis could turn out to be a powerful research method to identify novel mutated genes in cancer cell lines, providing targets that could then be further investigated for their clinical relevance and therapeutic potential.

  18. Analysis of the Complete Mitochondrial Genome Sequence of the Diploid Cotton Gossypium raimondii by Comparative Genomics Approaches

    Directory of Open Access Journals (Sweden)

    Changwei Bi

    2016-01-01

    Full Text Available Cotton is one of the most important economic crops, the primary source of natural fiber and an important protein source for animal feed. The complete nuclear and chloroplast (cp) genome sequences of G. raimondii are already available, but the mitochondrial genome sequence is not. Here, we assembled the complete mitochondrial (mt) DNA sequence of G. raimondii into a circular genome of 676,078 bp and performed comparative analyses with other higher plants. The genome contains 39 protein-coding genes, 6 rRNA genes, and 25 tRNA genes. We also identified four larger repeats (63.9 kb, 10.6 kb, 9.1 kb, and 2.5 kb) in this mt genome, which may be active in intramolecular recombination in the evolution of cotton. Strikingly, nearly all of the G. raimondii mt genome has been transferred to the nucleus on Chr1, and the transfer event must be very recent. Phylogenetic analysis reveals that G. raimondii, as a member of Malvaceae, is much closer to another cotton (G. barbadense) than to other rosids, and the clade formed by the two Gossypium species is sister to Brassicales. The G. raimondii mt genome may provide a crucial foundation for evolutionary analysis, molecular biology, and the study of cytoplasmic male sterility in cotton and other higher plants.

  19. Comparative genomic analysis of Vibrio parahaemolyticus: serotype conversion and virulence

    Directory of Open Access Journals (Sweden)

    Gil Ana I

    2011-06-01

    Full Text Available Abstract Background Vibrio parahaemolyticus is a common cause of foodborne disease. Beginning in 1996, a more virulent strain having serotype O3:K6 caused major outbreaks in India and other parts of the world, resulting in the emergence of a pandemic. Other serovariants of this strain emerged during its dissemination and, together with the original O3:K6, were termed strains of the pandemic clone. Two genomes, one of this virulent strain and one of a pre-pandemic strain, have been sequenced. We sequenced four additional genomes of V. parahaemolyticus in this study that were isolated from different geographical regions and time points. Comparative genomic analyses of six strains of V. parahaemolyticus isolated from Asia and Peru were performed in order to advance knowledge concerning the evolution of V. parahaemolyticus; specifically, the genetic changes contributing to serotype conversion and virulence. Two pre-pandemic strains and three pandemic strains, isolated from different geographical regions, were serotype O3:K6 with toxin profiles of either (tdh+, trh-) or (tdh-, trh+). The sixth pandemic strain sequenced in this study was serotype O4:K68. Results Genomic analyses revealed that the trh+ and tdh+ strains had different types of pathogenicity islands and mobile elements as well as major structural differences between the tdh pathogenicity islands of the pre-pandemic and pandemic strains. In addition, the results of single nucleotide polymorphism (SNP) analysis showed that 94% of the SNPs between O3:K6 and O4:K68 pandemic isolates were within a 141 kb region surrounding the O- and K-antigen-encoding gene clusters. The "core" genes of V. parahaemolyticus were also compared to those of V. cholerae and V. vulnificus, in order to delineate differences between these three pathogenic species. Approximately one-half (49-59%) of each species' core genes were conserved in all three species, and 14-24% of the core genes were species-specific and in different

  20. Analysis of chimpanzee history based on genome sequence alignments.

    Directory of Open Access Journals (Sweden)

    Jennifer L Caswell

    2008-04-01

    Full Text Available Population geneticists often study small numbers of carefully chosen loci, but it has become possible to obtain orders of magnitude more data from overlaps of genome sequences. Here, we generate tens of millions of base pairs of multiple sequence alignments from combinations of three western chimpanzees, three central chimpanzees, an eastern chimpanzee, a bonobo, a human, an orangutan, and a macaque. Analysis provides a more precise understanding of demographic history than was previously available. We show that bonobos and common chimpanzees were separated approximately 1,290,000 years ago, western and other common chimpanzees approximately 510,000 years ago, and eastern and central chimpanzees at least 50,000 years ago. We infer that the central chimpanzee population size increased by at least a factor of 4 since its separation from western chimpanzees, while the western chimpanzee effective population size decreased. Surprisingly, in about one percent of the genome, the genetic relationships between humans, chimpanzees, and bonobos appear to be different from the species relationships. We used PCR-based resequencing to confirm 11 regions where chimpanzees and bonobos are not most closely related. Study of such loci should provide information about the period of time 5-7 million years ago when the ancestors of humans separated from those of the chimpanzees.

  1. Delineation of Steroid-Degrading Microorganisms through Comparative Genomic Analysis

    Directory of Open Access Journals (Sweden)

    Lee H. Bergstrand

    2016-03-01

    Full Text Available Steroids are ubiquitous in natural environments and are a significant growth substrate for microorganisms. Microbial steroid metabolism is also important for some pathogens and for biotechnical applications. This study delineated the distribution of aerobic steroid catabolism pathways among over 8,000 microorganisms whose genomes are available in the NCBI RefSeq database. Combined analysis of bacterial, archaeal, and fungal genomes with both hidden Markov models and reciprocal BLAST identified 265 putative steroid degraders within only Actinobacteria and Proteobacteria, which mainly originated from soil, eukaryotic host, and aquatic environments. These bacteria include members of 17 genera not previously known to contain steroid degraders. A pathway for cholesterol degradation was conserved in many actinobacterial genera, particularly in members of the Corynebacterineae, and a pathway for cholate degradation was conserved in members of the genus Rhodococcus. A pathway for testosterone and, sometimes, cholate degradation had a patchy distribution among Proteobacteria. The steroid degradation genes tended to occur within large gene clusters. Growth experiments confirmed bioinformatic predictions of steroid metabolism capacity in nine bacterial strains. The results indicate there was a single ancestral 9,10-seco-steroid degradation pathway. Gene duplication, likely in a progenitor of Rhodococcus, later gave rise to a cholate degradation pathway. Proteobacteria and additional Actinobacteria subsequently obtained a cholate degradation pathway via horizontal gene transfer, in some cases facilitated by plasmids. Catabolism of steroids appears to be an important component of the ecological niches of broad groups of Actinobacteria and individual species of Proteobacteria.
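
    The reciprocal BLAST step mentioned above is commonly implemented as a reciprocal-best-hit filter, which the sketch below illustrates. It assumes the hits have already been parsed into (query, subject, bit score) tuples; that input format, the identifiers and the function names are assumptions for illustration, and the study's actual scoring cutoffs and hidden Markov model step are not shown.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples, a hypothetical
    pre-parsed form of tabular BLAST output.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(forward_hits, reverse_hits):
    """Return sorted (query, subject) pairs that are each other's best hit.

    forward_hits: reference proteins searched against a target genome.
    reverse_hits: target genome proteins searched back against the reference.
    """
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    return sorted(
        (query, subject)
        for query, subject in forward.items()
        if reverse.get(subject) == query
    )
```

    In the toy case where reference protein `q1` hits `g1` best and `g1` hits `q1` best in return, the pair is kept; a pair whose reverse search prefers a different reference protein is dropped, which helps separate likely orthologs from paralogs.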

  2. Preliminary analysis of the mitochondrial genome evolutionary pattern in primates

    Institute of Scientific and Technical Information of China (English)

    Liang ZHAO; Xingtao ZHANG; Xingkui TAO; Weiwei WANG; Ming LI

    2012-01-01

    Since the birth of molecular evolutionary analysis, primates have been a central focus of study and mitochondrial DNA is well suited to these endeavors because of its unique features. Surprisingly, to date no comprehensive evaluation of the nucleotide substitution patterns has been conducted on the mitochondrial genome of primates. Here, we analyzed the evolutionary patterns and evaluated selection and recombination in the mitochondrial genomes of 44 primate species downloaded from GenBank. The results revealed that a strong rate heterogeneity occurred among sites and genes in all comparisons. Likewise, an obvious decline in primate nucleotide diversity was noted in the subunit rRNAs and tRNAs as compared to the protein-coding genes. Within 13 protein-coding genes, the pattern of nonsynonymous divergence was similar to that of overall nucleotide divergence, while synonymous changes differed only for individual genes, indicating that the rate heterogeneity may result from the rate of change at nonsynonymous sites. Codon usage analysis revealed that there was intermediate codon usage bias in primate protein-coding genes, and supported the idea that GC mutation pressure might determine codon usage and that positive selection is not the driving force for the codon usage bias. Neutrality tests using site-specific positive selection from a Bayesian framework indicated no sites were under positive selection for any gene, consistent with near neutrality. Recombination tests based on the pairwise homoplasy test statistic supported complete linkage even for much older divergent primate species. Thus, with the exception of rate heterogeneity among mitochondrial genes, evaluating the validity of the assumed complete linkage and selective neutrality in primates prior to phylogenetic or phylogeographic analysis seems unnecessary.

  3. Preliminary analysis of the mitochondrial genome evolutionary pattern in primates.

    Science.gov (United States)

    Zhao, Liang; Zhang, Xingtao; Tao, Xingkui; Wang, Weiwei; Li, Ming

    2012-08-01

    Since the birth of molecular evolutionary analysis, primates have been a central focus of study and mitochondrial DNA is well suited to these endeavors because of its unique features. Surprisingly, to date no comprehensive evaluation of the nucleotide substitution patterns has been conducted on the mitochondrial genome of primates. Here, we analyzed the evolutionary patterns and evaluated selection and recombination in the mitochondrial genomes of 44 primate species downloaded from GenBank. The results revealed that a strong rate heterogeneity occurred among sites and genes in all comparisons. Likewise, an obvious decline in primate nucleotide diversity was noted in the subunit rRNAs and tRNAs as compared to the protein-coding genes. Within 13 protein-coding genes, the pattern of nonsynonymous divergence was similar to that of overall nucleotide divergence, while synonymous changes differed only for individual genes, indicating that the rate heterogeneity may result from the rate of change at nonsynonymous sites. Codon usage analysis revealed that there was intermediate codon usage bias in primate protein-coding genes, and supported the idea that GC mutation pressure might determine codon usage and that positive selection is not the driving force for the codon usage bias. Neutrality tests using site-specific positive selection from a Bayesian framework indicated no sites were under positive selection for any gene, consistent with near neutrality. Recombination tests based on the pairwise homoplasy test statistic supported complete linkage even for much older divergent primate species. Thus, with the exception of rate heterogeneity among mitochondrial genes, evaluating the validity of the assumed complete linkage and selective neutrality in primates prior to phylogenetic or phylogeographic analysis seems unnecessary.

  4. Genomic analysis of stress response against arsenic in Caenorhabditis elegans.

    Directory of Open Access Journals (Sweden)

    Surasri N Sahu

    Full Text Available Arsenic, a known human carcinogen, is widely distributed around the world and found in particularly high concentrations in certain regions including the Southwestern US, Eastern Europe, India, China, Taiwan and Mexico. Chronic arsenic poisoning affects millions of people worldwide and is associated with increased risk of many diseases including atherosclerosis, diabetes and cancer. In this study, we explored genome-level global responses to high and low levels of arsenic exposure in Caenorhabditis elegans using Affymetrix expression microarrays. This experimental design allows microarray analysis of dose-response relationships of global gene expression patterns. High-dose (0.03%) exposure caused stronger global gene expression changes than low-dose (0.003%) exposure, suggesting a positive dose-response correlation. Biological processes such as oxidative stress and iron metabolism, which were previously reported to be involved in arsenic toxicity in studies using cultured cells, experimental animals, and humans, were found to be affected in C. elegans. We performed genome-wide gene expression comparisons between our microarray data and publicly available C. elegans microarray datasets of cadmium exposure and of sediment exposure samples from the German rivers Rhine and Elbe. Bioinformatics analysis of arsenic-responsive regulatory networks was done using the FastMEDUSA program. FastMEDUSA analysis identified cancer-related genes, particularly genes associated with leukemia, such as dnj-11, which encodes a protein orthologous to the mammalian ZRF1/MIDA1/MPP11/DNAJC2 family of ribosome-associated molecular chaperones. We analyzed the protective functions of several of the identified genes using RNAi. Our study indicates that C. elegans could be a substitute model to study the mechanism of metal toxicity using high-throughput expression data and bioinformatics tools such as FastMEDUSA.

  5. 13C metabolic flux analysis at a genome-scale.

    Science.gov (United States)

    Gopalakrishnan, Saratram; Maranas, Costas D

    2015-11-01

    Metabolic models used in 13C metabolic flux analysis generally include a limited number of reactions, primarily from central metabolism. They typically omit degradation pathways, complete cofactor balances, and atom transition contributions for reactions outside central metabolism. This study addresses the impact on prediction fidelity of scaling up mapping models to genome scale. The core mapping model employed in this study accounts for 75 reactions and 65 metabolites, primarily from central metabolism. The genome-scale metabolic mapping model (GSMM; 697 reactions and 595 metabolites) is constructed from the iAF1260 model by eliminating reactions guaranteed not to carry flux based on growth and fermentation data for a minimal glucose growth medium. Labeling data for 17 amino acid fragments obtained from cells fed with glucose labeled at the second carbon were used to obtain fluxes and ranges. Metabolic fluxes and confidence intervals are estimated, for both core and genome-scale mapping models, by minimizing the sum of squared differences between predicted and experimentally measured labeling patterns using the EMU decomposition algorithm. Overall, we find that both the topology and the estimated values of the metabolic fluxes remain largely consistent between the core model and the GSMM. Stepping up to a genome-scale mapping model leads to wider flux inference ranges for 20 key reactions present in the core model. The glycolysis flux range doubles due to the possibility of active gluconeogenesis, the TCA flux range expands by 80% due to the availability of a bypass through arginine consistent with labeling data, and the transhydrogenase reaction flux is essentially unresolved due to the presence of as many as five routes for the inter-conversion of NADPH to NADH afforded by the genome-scale model. By globally accounting for ATP demands in the GSMM, the unused ATP decreased drastically, with the lower bound matching the maintenance ATP requirement. A non
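
    The flux estimation step described in this record can be written as a constrained least-squares problem. The formulation below is a generic sketch, not taken from the paper: the symbols (v for fluxes, S for the stoichiometric matrix, x for labeling measurements, sigma for measurement standard deviations, and the flux bounds) are assumptions introduced for illustration, and weighting each residual by sigma is a common convention that the abstract does not confirm.

```latex
\min_{v}\; \sum_{i=1}^{n}
  \left( \frac{x_i^{\mathrm{meas}} - x_i^{\mathrm{pred}}(v)}{\sigma_i} \right)^{2}
\quad \text{subject to} \quad
S\,v = 0, \qquad v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
```

    Under this reading, a flux range widens whenever several flux vectors v fit the labeling data almost equally well, which is consistent with the wider ranges reported when the genome-scale model adds alternative routes.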

  6. Comparative genomic analysis and phylogenetic position of Theileria equi

    Directory of Open Access Journals (Sweden)

    Kappmeyer Lowell S

    2012-11-01

    Background: Transmission of arthropod-borne apicomplexan parasites that cause disease and result in death or persistent infection represents a major challenge to global human and animal health. First described in 1901 as Piroplasma equi, this re-emergent apicomplexan parasite was renamed Babesia equi and subsequently Theileria equi, reflecting an uncertain taxonomy. Understanding mechanisms by which apicomplexan parasites evade immune or chemotherapeutic elimination is required for development of effective vaccines or chemotherapeutics. The continued risk of transmission of T. equi from clinically silent, persistently infected equids impedes the goal of returning the U.S. to non-endemic status. Therefore, comparative genomic analysis of T. equi was undertaken to: (1) identify genes contributing to immune evasion and persistence in equid hosts, (2) identify genes involved in PBMC infection biology, and (3) define the phylogenetic position of T. equi relative to sequenced apicomplexan parasites. Results: The known immunodominant proteins EMA1, 2 and 3 were discovered to belong to a ten-member gene family with a mean amino acid identity, in pairwise comparisons, of 39%. Importantly, the amino acid diversity of EMAs is distributed throughout the length of the proteins. Eight of the EMA genes were simultaneously transcribed. As the agents that cause bovine theileriosis infect and transform host cell PBMCs, we confirmed that T. equi infects equine PBMCs; however, there is no evidence of host cell transformation. Indeed, a number of genes identified as potential manipulators of the host cell phenotype are absent from the T. equi genome. Phylogenetic positioning of T. equi relative to seven apicomplexan parasites, using deduced amino acid sequences from 150 genes, placed it as a sister taxon to Theileria spp. Conclusions: The EMA family does not fit the paradigm for classical antigenic variation, and we propose a

  7. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    NARCIS (Netherlands)

    Tettelin, H; Masignani; Cieslewicz, MJ; Eisen, JA; Peterson, S; Paulsen, IT; Nelson, KE; Margarit; Read, TD; Madoff, LC; Beanan, MJ; Brinkac, LM; Daugherty, SC; DeBoy, RT; Durkin, AS; Kolonay, JF; Madupu, R; Lewis, MR; Radune, D; Fedorova, NB; Scanlan, D; Khouri, H; Mulligan, S; Carty, HA; Cline, RT; Van Aken, SE; Gill, J; Scarselli, M; Mora, M; Iacobini, ET; Brettoni, C; Galli, G; Mariani, M; Vegni, F; Maione, D; Rinaudo, D; Rappuoli, R; Telford, JL; Kasper, DL; Grandi, G; Fraser, CM

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the oth

  8. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae : Implications for the microbial "pan-genome"

    NARCIS (Netherlands)

    Tettelin, H; Masignani; Cieslewicz, MJ; Donati, C; Medini, D; Ward, NL; Angiuoli, SV; Crabtree, J; Jones, AL; Durkin, AS; DeBoy, RT; Davidsen, TM; Mora, M; Scarselli, M; Ros, IMY; Peterson, JD; Hauser, CR; Sundaram, JP; Nelson, WC; Madupu, R; Brinkac, LM; Dodson, RJ; Rosovitz, MJ; Sullivan, SA; Daugherty, SC; Haft, DH; Selengut, J; Gwinn, ML; Zhou, LW; Zafar, N; Khouri, H; Radune, D; Dimitrov, G; Watkins, K; O'Connor, KJB; Smith, S; Utterback, TR; White, O; Rubens, CE; Grandi, G; Madoff, LC; Kasper, DL; Telford, JL; Wessels, MR; Rappuoli, R; Fraser, CM

    2005-01-01

    The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and als

  11. Genome Sizes of Nine Insect Species Determined by Flow Cytometry and k-mer Analysis

    Science.gov (United States)

    He, Kang; Lin, Kejian; Wang, Guirong; Li, Fei

    2016-01-01

    The flow cytometry method was used to estimate the genome sizes of nine agriculturally important insects, including two coleopterans, five hemipterans, and two hymenopterans. Among these, the coleopteran Lissorhoptrus oryzophilus (Kuschel) had the largest genome, at 981 Mb. The average genome size was 504 Mb, suggesting that insects have a moderate-size genome. Compared with insects in other orders, hymenopterans had small genomes, averaging about 200 Mb. We found that the genome sizes of four insect species differed between males and females, showing the organismal complexity of insects. The largest difference occurred in the coconut leaf beetle Brontispa longissima (Gestro). The male coconut leaf beetle had a genome 111 Mb larger than that of females, which might be due to the chromosome number difference between the sexes. The results indicated that insect invasiveness was not related to genome size. We also determined the genome sizes of the small brown planthopper Laodelphax striatellus (Fallén) and the parasitic wasp Macrocentrus cingulum (Brischke) using k-mer analysis with Illumina Solexa sequencing data. There were slight differences in the results from the two methods. k-mer analysis indicated that the genome size of L. striatellus was 500–700 Mb and that of M. cingulum was ~150 Mb. In all, the genome size information presented here should be helpful for designing genome sequencing strategies when necessary. PMID:27932995
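
    The record does not say how the k-mer estimate was computed. A common approach, sketched here as an assumption rather than as the authors' pipeline, divides the total number of k-mers observed in the reads by the depth at the main peak of the k-mer frequency histogram. The function names and the `min_depth` cutoff are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (length-k substring) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(reads, k, min_depth=2):
    """Estimate genome size as (total k-mers) / (peak k-mer depth).

    min_depth is an illustrative cutoff: in real data, k-mers seen only
    once are mostly sequencing errors and are excluded from the estimate.
    """
    counts = count_kmers(reads, k)
    # Histogram: depth -> number of distinct k-mers observed at that depth.
    histogram = Counter(counts.values())
    usable = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    # The most common depth approximates the sequencing depth of unique k-mers.
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth
```

    Real datasets need a dedicated k-mer counter and a model for heterozygosity and repeats; this sketch assumes error-free reads and a single clean peak.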

  12. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    Energy Technology Data Exchange (ETDEWEB)

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  13. Research study on analysis/use technologies of genome information; Genome joho kaidoku riyo gijutsu no chosa kenkyu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-03-01

    To support wide use of genome information in industry, the required R&D was surveyed from the standpoints of biology and information science. To clarify the present state and issues of international research on genome analysis, the genome map as well as sequence and function information were first surveyed. The current analysis/use technologies of genome information are analyzed, and the following are summarized: prediction and identification of gene regions in genome sequences, techniques for searching and selecting useful genes, and techniques for predicting the expression of gene functions and the structure and functions of gene products. It is recommended that the R&D and data collection/interpretation necessary to clarify inter-gene interactions and information networks be promoted by integrating Japanese advanced know-how and technologies. As examples of the impact of the research results on industry and society, the present state and expected future effects are summarized for medicines, diagnosis/analysis instruments, chemicals, foods, agriculture, fishery, animal husbandry, electronics, environment and information. 278 refs., 42 figs., 5 tabs.

  14. Genomic analysis of smoothened inhibitor resistance in basal cell carcinoma.

    Science.gov (United States)

    Sharpe, Hayley J; Pau, Gregoire; Dijkgraaf, Gerrit J; Basset-Seguin, Nicole; Modrusan, Zora; Januario, Thomas; Tsui, Vickie; Durham, Alison B; Dlugosz, Andrzej A; Haverty, Peter M; Bourgon, Richard; Tang, Jean Y; Sarin, Kavita Y; Dirix, Luc; Fisher, David C; Rudin, Charles M; Sofen, Howard; Migden, Michael R; Yauch, Robert L; de Sauvage, Frederic J

    2015-03-09

    Smoothened (SMO) inhibitors are under clinical investigation for the treatment of several cancers. Vismodegib is approved for the treatment of locally advanced and metastatic basal cell carcinoma (BCC). Most BCC patients experience significant clinical benefit on vismodegib, but some develop resistance. Genomic analysis of tumor biopsies revealed that vismodegib resistance is associated with Hedgehog (Hh) pathway reactivation, predominantly through mutation of the drug target SMO and to a lesser extent through concurrent copy number changes in SUFU and GLI2. SMO mutations either directly impaired drug binding or activated SMO to varying levels. Furthermore, we found evidence for intra-tumor heterogeneity, suggesting that a combination of therapies targeting components at multiple levels of the Hh pathway is required to overcome resistance.

  15. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    Science.gov (United States)

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. We then compared these strains to 22 streptomycete strains downloaded from the NCBI. The thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and a multiple-source subgroup based on the phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, in which lateral gene transfer is an important driving force. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis of marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038
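
    The "open pan-genome" finding in this record can be illustrated with a small calculation. The sketch below is not the authors' method: it assumes each strain has already been reduced to a set of gene-family identifiers, and the function names are hypothetical. The pan-genome is the union of those sets, the core genome is their intersection, and a pan-genome is called open when the union keeps growing as strains are added.

```python
def pan_and_core(genomes):
    """Return (pan-genome, core genome) for a non-empty dict of strain -> set of gene families."""
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)            # families present in at least one strain
    core = set.intersection(*gene_sets)      # families present in every strain
    return pan, core


def pan_genome_curve(genomes, order):
    """Pan-genome size after adding each strain in the given order."""
    seen, sizes = set(), []
    for strain in order:
        seen |= genomes[strain]
        sizes.append(len(seen))
    return sizes
```

    In practice the curve is averaged over many random strain orders and fitted to a growth model before calling a pan-genome open or closed; a single order is shown here only to keep the example short.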

  16. Ethical considerations of research policy for personal genome analysis: the approach of the Genome Science Project in Japan.

    Science.gov (United States)

    Minari, Jusaku; Shirai, Tetsuya; Kato, Kazuto

    2014-12-01

    As evidenced by high-throughput sequencers, genomic technologies have recently undergone radical advances. These technologies enable comprehensive sequencing of personal genomes considerably more efficiently and less expensively than before. These developments present a challenge to the conventional framework of biomedical ethics; under these changing circumstances, each research project has to develop a pragmatic research policy. Based on experience with a new large-scale project, the Genome Science Project, this article presents a novel approach to developing a specific policy for personal genome research in the Japanese context. In creating an original informed-consent form template for the project, we present a two-tiered process: drafting the template following an analysis of national and international policies, and then refining the draft in conjunction with genome project researchers for practical application. Through practical use of the template, we have gained valuable experience in addressing challenges in the ethical review process, such as the importance of sharing details of the latest developments in genomics with members of research ethics committees. We discuss certain limitations of the conventional concept of informed consent and its governance system and suggest the potential of an alternative process using information technology.

  17. Comparative genomics analysis of rice and pineapple contributes to understand the chromosome number reduction and genomic changes in grasses

    Directory of Open Access Journals (Sweden)

    Jinpeng Wang

    2016-10-01

    Rice is one of the most researched model plants, and its genome structure most closely resembles that of the grass common ancestor after a grass-common tetraploidization ~100 million years ago. There has been a standing controversy over whether there were 5 or 7 basic chromosomes before the tetraploidization; the question has been tackled but could not be resolved for lack of a sequenced and assembled outgroup plant with a conservative genome structure. The recent availability of the pineapple genome, which has not been subjected to the grass-common tetraploidization, provides a precious opportunity to resolve this controversy and to investigate genome changes in rice and other grasses. Here, we performed a comparative genomics analysis of pineapple and rice, and found solid evidence that the grass common ancestor had 2n = 2x = 14 basic chromosomes before the tetraploidization, duplicated to 2n = 4x = 28 after the event. Moreover, we propose that the enormous number of genes missing from duplicated regions in rice is better explained by an allotetraploid produced by prominently divergent parental lines than by gene losses after their divergence. This means that genome fractionation might have occurred before the formation of the allotetraploid grass ancestor.

  19. Exploring a Nonmodel Teleost Genome Through RAD Sequencing-Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis.

    Science.gov (United States)

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C; Tsigenopoulos, Costas S

    2015-12-29

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts.

  20. Exploring a Nonmodel Teleost Genome Through RAD Sequencing—Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis

    Directory of Open Access Journals (Sweden)

    Tereza Manousaki

    2016-03-01

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts.

  1. Exploring a Nonmodel Teleost Genome Through RAD Sequencing—Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis

    Science.gov (United States)

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B.; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C.; Tsigenopoulos, Costas S.

    2015-01-01

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts. PMID:26715088

  2. Functional Analysis of Shewanella, a cross genome comparison.

    Energy Technology Data Exchange (ETDEWEB)

    Serres, Margrethe H.

    2009-05-15

    The bacterial genus Shewanella includes a group of highly versatile organisms that have successfully adapted to life in many environments ranging from aquatic (fresh and marine) to sedimentary (lake and marine sediments, subsurface sediments, sea vent). A unique respiratory capability of the Shewanellas, initially observed for Shewanella oneidensis MR-1, is the ability to use metals and metalloids, including radioactive compounds, as electron acceptors. Members of the Shewanella genus have also been shown to degrade environmental pollutants, i.e. halogenated compounds, making this group highly applicable for the DOE mission. S. oneidensis MR-1 has in addition been found to utilize a diverse set of nutrients and to have a large set of genes dedicated to regulation and to sensing of the environment. The sequencing of the S. oneidensis MR-1 genome facilitated experimental and bioinformatics analyses by a group of collaborating researchers, the Shewanella Federation. Through this joint effort and with support from the Department of Energy, S. oneidensis MR-1 has become a model organism of study. Our work has been a functional analysis of S. oneidensis MR-1, both by itself and as part of a comparative study. We have improved the annotation of gene products, assigned metabolic functions, and analyzed protein families present in S. oneidensis MR-1. The data have been applied to the analysis of experimental data (i.e. gene expression, proteome) generated for S. oneidensis MR-1. Further, this work has formed the basis for a comparative study of over 20 members of the Shewanella genus. The species and strains selected for genome sequencing represented an evolutionary gradient of DNA relatedness, ranging from close to intermediate to distant. The organisms selected have also adapted to a variety of ecological niches. Through our work we have been able to detect and interpret genome similarities and differences between members of the genus. We have in this way contributed to the

  3. Identification of conserved regulatory elements by comparative genome analysis

    Directory of Open Access Journals (Sweden)

    Jareborg Niclas

    2003-05-01

    Background: For genes that have been successfully delineated within the human genome sequence, most regulatory sequences remain to be elucidated. The annotation and interpretation process requires additional data resources and significant improvements in computational methods for the detection of regulatory regions. One approach of growing popularity is based on the preferential conservation of functional sequences over the course of evolution by selective pressure, termed 'phylogenetic footprinting'. Mutations are more likely to be disruptive if they appear in functional sites, resulting in a measurable difference in evolution rates between functional and non-functional genomic segments. Results: We have devised a flexible suite of methods for the identification and visualization of conserved transcription-factor-binding sites. The system reports those putative transcription-factor-binding sites that are both situated in conserved regions and located as pairs of sites in equivalent positions in alignments between two orthologous sequences. An underlying collection of metazoan transcription-factor-binding profiles was assembled to facilitate the study. This approach results in a significant improvement in the detection of transcription-factor-binding sites because of an increased signal-to-noise ratio, as demonstrated with two sets of promoter sequences. The method is implemented as a graphical web application, ConSite, which is at the disposal of the scientific community at http://www.phylofoot.org/. Conclusions: Phylogenetic footprinting dramatically improves the predictive selectivity of bioinformatic approaches to the analysis of promoter sequences. ConSite delivers unparalleled performance using a novel database of high-quality binding models for metazoan transcription factors. With a dynamic interface, this bioinformatics tool provides broad access to promoter analysis with phylogenetic footprinting.
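
    The filtering rule described in the Results (keep a predicted site only if it lies in a conserved region and occurs at the equivalent position in both orthologous sequences) can be sketched as follows. This is an illustration under stated assumptions, not ConSite's implementation: the input is a pre-computed pairwise alignment with '-' for gaps, sites are found by exact string match instead of the binding profiles the record mentions, and the function names, window size and identity threshold are all hypothetical.

```python
def conserved_columns(aln_a, aln_b, window=10, min_identity=0.8):
    """Mark alignment columns covered by at least one high-identity window."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(aln_a)
    # A column matches when both sequences carry the same non-gap base.
    match = [a == b and a != "-" for a, b in zip(aln_a, aln_b)]
    conserved = [False] * n
    for start in range(n - window + 1):
        if sum(match[start:start + window]) / window >= min_identity:
            for i in range(start, start + window):
                conserved[i] = True
    return conserved


def footprinted_sites(aln_a, aln_b, motif, window=10, min_identity=0.8):
    """Alignment columns where the motif occurs in both sequences inside a conserved region."""
    conserved = conserved_columns(aln_a, aln_b, window, min_identity)
    m = len(motif)
    hits = []
    for i in range(len(aln_a) - m + 1):
        in_both = aln_a[i:i + m] == motif and aln_b[i:i + m] == motif
        if in_both and all(conserved[i:i + m]):
            hits.append(i)
    return hits
```

    A motif occurrence that appears in only one sequence, or that falls in a poorly conserved stretch, is discarded, which is the source of the improved signal-to-noise ratio the abstract describes.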

  4. SIGMA2: A system for the integrative genomic multi-dimensional analysis of cancer genomes, epigenomes, and transcriptomes

    Directory of Open Access Journals (Sweden)

    MacAulay Calum

    2008-10-01

    Background: High throughput microarray technologies have enabled the investigation of genomes, epigenomes, and transcriptomes at unprecedented resolution. However, software packages to handle, analyze, and visualize data from these multiple 'omics disciplines have not been adequately developed. Results: Here, we present SIGMA2, a system for the integrative genomic multi-dimensional analysis of cancer genomes, epigenomes, and transcriptomes. Multi-dimensional datasets can be simultaneously visualized and analyzed with respect to each dimension, allowing combinatorial integration of the different assays belonging to the different 'omics. Conclusion: The identification of genes altered at multiple levels such as copy number, loss of heterozygosity (LOH), and DNA methylation, and the detection of consequential changes in gene expression, can be performed in concert, establishing SIGMA2 as a novel tool to facilitate the high throughput systems biology analysis of cancer.

  5. Progress on Materials Genome Technology

    Institute of Scientific and Technical Information of China (English)

    向勇; 闫宗楷; 朱焱麟; 张晓琨

    2016-01-01

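    Materials genome is an emerging technology to accelerate materials discovery, development, and deployment. In the past two decades, high-throughput materials experimentation tools have been developed and applied successfully to the discovery of a number of materials, ranging from advanced catalysts, dielectrics, and electrodes to high-temperature alloys. Materials computation and database technologies have also made remarkable progress, particularly represented by the integrated computational materials engineering (ICME) developed in the past decade. Materials genome research integrates high-throughput computation and simulation, high-throughput experimentation, and materials databases throughout the materials discovery-to-deployment process, aiming to cut materials development time and cost significantly. This review gives a brief introduction to materials genome technologies, with emphasis on high-throughput materials experimentation as well as applications of materials computation and databases. The University of Electronic Science and Technology of China is one of the most active institutes in China in the field of materials genome research, and some of its progress is also highlighted in this review. [Translated from the Chinese abstract:] Materials genome technology is a new concept and method in materials research that has emerged in recent years and represents the current frontier of materials science and engineering worldwide. In essence, it integrates three components (high-throughput computational materials design, high-throughput materials experimentation, and materials databases) to build a collaborative innovation network for materials design and development and to accelerate the whole process from the discovery of new materials to their application. After more than 20 years of development, high-throughput materials experimentation has produced a series of success stories covering many material forms and service properties, and high-throughput computational simulation and materials databases have also made considerable progress in recent years. This article briefly reviews the main content and development history of materials genome technology and summarizes representative high-throughput experimental techniques, as well as high-throughput...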

  6. Sequencing and comparative genome analysis of two pathogenic Streptococcus gallolyticus subspecies: genome plasticity, adaptation and virulence.

    Directory of Open Access Journals (Sweden)

    I-Hsuan Lin

    Full Text Available Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations differ depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosomes of ATCC 43143 and ATCC 43144 are 2.36 and 2.10 Mb in length and encode 2246 and 1869 CDS, respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS were conserved in UCN34, respectively. There are around 600 CDS conserved in all Streptococcus genomes, indicating that the Streptococcus genus has a small core genome (constituting around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144, respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that S. gallolyticus still possesses genes making it suitable for a rumen environment, whereas the ability of S. pasteurianus to live in the rumen is reduced. The genome heterogeneity and genetic diversity between the two biotypes, especially in membrane proteins and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops.
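
    The percentages quoted in this abstract can be checked from the CDS counts it reports. The short sketch below only redoes that arithmetic; the variable names are illustrative, and the core-genome count of 600 is the abstract's own approximation ("around 600").

```python
# CDS counts as reported in the abstract.
cds_total = {"ATCC 43143": 2246, "ATCC 43144": 1869}
cds_in_ucn34 = {"ATCC 43143": 2073, "ATCC 43144": 1607}
core_cds = 600  # approximate count of CDS conserved across Streptococcus genomes


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Share of each strain's CDS that is conserved in UCN34.
conserved_pct = {s: percent(cds_in_ucn34[s], cds_total[s]) for s in cds_total}

# Share of each strain's CDS that belongs to the approximate genus core.
core_pct = {s: percent(core_cds, cds_total[s]) for s in cds_total}

for strain in cds_total:
    print(f"{strain}: {conserved_pct[strain]:.1f}% in UCN34, "
          f"{core_pct[strain]:.1f}% core")
```

    The conserved fractions round to the 92% and 86% given in the abstract, and the core fraction falls on either side of the quoted "around 30%" depending on which strain's CDS total is used as the denominator.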

  7. Whole genome sequence and comparative genomic sequence analysis of Helicoverpa armigera nucleopolyhedrovirus (HearNPV-L1) isolated from India.

    Science.gov (United States)

    Raghavendra, Ashika T; Jalali, Sushil K; Ojha, Rakshit; Shivalingaswamy, Timalapur M; Bhatnagar, Raj

    2017-03-01

    The whole genome of Helicoverpa armigera nucleopolyhedrovirus (HearNPV) from India, HearNPV-L1, was sequenced and analyzed to look for genes and/or nucleotide sequences that might be involved in the differences in virulence relative to HearNPVs sequenced from other countries, such as SP1A (Spain), NNg1 (Kenya) and G4 (China). The entire nucleotide sequence of the HearNPV-L1 genome was 136,740 bp in length, had a GC content of 39.19%, and contained 113 ORFs that could encode polypeptides of more than 50 amino acids (GenBank accession number KT013224). Two ORFs, ORF 18 (300 bp) and ORF 19 (401 bp), were unique to the HearNPV-L1 genome. Most of the HearNPV-L1 ORFs showed high similarity to those of the NNg1, SP1A and G4 genomes. The HearNPV-L1 genome contains five homologous regions (hr1-hr5), which were 84-100% similar to the hr regions of the NNg1, SP1A and G4 genomes. A total of four bro genes were observed in the HearNPV-L1 genome. The bro-a gene was 12 and 351 bp bigger than SP1A and G4 bro-a, respectively; bro-b was 15 bp bigger than SP1A and NNg1 bro-b but 593 bp shorter than G4 bro-b; and bro-c was 12 bp shorter than NNg1 bro-c and absent from the G4 genome. HearNPV-L1 bro-d was 100% homologous to bro-d of the SP1A, NNg1 and G4 genomes. The comparative analysis of the HearNPV-L1 genome indicated that there are several other putative genes and nucleotide sequences that may be responsible for insecticidal activity in the HearNPV-L1 isolate; however, further functional analysis of the hypothetical (putative) genes may help identify the genes that are crucial for virulence and insecticidal activity.

  8. Canine tumor cross-species genomics uncovers targets linked to osteosarcoma progression

    Directory of Open Access Journals (Sweden)

    Triche Timothy

    2009-12-01

    Full Text Available Abstract Background Pulmonary metastasis continues to be the most common cause of death in osteosarcoma. Indeed, the 5-year survival for newly diagnosed osteosarcoma patients has not significantly changed in over 20 years. Further understanding of the mechanisms of metastasis and resistance for this aggressive pediatric cancer is necessary. Pet dogs naturally develop osteosarcoma, providing a novel opportunity to model metastasis development and progression. Given the accelerated biology of canine osteosarcoma, we hypothesized that a direct comparison of canine and pediatric osteosarcoma expression profiles may help identify novel metastasis-associated tumor targets that have been missed through the study of the human cancer alone. Results Using parallel oligonucleotide array platforms, shared orthologues between species were identified and normalized. The osteosarcoma expression signatures could not distinguish the canine and human diseases by hierarchical clustering. Cross-species target mining identified two genes, interleukin-8 (IL-8) and solute carrier family 1 (glial high affinity glutamate transporter), member 3 (SLC1A3), which were uniformly expressed in dog but not in all pediatric osteosarcoma patient samples. Expression of these genes in an independent population of pediatric osteosarcoma patients was associated with poor outcome (p = 0.020 and p = 0.026, respectively). Validation of IL-8 and SLC1A3 protein expression in pediatric osteosarcoma tissues further supported the potential value of these novel targets. Ongoing evaluation will validate the biological significance of these targets and their associated pathways. Conclusions Collectively, these data support the strong similarities between human and canine osteosarcoma and underline the opportunities provided by a comparative oncology approach as a means to improve our understanding of cancer biology and therapies.

  9. Adhesive Characterization and Progressive Damage Analysis of Bonded Composite Joints

    Science.gov (United States)

    Girolamo, Donato; Davila, Carlos G.; Leone, Frank A.; Lin, Shih-Yung

    2014-01-01

    The results of an experimental/numerical campaign aimed to develop progressive damage analysis (PDA) tools for predicting the strength of a composite bonded joint under tensile loads are presented. The PDA is based on continuum damage mechanics (CDM) to account for intralaminar damage, and cohesive laws to account for interlaminar and adhesive damage. The adhesive response is characterized using standard fracture specimens and digital image correlation (DIC). The displacement fields measured by DIC are used to calculate the J-integrals, from which the associated cohesive laws of the structural adhesive can be derived. A finite element model of a sandwich conventional splice joint (CSJ) under tensile loads was developed. The simulations indicate that the model is capable of predicting the interactions of damage modes that lead to the failure of the joint.

  10. Advanced Cancer Genomics Institute: Genetic Signatures and Therapeutic Targets in Cancer Progression

    Science.gov (United States)

    2015-04-01

    the Mcm3 gene locus in mouse embryo fibroblasts (MEF) by David Goodrich. This analysis shows novel Mcm3 promoter sites that bind Rb. Fig. 3-5...4. Develop transplantable castration-recurrent prostate cancer (CR-CaP) models of human and mouse prostate cancer lines in which androgen receptor

  11. Genome-wide analysis of alternative splicing in Chlamydomonas reinhardtii

    Directory of Open Access Journals (Sweden)

    Thomas Julie

    2010-02-01

    Full Text Available Abstract Background Genome-wide computational analysis of alternative splicing (AS) in several flowering plants has revealed that pre-mRNAs from about 30% of genes undergo AS. Chlamydomonas, a simple unicellular green alga, is part of the lineage that includes land plants. However, it diverged from land plants about one billion years ago. Hence, it serves as a good model system to study alternative splicing in early photosynthetic eukaryotes, to obtain insights into the evolution of this process in plants, and to compare splicing in simple unicellular photosynthetic and non-photosynthetic eukaryotes. We performed a global analysis of alternative splicing in Chlamydomonas reinhardtii using its recently completed genome sequence and all available ESTs and cDNAs. Results Our analysis of AS using BLAT and a modified version of the Sircah tool revealed AS of 498 transcriptional units with 611 events, representing about 3% of the total number of genes. As in land plants, intron retention is the most prevalent form of AS. Retained introns and skipped exons tend to be shorter than their counterparts in constitutively spliced genes. The splice site signals in all types of AS events are weaker than those in constitutively spliced genes. Furthermore, in alternatively spliced genes, the prevalent splice form has a stronger splice site signal than the non-prevalent form. Analysis of constitutively spliced introns revealed an over-abundance of motifs with simple repetitive elements in comparison to introns involved in intron retention. In almost all cases, AS results in a truncated ORF, leading to a coding sequence that is around 50% shorter than the prevalent splice form. Using RT-PCR we verified AS of two genes and show that they produce more isoforms than indicated by EST data. All cDNA/EST alignments and splice graphs are provided in a website at http://combi.cs.colostate.edu/as/chlamy. Conclusions The extent of AS in Chlamydomonas that we observed is much

  12. Pattern Analysis and Decision Support for Cancer through Clinico-Genomic Profiles

    Science.gov (United States)

    Exarchos, Themis P.; Giannakeas, Nikolaos; Goletsis, Yorgos; Papaloukas, Costas; Fotiadis, Dimitrios I.

    Advances in genome technology are playing a growing role in medicine and healthcare. With the development of new technologies and opportunities for large-scale analysis of the genome, genomic data have a clear impact on medicine. Cancer prognostics and therapeutics are among the first major test cases for genomic medicine, given that all types of cancer are related to genomic instability. In this paper we present a novel system for pattern analysis and decision support in cancer. The system integrates clinical data from electronic health records with genomic data. Pattern analysis and data mining methods are applied to these integrated data and the discovered knowledge is used for cancer decision support. Through this integration, conclusions can be drawn for early diagnosis, staging and cancer treatment.

  13. Analysis of pan-genome content and its application in microbial identification

    DEFF Research Database (Denmark)

    Lukjancenko, Oksana

    of genomic data and use this to answer important biological questions. More specifically, comparison of prokaryotic proteomes is used to determine possible sets of functions essential to sustain microbial life; to extract and interpret similarities and variance in genomic content within different taxonomic...... analyses for the characterization of two Listeria monocytogenes strains. Chapter 4 describes the use of profile HMMs for comparative analysis based on sequence-based homology searches. Paper III introduces PanFunPro, a new, profile HMM-based method for pan-genome analysis. Paper IV illustrates...... the application of PanFunPro to a set of more than 2000 genomes; this paper aims to define a set of protein families that are conserved among all the genomes. Paper V demonstrates a comparative genomic analysis of proteomes belonging to the Vibrio genus. In the last project, described in Chapter 5, both BLAST...

  14. Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes.

    Science.gov (United States)

    Huang, Shengfeng; Chen, Zelin; Yan, Xinyu; Yu, Ting; Huang, Guangrui; Yan, Qingyu; Pontarotti, Pierre Antoine; Zhao, Hongchen; Li, Jie; Yang, Ping; Wang, Ruihua; Li, Rui; Tao, Xin; Deng, Ting; Wang, Yiquan; Li, Guang; Zhang, Qiujin; Zhou, Sisi; You, Leiming; Yuan, Shaochun; Fu, Yonggui; Wu, Fenfang; Dong, Meiling; Chen, Shangwu; Xu, Anlong

    2014-12-19

    Vertebrates diverged from other chordates ~500 Myr ago and experienced successful innovations and adaptations, but the genomic basis underlying vertebrate origins is not fully understood. Here we suggest, through comparison with multiple lancelet (amphioxus) genomes, that ancient vertebrates experienced high rates of protein evolution, genome rearrangement and domain shuffling and that these rates greatly slowed down after the divergence of jawed and jawless vertebrates. Compared with lancelets, modern vertebrates retain, at least relatively, less protein diversity and fewer nucleotide polymorphisms, domain combinations and conserved non-coding elements (CNE). Modern vertebrates also lost substantial transposable element (TE) diversity, whereas lancelets preserve high TE diversity that includes even the long-sought RAG transposon. Lancelets also exhibit rapid gene turnover, pervasive transcription, the fastest exon shuffling in metazoans and substantial TE methylation not observed in other invertebrates. These new lancelet genome sequences provide new insights into the chordate ancestral state and vertebrate evolution.

  15. Whole genome sequence analysis suggests intratumoral heterogeneity in dissemination of breast cancer to lymph nodes.

    Directory of Open Access Journals (Sweden)

    Kevin Blighe

    Full Text Available BACKGROUND: Intratumoral heterogeneity may help drive resistance to targeted therapies in cancer. In breast cancer, the presence of nodal metastases is a key indicator of poorer overall survival. The aim of this study was to identify somatic genetic alterations in early dissemination of breast cancer by whole genome next generation sequencing (NGS) of a primary breast tumor, a matched locally-involved axillary lymph node and healthy normal DNA from blood. METHODS: Whole genome NGS was performed on 12 µg (range 11.1-13.3 µg) of DNA isolated from fresh-frozen primary breast tumor, axillary lymph node and peripheral blood following the DNA nanoball sequencing protocol. Single nucleotide variants, insertions, deletions, and substitutions were identified through a bioinformatic pipeline and compared to CIN25, a key set of genes associated with tumor metastasis. RESULTS: Whole genome sequencing revealed overlapping variants between the tumor and node, but also variants that were unique to each. Novel mutations unique to the node included those found in two CIN25 targets, TGIF2 and CCNB2, which are related to transcription cyclin activity and chromosomal stability, respectively, and a unique frameshift in PDS5B, which is required for accurate sister chromatid segregation during cell division. We also identified dominant clonal variants that progressed from tumor to node, including SNVs in TP53 and ARAP3, which mediates rearrangements to the cytoskeleton and cell shape, and an insertion in TOP2A, the expression of which is significantly associated with tumor proliferation and can segregate breast cancers by outcome. CONCLUSION: This case study provides preliminary evidence that primary tumor and early nodal metastasis have largely overlapping somatic genetic alterations. There were very few mutations unique to the involved node. However, significant conclusions regarding early dissemination need analysis of a larger number of patient samples.

  16. Genomic and single nucleotide polymorphism analysis of infectious bronchitis coronavirus.

    Science.gov (United States)

    Abolnik, Celia

    2015-06-01

    Infectious bronchitis virus (IBV) is a Gammacoronavirus that causes a highly contagious respiratory disease in chickens. A QX-like strain was analysed by high-throughput Illumina sequencing and genetic variation across the entire viral genome was explored at the sub-consensus level by single nucleotide polymorphism (SNP) analysis. Thirteen open reading frames (ORFs) in the order 5'-UTR-1a-1ab-S-3a-3b-E-M-4b-4c-5a-5b-N-6b-3'UTR were predicted. The relative frequencies of missense:silent SNPs were calculated to obtain a comparative measure of variability in specific genes. The most variable ORFs in descending order were E, 3b, 5'UTR, N, 1a, S, 1ab, M, 4c, 5a, 6b. The E and 3b protein products play key roles in coronavirus virulence, and RNA folding demonstrated that the mutations in the 5'UTR did not alter the predicted secondary structure. The frequency of SNPs in the Spike (S) protein ORF of 0.67% was below the genomic average of 0.76%. Only three SNPs were identified in the S1 subunit, none of which were located in hypervariable region (HVR) 1 or HVR2. The S2 subunit was considerably more variable, containing 87% of the polymorphisms detected across the entire S protein. The S2 subunit also contained a previously unreported multi-A insertion site and a stretch of four consecutive mutated amino acids, which mapped to the stalk region of the spike protein. Template-based protein structure modelling produced the first theoretical model of the IBV spike monomer. Given the lack of diversity observed at the sub-consensus level, the tenet that the HVRs in the S1 subunit are very tolerant of amino acid changes produced by genetic drift is questioned.

  17. Cost analysis of whole genome sequencing in German clinical practice.

    Science.gov (United States)

    Plöthner, Marika; Frank, Martin; von der Schulenburg, J-Matthias Graf

    2017-06-01

    Whole genome sequencing (WGS) is an emerging tool in clinical diagnostics. However, little has been said about its procedure costs, owing to a dearth of related cost studies. This study helps fill this research gap by analyzing the execution costs of WGS within the setting of German clinical practice. First, to estimate costs, a sequencing process related to clinical practice was undertaken. Once relevant resources were identified, a quantification and monetary evaluation was conducted using data and information from expert interviews with clinical geneticists and with personnel at private enterprises and hospitals. This study focuses on identifying the costs associated with the standard sequencing process, and the procedure costs for a single WGS were analyzed on the basis of two sequencing platforms, namely HiSeq 2500 and HiSeq Xten, both by Illumina, Inc. In addition, sensitivity analyses were performed to assess the influence of various uses of sequencing platforms and various coverage values on a fixed-cost degression. In the base case scenario, which features 80% utilization and 30-times coverage, the cost of a single WGS analysis with the HiSeq 2500 was estimated at €3858.06. The cost of sequencing materials was estimated at €2848.08, personnel costs at €396.94, and acquisition/maintenance costs at €607.39. In comparison, the cost of sequencing with the latest technology (i.e., HiSeq Xten) was approximately 63% cheaper, at €1411.20. The estimated costs of WGS currently exceed the prediction of a 'US$1000 per genome' by more than a factor of 3.8. In particular, the material costs in themselves exceed this predicted cost.
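
    The relative figures in this abstract follow from the euro amounts it reports, as the short check below shows. The variable names are illustrative, and no exchange rate is applied, so the comparison with the US$1000 target is left out of the calculation.

```python
# Per-genome cost estimates (EUR) as reported in the abstract, base case scenario.
total_hiseq_2500 = 3858.06
total_hiseq_xten = 1411.20

# Cost components reported for the HiSeq 2500 estimate.
components = {
    "materials": 2848.08,
    "personnel": 396.94,
    "acquisition_maintenance": 607.39,
}

# Relative saving of the newer platform versus the HiSeq 2500.
saving = 1 - total_hiseq_xten / total_hiseq_2500

# Share of the HiSeq 2500 total that is spent on sequencing materials.
material_share = components["materials"] / total_hiseq_2500

# Sum of the three components listed in the abstract.
itemized_sum = sum(components.values())

print(f"saving: {saving:.1%}")
print(f"materials share: {material_share:.1%}")
print(f"itemized sum: {itemized_sum:.2f} EUR")
```

    The saving comes out at roughly 63%, matching the abstract, and materials account for roughly three quarters of the HiSeq 2500 estimate. The three listed components add up to slightly less than the reported total; the abstract does not itemize the remainder.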

  18. Genomic risk profiling of ischemic stroke: results of an international genome-wide association meta-analysis.

    Directory of Open Access Journals (Sweden)

    James F Meschia

    Full Text Available INTRODUCTION: Familial aggregation of ischemic stroke derives from shared genetic and environmental factors. We present a meta-analysis of genome-wide association scans (GWAS) from 3 cohorts to identify the contribution of common variants to ischemic stroke risk. METHODS: This study involved 1464 ischemic stroke cases and 1932 controls. Cases were genotyped using the Illumina 610 or 660 genotyping arrays; controls, with Illumina HumanHap 550Kv1 or 550Kv3 genotyping arrays. Imputation was performed with the 1000 Genomes European ancestry haplotypes (August 2010 release) as a reference. A total of 5,156,597 single-nucleotide polymorphisms (SNPs) were incorporated into the fixed effects meta-analysis. All SNPs associated with ischemic stroke (P<1×10^-5) were incorporated into a multivariate risk profile model. RESULTS: No SNP reached genome-wide significance for ischemic stroke (P<5×10^-8). Secondary analysis identified a significant cumulative effect for age at onset of stroke (first versus fifth quintile of cumulative profiles based on SNPs associated with late onset, β = 14.77 [10.85, 18.68], P = 5.5×10^-12), as well as a strong effect showing increased risk across samples with a high propensity for stroke among samples with enriched counts of suggestive risk alleles (P<5×10^-6). Risk profile scores based only on genomic information offered little incremental prediction. DISCUSSION: There is little evidence of a common genetic variant contributing to moderate risk of ischemic stroke. Quintiles based on genetic loading of alleles associated with a younger age at onset of ischemic stroke revealed a significant difference in age at onset between those in the upper and lower quintiles. Using common variants from GWAS and imputation, genomic profiling remains inferior to family history of stroke for defining risk. Inclusion of genomic (rare variant) information may be required to improve clinical risk profiling.
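
    A fixed effects meta-analysis of per-cohort association results is commonly computed by inverse-variance weighting, and the sketch below shows that calculation for a single SNP. The abstract does not state which weighting scheme or software was used, so the method choice is an assumption, and the three effect sizes and standard errors are made-up illustration values, not data from the study.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance-weighted pooled effect, its SE, z-score and two-sided p."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return pooled_beta, pooled_se, z, p


# Hypothetical log odds ratios and standard errors for one SNP in three cohorts.
betas = [0.10, 0.20, 0.15]
standard_errors = [0.05, 0.10, 0.05]

beta, se, z, p = fixed_effects_meta(betas, standard_errors)
genome_wide_significant = p < 5e-8  # threshold quoted in the abstract
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

    With these illustrative inputs the pooled estimate is dominated by the two cohorts with the smaller standard errors, and the resulting p-value is far from the genome-wide threshold, which is the kind of outcome the abstract reports for every SNP tested.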

  19. Genome-wide analysis reveals coating of the mitochondrial genome by TFAM.

    Directory of Open Access Journals (Sweden)

    Yun E Wang

    Full Text Available Mitochondria contain a 16.6 kb circular genome encoding 13 proteins as well as mitochondrial tRNAs and rRNAs. Copies of the genome are organized into nucleoids containing both DNA and proteins, including the machinery required for mtDNA replication and transcription. The transcription factor TFAM is critical for initiation of transcription and replication of the genome, and is also thought to perform a packaging function. Although specific binding sites required for initiation of transcription have been identified in the D-loop, little is known about the characteristics of TFAM binding in its nonspecific packaging state. In addition, it is unclear whether TFAM also plays a role in the regulation of nuclear gene expression. Here we investigate these questions by using ChIP-seq to directly localize TFAM binding to DNA in human cells. Our results demonstrate that TFAM uniformly coats the whole mitochondrial genome, with no evidence of robust TFAM binding to the nuclear genome. Our study represents the first high-resolution assessment of TFAM binding on a genome-wide scale in human cells.

  20. Evaluation of Genomic Instability as an Early Event in the Progression of Breast Cancer

    Science.gov (United States)

    2008-04-01

    instability allows additional classifying of the known aneuploid, diploid, and tetraploid categories of primary breast adenocarcinomas into low and high...expression analysis using stromal and epithelial cell RNA from CHN tissues by microarray hybridization . Determine molecular signatures as a function of...microarray hybridization experiments where performed on “bulk” breast tissues 1cm from tumor margin (N=5), breast tissues 5cm from tumor margin (N=5), and

  1. Genome-Wide Analysis of DNA Methylation in Human Amnion

    Directory of Open Access Journals (Sweden)

    Jinsil Kim

    2013-01-01

    Full Text Available The amnion is a specialized tissue in contact with the amniotic fluid, which is in a constantly changing state. To investigate the importance of epigenetic events in this tissue in the physiology and pathophysiology of pregnancy, we performed genome-wide DNA methylation profiling of human amnion from term (with and without labor) and preterm deliveries. Using the Illumina Infinium HumanMethylation27 BeadChip, we identified genes exhibiting differential methylation associated with normal labor and preterm birth. Functional analysis of the differentially methylated genes revealed biologically relevant enriched gene sets. Bisulfite sequencing analysis of the promoter region of the oxytocin receptor (OXTR) gene detected two CpG dinucleotides showing significant methylation differences among the three groups of samples. Hypermethylation of the CpG island of the solute carrier family 30 member 3 (SLC30A3) gene in preterm amnion was confirmed by methylation-specific PCR. This work provides preliminary evidence that DNA methylation changes in the amnion may be at least partially involved in the physiological process of labor and the etiology of preterm birth and suggests that DNA methylation profiles, in combination with other biological data, may provide valuable insight into the mechanisms underlying normal and pathological pregnancies.

  2. Rice–arsenate interactions in hydroponics: whole genome transcriptional analysis

    Science.gov (United States)

    Norton, Gareth J.; Lou-Hing, Daniel E.; Meharg, Andrew A.; Price, Adam H.

    2008-01-01

    Rice (Oryza sativa) varieties that are arsenate-tolerant (Bala) and -sensitive (Azucena) were used to conduct a transcriptome analysis of the response of rice seedlings to sodium arsenate (AsV) in hydroponic solution. RNA extracted from the roots of three replicate experiments of plants grown for 1 week in phosphate-free nutrient with or without 13.3 μM AsV was used to challenge the Affymetrix (52K) GeneChip Rice Genome array. A total of 576 probe sets were significantly up-regulated at least 2-fold in both varieties, whereas 622 were down-regulated. Ontological classification is presented. As expected, a large number of transcription factors, stress proteins, and transporters demonstrated differential expression. Striking is the lack of response of classic oxidative stress-responsive genes or phytochelatin synthases/synthetases. However, the large number of responses from genes involved in glutathione synthesis, metabolism, and transport suggests that glutathione conjugation and arsenate methylation may be important biochemical responses to arsenate challenge. In this report, no attempt is made to dissect differences in the response of the tolerant and sensitive variety, but analysis in a companion article will link gene expression to the known tolerance loci available in the Bala×Azucena mapping population. PMID:18453530

  3. Rice-arsenate interactions in hydroponics: whole genome transcriptional analysis.

    Science.gov (United States)

    Norton, Gareth J; Lou-Hing, Daniel E; Meharg, Andrew A; Price, Adam H

    2008-01-01

    Rice (Oryza sativa) varieties that are arsenate-tolerant (Bala) and -sensitive (Azucena) were used to conduct a transcriptome analysis of the response of rice seedlings to sodium arsenate (AsV) in hydroponic solution. RNA extracted from the roots of three replicate experiments of plants grown for 1 week in phosphate-free nutrient with or without 13.3 μM AsV was used to challenge the Affymetrix (52K) GeneChip Rice Genome array. A total of 576 probe sets were significantly up-regulated at least 2-fold in both varieties, whereas 622 were down-regulated. Ontological classification is presented. As expected, a large number of transcription factors, stress proteins, and transporters demonstrated differential expression. Striking is the lack of response of classic oxidative stress-responsive genes or phytochelatin synthases/synthetases. However, the large number of responses from genes involved in glutathione synthesis, metabolism, and transport suggests that glutathione conjugation and arsenate methylation may be important biochemical responses to arsenate challenge. In this report, no attempt is made to dissect differences in the response of the tolerant and sensitive variety, but analysis in a companion article will link gene expression to the known tolerance loci available in the Bala×Azucena mapping population.

  4. Genome-wide transcriptome analysis of 150 cell samples

    Science.gov (United States)

    Russom, Aman; Xiao, Wenzhong; Wilhelmy, Julie; Wang, Shenglong; Heath, Joe Don; Kurn, Nurith; Tompkins, Ronald G.; Davis, Ronald W.; Toner, Mehmet

    2013-01-01

    A major challenge in molecular biology is interrogating the human transcriptome on a genome wide scale when only a limited amount of biological sample is available for analysis. Current methodologies using microarray technologies for simultaneously monitoring mRNA transcription levels require nanogram amounts of total RNA. To overcome the sample size limitation of current technologies, we have developed a method to probe the global gene expression in biological samples as small as 150 cells, or the equivalent of approximately 300 pg total RNA. The new method employs microfluidic devices for the purification of total RNA from mammalian cells and ultra-sensitive whole transcriptome amplification techniques. We verified that the RNA integrity is preserved through the isolation process, accomplished highly reproducible whole transcriptome analysis, and established high correlation between repeated isolations of 150 cells and the same cell culture sample. We validated the technology by demonstrating that the combined microfluidic and amplification protocol is capable of identifying biological pathways perturbed by stimulation, which are consistent with the information recognized in bulk-isolated samples. PMID:20023796

  5. Genome-wide transcriptome analysis of 150 cell samples.

    Science.gov (United States)

    Irimia, Daniel; Mindrinos, Michael; Russom, Aman; Xiao, Wenzhong; Wilhelmy, Julie; Wang, Shenglong; Heath, Joe Don; Kurn, Nurith; Tompkins, Ronald G; Davis, Ronald W; Toner, Mehmet

    2009-01-01

    A major challenge in molecular biology is interrogating the human transcriptome on a genome wide scale when only a limited amount of biological sample is available for analysis. Current methodologies using microarray technologies for simultaneously monitoring mRNA transcription levels require nanogram amounts of total RNA. To overcome the sample size limitation of current technologies, we have developed a method to probe the global gene expression in biological samples as small as 150 cells, or the equivalent of approximately 300 pg total RNA. The new method employs microfluidic devices for the purification of total RNA from mammalian cells and ultra-sensitive whole transcriptome amplification techniques. We verified that the RNA integrity is preserved through the isolation process, accomplished highly reproducible whole transcriptome analysis, and established high correlation between repeated isolations of 150 cells and the same cell culture sample. We validated the technology by demonstrating that the combined microfluidic and amplification protocol is capable of identifying biological pathways perturbed by stimulation, which are consistent with the information recognized in bulk-isolated samples.

  6. Genome-Wide Analysis of DNA Methylation in Human Amnion

    Science.gov (United States)

    Kim, Jinsil; Pitlick, Mitchell M.; Christine, Paul J.; Schaefer, Amanda R.; Saleme, Cesar; Comas, Belén; Cosentino, Viviana; Gadow, Enrique; Murray, Jeffrey C.

    2013-01-01

    The amnion is a specialized tissue in contact with the amniotic fluid, which is in a constantly changing state. To investigate the importance of epigenetic events in this tissue in the physiology and pathophysiology of pregnancy, we performed genome-wide DNA methylation profiling of human amnion from term (with and without labor) and preterm deliveries. Using the Illumina Infinium HumanMethylation27 BeadChip, we identified genes exhibiting differential methylation associated with normal labor and preterm birth. Functional analysis of the differentially methylated genes revealed biologically relevant enriched gene sets. Bisulfite sequencing analysis of the promoter region of the oxytocin receptor (OXTR) gene detected two CpG dinucleotides showing significant methylation differences among the three groups of samples. Hypermethylation of the CpG island of the solute carrier family 30 member 3 (SLC30A3) gene in preterm amnion was confirmed by methylation-specific PCR. This work provides preliminary evidence that DNA methylation changes in the amnion may be at least partially involved in the physiological process of labor and the etiology of preterm birth and suggests that DNA methylation profiles, in combination with other biological data, may provide valuable insight into the mechanisms underlying normal and pathological pregnancies. PMID:23533356

  7. Improved statistics for genome-wide interaction analysis.

    Science.gov (United States)

    Ueki, Masao; Cordell, Heather J

    2012-01-01

    Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new "joint effects" statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al

  8. Development and characterization of genomic and expressed SSRs in citrus by genome-wide analysis.

    Directory of Open Access Journals (Sweden)

    Sheng-Rui Liu

    Full Text Available Microsatellites or simple sequence repeats (SSRs) are one of the most popular sources of genetic markers and play a significant role in plant genetics and breeding. In this study, we identified citrus SSRs in the genome of Clementine mandarin and analyzed their frequency and distribution in different genomic regions. A total of 80,708 SSRs were detected in the genome with an overall density of 268 SSRs/Mb. While di-nucleotide repeats were the most frequent microsatellites in genomic DNA sequence, tetra-nucleotides, which had more repeat units than any other SSR types, had the highest cumulative sequence length. We identified 6,834 transcripts as containing 8,989 SSRs in 33,929 Clementine mandarin transcripts, among which tri-nucleotide motifs (36.0%) were the most common, followed by di-nucleotide (26.9%) and hexa-nucleotide motifs (15.1%). The motif AG (16.7%) was most abundant among these SSRs, while motifs AAG (6.6%), AAT (5.0%), and TAG (2.2%) were most common among tri-nucleotides. Functional categorization of transcripts containing SSRs revealed that 5,879 (86.0%) of such transcripts had homology with known proteins. GO and KEGG annotation revealed that transcripts containing SSRs were implicated in diverse biological processes in plants, including binding, development, transcription, and protein degradation. When 27 genomic and 78 randomly selected SSRs were tested on Clementine mandarin, 95 SSRs revealed polymorphism. These 95 SSRs were further deployed on 18 genotypes of the three genera of Rutaceae for the genetic diversity assessment. Genomic SSRs generally show low transferability in comparison to SSRs developed from expressed sequences. These transcript-markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in citrus, such as diversity study, QTL mapping, molecular breeding, comparative mapping and other genetic analyses.

  9. Development and characterization of genomic and expressed SSRs in citrus by genome-wide analysis.

    Science.gov (United States)

    Liu, Sheng-Rui; Li, Wen-Yang; Long, Dang; Hu, Chun-Gen; Zhang, Jin-Zhi

    2013-01-01

    Microsatellites or simple sequence repeats (SSRs) are one of the most popular sources of genetic markers and play a significant role in plant genetics and breeding. In this study, we identified citrus SSRs in the genome of Clementine mandarin and analyzed their frequency and distribution in different genomic regions. A total of 80,708 SSRs were detected in the genome with an overall density of 268 SSRs/Mb. While di-nucleotide repeats were the most frequent microsatellites in genomic DNA sequence, tetra-nucleotides, which had more repeat units than any other SSR types, had the highest cumulative sequence length. We identified 6,834 transcripts as containing 8,989 SSRs in 33,929 Clementine mandarin transcripts, among which tri-nucleotide motifs (36.0%) were the most common, followed by di-nucleotide (26.9%) and hexa-nucleotide motifs (15.1%). The motif AG (16.7%) was most abundant among these SSRs, while motifs AAG (6.6%), AAT (5.0%), and TAG (2.2%) were most common among tri-nucleotides. Functional categorization of transcripts containing SSRs revealed that 5,879 (86.0%) of such transcripts had homology with known proteins. GO and KEGG annotation revealed that transcripts containing SSRs were implicated in diverse biological processes in plants, including binding, development, transcription, and protein degradation. When 27 genomic and 78 randomly selected SSRs were tested on Clementine mandarin, 95 SSRs revealed polymorphism. These 95 SSRs were further deployed on 18 genotypes of the three genera of Rutaceae for the genetic diversity assessment. Genomic SSRs generally show low transferability in comparison to SSRs developed from expressed sequences. These transcript-markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in citrus, such as diversity study, QTL mapping, molecular breeding, comparative mapping and other genetic analyses.
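As a rough illustration of how perfect SSRs like those counted above can be located in a sequence, the following is a minimal sketch using regular-expression backreferences. It is not the pipeline used in the study: the function names and the minimum repeat thresholds in `MIN_REPEATS` are hypothetical choices made for this example only.

```python
import re

# Minimum number of repeat units per motif length.
# Hypothetical thresholds for illustration; not taken from the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    # A string is a repetition of a shorter string exactly when it
    # occurs inside its own doubling with both end characters removed.
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # Group 1 captures one motif; \1{n,} demands at least n more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. "AGAG", which is already reported as the motif "AG".
            if not is_primitive(motif):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


def ssr_density_per_mb(n_ssrs, genome_length_bp):
    """SSR density expressed as SSRs per megabase."""
    return n_ssrs / (genome_length_bp / 1e6)
```

For a toy sequence such as `"TT" + "AG" * 8 + "CC" + "AAG" * 5 + "GG"`, `find_ssrs` reports one di-nucleotide and one tri-nucleotide repeat. Dedicated SSR tools additionally handle imperfect and compound repeats, which this sketch ignores.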

  10. Inverted Low-Copy Repeats and Genome Instability—A Genome-Wide Analysis

    Science.gov (United States)

    Dittwald, Piotr; Gambin, Tomasz; Gonzaga-Jauregui, Claudia; Carvalho, Claudia M.B.; Lupski, James R.; Stankiewicz, Paweł; Gambin, Anna

    2013-01-01

    Inverse paralogous low-copy repeats (IP-LCRs) can cause genome instability by nonallelic homologous recombination (NAHR)-mediated balanced inversions. When disrupting a dosage-sensitive gene(s), balanced inversions can lead to abnormal phenotypes. We delineated the genome-wide distribution of IP-LCRs >1 kb in size with >95% sequence identity and mapped the genes, potentially intersected by an inversion, that overlap at least one of the IP-LCRs. Remarkably, our results show that 12.0% of the human genome is potentially susceptible to such inversions and 942 genes, 99 of which are on the X chromosome, are predicted to be disrupted secondary to such an inversion! In addition, IP-LCRs larger than 800 bp with at least 98% sequence identity (duplication/triplication facilitating IP-LCRs, DTIP-LCRs) were recently implicated in the formation of complex genomic rearrangements with a duplication-inverted triplication–duplication (DUP-TRP/INV-DUP) structure by a replication-based mechanism involving a template switch between such inverted repeats. We identified 1,551 DTIP-LCRs that could facilitate DUP-TRP/INV-DUP formation. Remarkably, 1,445 disease-associated genes are at risk of undergoing copy-number gain as they map to genomic intervals susceptible to the formation of DUP-TRP/INV-DUP complex rearrangements. We implicate inverted LCRs as a human genome architectural feature that could potentially be responsible for genomic instability associated with many human disease traits. PMID:22965494

  11. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Directory of Open Access Journals (Sweden)

    Jianmin Fu

    Full Text Available Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses, together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  12. Complete genome sequence of Borrelia afzelii K78 and comparative genome analysis.

    Directory of Open Access Journals (Sweden)

    Wolfgang Schüler

    Full Text Available The main Borrelia species causing Lyme borreliosis in Europe and Asia are Borrelia afzelii, B. garinii, B. burgdorferi and B. bavariensis. This is in contrast to the United States, where infections are exclusively caused by B. burgdorferi. To date, the genome sequences of four B. afzelii strains are available, of which only two include the numerous plasmids. In order to further assess the genetic diversity of B. afzelii, the most common species in Europe, responsible for the large variety of clinical manifestations of Lyme borreliosis, we have determined the full genome sequence of the B. afzelii strain K78, a clinical isolate from Austria. The K78 genome contains a linear chromosome (905,949 bp) and 13 plasmids (8 linear and 5 circular), together presenting 1,309 open reading frames, of which 496 are located on plasmids. With the exception of lp28-8, all linear replicons in their full length including their telomeres have been sequenced. The comparison with the genomes of the four other B. afzelii strains, ACA-1, PKo, HLJ01 and Tom3107, as well as the one of B. burgdorferi strain B31, confirmed a high degree of conservation within the linear chromosome of B. afzelii, whereas plasmid-encoded genes showed a much larger diversity. Since some plasmids present in B. burgdorferi are missing in the B. afzelii genomes, the corresponding virulence factors of B. burgdorferi are found in B. afzelii on other unrelated plasmids. In addition, we have identified a species-specific region in the circular plasmid, cp26, which could be used for species determination. Different non-coding RNAs have been located on the B. afzelii K78 genome, which have not previously been annotated in any of the published Borrelia genomes.

  13. Complete sequence of the mitochondrial genome of a diatom alga Synedra acus and comparative analysis of diatom mitochondrial genomes.

    Science.gov (United States)

    Ravin, Nikolai V; Galachyants, Yuri P; Mardanov, Andrey V; Beletsky, Alexey V; Petrova, Darya P; Sherbakova, Tatyana A; Zakharova, Yuliya R; Likhoshway, Yelena V; Skryabin, Konstantin G; Grachev, Mikhail A

    2010-06-01

    The first two mitochondrial genomes of marine diatoms were previously reported for the centric Thalassiosira pseudonana and the raphid pennate Phaeodactylum tricornutum. As part of a genomic project, we sequenced the complete mitochondrial genome of the freshwater araphid pennate diatom Synedra acus. This 46,657 bp mtDNA encodes 2 rRNAs, 24 tRNAs, and 33 proteins. The mtDNA of S. acus contains three group II introns, two inserted into the cox1 gene and containing ORFs, and one inserted into the rnl gene and lacking an ORF. The compact gene organization contrasts with the presence of a 4.9-kb-long intergenic region, which contains repeat sequences. Comparison of the three sequenced mtDNAs showed that these three genomes carry similar gene pools, but the positions of some genes are rearranged. Phylogenetic analysis performed with a fragment of the cox1 gene of diatoms and other heterokonts produced a tree that is similar to that derived from 18S RNA genes. The introns of mtDNA in the diatoms seem to be polyphyletic. This study demonstrates that pyrosequencing is an efficient method for complete sequencing of mitochondrial genomes from diatoms, and may soon give valuable information about the molecular phylogeny of this outstanding group of unicellular organisms.

  14. Identification of genomic aberrations associated with disease transformation by means of high-resolution SNP array analysis in patients with myeloproliferative neoplasm.

    Science.gov (United States)

    Rumi, Elisa; Harutyunyan, Ashot; Elena, Chiara; Pietra, Daniela; Klampfl, Thorsten; Bagienski, Klaudia; Berg, Tiina; Casetti, Ilaria; Pascutto, Cristiana; Passamonti, Francesco; Kralovics, Robert; Cazzola, Mario

    2011-12-01

    Myeloproliferative neoplasms (MPN) include polycythemia vera (PV), essential thrombocythemia (ET), and primary myelofibrosis (PMF). These disorders may undergo phenotypic shifts, and may specifically evolve into secondary myelofibrosis (MF) or acute myeloid leukemia (AML). We studied genomic changes associated with these transformations in 29 patients who had serial samples collected in different phases of disease. Genomic DNA from granulocytes, i.e., the myeloproliferative genome, was processed and hybridized to genome-wide human SNP 6.0 arrays. Most patients in chronic phase had chromosomal regions with uniparental disomy (UPD) and/or copy number changes. Disease progression to secondary MF or AML was associated with the acquisition of additional chromosomal aberrations in granulocytes (P = 0.002). A close relationship was observed between aberrations of chromosome 9p (UPD and/or gain) and progression from PV to post-PV MF (P = 0.002). The acquisition of one or more aberrations involving chromosome 5, 7, or 17p was specifically associated with progression to AML (OR 5.9, 95% CI 1.2-27.7, P = 0.006), and significantly affected overall survival (HR 18, 95% CI 1.9-164, P = 0.01). These observations indicate that disease progression from chronic-phase MPN to secondary MF or AML is associated with specific chromosomal aberrations that can be detected by means of high-resolution SNP array analysis of granulocyte DNA. Copyright © 2011 Wiley-Liss, Inc.

  15. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective.

    Directory of Open Access Journals (Sweden)

    Gurusamy Raman

    Full Text Available Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus.

  16. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective.

    Science.gov (United States)

    Raman, Gurusamy; Park, SeonJoo

    2015-01-01

    Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus.

  17. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri; Lapidus, Alla; Goltsman, Eugene; Chu, Lien; Fonstein, Michael; Ehrlich, S. Dusko; Overbeek, Ross; Kyrpides, Nikos; Ivanova, Natalia

    2005-09-14

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis subsp. israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-layer proteins suggesting differences in their phenotype were identified. The B. cereus group has signal transduction systems including a tyrosine kinase related to two-component system histidine kinases from B. subtilis. A model for regulation of the stress-responsive sigma factor sigmaB in the B. cereus group, different from the well-studied regulation in B. subtilis, has been proposed. Despite a high degree of chromosomal synteny among these genomes, significant differences in cell wall and spore coat proteins that contribute to survival and adaptation in specific hosts have been identified.
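The notions of a core set of protein families and an additional set shared only within a group can be sketched with plain set operations. This is a minimal illustration, not the method used in the study: the function name, the genome labels, and the family identifiers `F1` to `F5` are all hypothetical, and it assumes that family membership per genome has already been determined.

```python
def core_and_group_families(families_by_genome, group):
    """Split protein families into a core set and a group-specific set.

    families_by_genome: dict mapping genome name -> set of family ids.
    group: iterable of genome names forming the group of interest.
    Returns (core, group_additional), where core holds the families
    present in every genome and group_additional holds the families
    shared by every group member that are not part of the core.
    """
    core = set.intersection(*families_by_genome.values())
    group_shared = set.intersection(*(families_by_genome[g] for g in group))
    return core, group_shared - core


# Toy input with made-up family identifiers (assumption, not real data).
toy_families = {
    "B. cereus": {"F1", "F2", "F3", "F4"},
    "B. anthracis": {"F1", "F2", "F3"},
    "B. thuringiensis": {"F1", "F2", "F3", "F5"},
    "B. subtilis": {"F1", "F2", "F5"},
}
core, cereus_group_only = core_and_group_families(
    toy_families, ["B. cereus", "B. anthracis", "B. thuringiensis"]
)
```

In the toy input, `F1` and `F2` form the core, while `F3` is shared by the three group members but absent from the fourth genome. The hard part in practice is the upstream clustering of proteins into families, which this sketch takes as given.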

  18. CoCoNUT: an efficient system for the comparison and analysis of genomes

    Directory of Open Access Journals (Sweden)

    Kurtz Stefan

    2008-11-01

    Full Text Available Background: Comparative genomics is the analysis and comparison of genomes from different species. This area of research is driven by the large number of sequenced genomes and heavily relies on efficient algorithms and software to perform pairwise and multiple genome comparisons. Results: Most of the software tools available are tailored for one specific task. In contrast, we have developed a novel system CoCoNUT (Computational Comparative geNomics Utility Toolkit) that allows solving several different tasks in a unified framework: (1) finding regions of high similarity among multiple genomic sequences and aligning them, (2) comparing two draft or multi-chromosomal genomes, (3) locating large segmental duplications in large genomic sequences, and (4) mapping cDNA/EST to genomic sequences. Conclusion: CoCoNUT is competitive with other software tools w.r.t. the quality of the results. The use of state-of-the-art algorithms and data structures allows CoCoNUT to solve comparative genomics tasks more efficiently than previous tools. With the improved user interface (including an interactive visualization component), CoCoNUT provides a unified, versatile, and easy-to-use software tool for large-scale studies in comparative genomics.

  19. Genetic Biomarkers of Barrett's Esophagus Susceptibility and Progression to Dysplasia and Cancer: A Systematic Review and Meta-Analysis.

    Science.gov (United States)

    Findlay, John M; Middleton, Mark R; Tomlinson, Ian

    2016-01-01

    Barrett's esophagus (BE) is a common and important precursor lesion of esophageal adenocarcinoma (EAC). A third of patients with BE are asymptomatic, and our ability to predict the risk of progression of metaplasia to dysplasia and EAC (and therefore guide management) is limited. There is an urgent need for clinically useful biomarkers of susceptibility to both BE and risk of subsequent progression. This study aims to systematically identify, review, and meta-analyze genetic biomarkers reported to predict both. A systematic review of the PubMed and EMBASE databases was performed in May 2014. Study and evidence quality were appraised using the revised American Society of Clinical Oncology guidelines, and modified Recommendations for Tumor Marker Scores. Meta-analysis was performed for all markers assessed by more than one study. A total of 251 full-text articles were reviewed; 52 were included. A total of 33 germline markers of susceptibility were identified (level of evidence II-III); 17 were included. Five somatic markers of progression were identified; meta-analysis demonstrated significant associations for chromosomal instability (level of evidence II). One somatic marker of progression/relapse following photodynamic therapy was identified. However, a number of failings of methodology and reporting were identified. This is the first systematic review and meta-analysis to evaluate genetic biomarkers of BE susceptibility and risk of progression. While a number of limitations of study quality temper the utility of those markers identified, some (in particular, those identified by genome-wide association studies, and chromosomal instability for progression) appear plausible, although robust validation is required.

  20. High resolution microarray comparative genomic hybridisation analysis using spotted oligonucleotides.

    NARCIS (Netherlands)

    Carvalho, B; Ouwerkerk, E; Meijer, G.A.; Ylstra, B.

    2004-01-01

    BACKGROUND: Currently, comparative genomic hybridisation array (array CGH) is the method of choice for studying genome wide DNA copy number changes. To date, either amplified representations of bacterial artificial chromosomes (BACs)/phage artificial chromosomes (PACs) or cDNAs have been spotted as

  1. Full-length genomic analysis of Korean porcine sapelovirus strains

    DEFF Research Database (Denmark)

    Son, Kyu-Yeol; Kim, Deok-Song; Kwon, Joseph

    2014-01-01

    the structural features of PSV genomes, the full-length nucleotide sequences of three Korean PSV strains were determined and analyzed using bioinformatic techniques in comparison with other known PSV strains. The Korean PSV genomes ranged from 7,542 to 7,566 nucleotides excluding the 3' poly(A) tail, and showed...

  2. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome seque...

  3. Cinteny: flexible analysis and visualization of synteny and genome rearrangements in multiple organisms

    Directory of Open Access Journals (Sweden)

    Meller Jaroslaw

    2007-03-01

    Full Text Available Background: Identifying syntenic regions, i.e., blocks of genes or other markers with evolutionarily conserved order, and quantifying evolutionary relatedness between genomes in terms of chromosomal rearrangements is one of the central goals in comparative genomics. However, the analysis of synteny and the resulting assessment of genome rearrangements are sensitive to the choice of a number of arbitrary parameters that affect the detection of synteny blocks. In particular, the choice of a set of markers and the effect of different aggregation strategies, which enable coarse graining of synteny blocks and exclusion of micro-rearrangements, need to be assessed. Therefore, existing tools and resources that facilitate identification, visualization and analysis of synteny need to be further improved to provide a flexible platform for such analysis, especially in the context of multiple genomes. Results: We present a new tool, Cinteny, for fast identification and analysis of synteny with different sets of markers and various levels of coarse graining of syntenic blocks. Using the Hannenhalli-Pevzner approach and its extensions, Cinteny also enables interactive determination of evolutionary relationships between genomes in terms of the number of rearrangements (the reversal distance). In particular, Cinteny provides: (i) integration of synteny browsing with assessment of evolutionary distances for multiple genomes; (ii) flexibility to adjust the parameters and re-compute the results on-the-fly; (iii) ability to work with user-provided data, such as orthologous genes, sequence tags or other conserved markers. In addition, Cinteny provides many annotated mammalian, invertebrate and fungal genomes that are pre-loaded and available for analysis at http://cinteny.cchmc.org. Conclusion: Cinteny allows one to automatically compare multiple genomes and perform sensitivity analysis for synteny block detection and for the subsequent computation of reversal distances
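To make the idea of a reversal distance more concrete, the sketch below counts breakpoints in a signed permutation of synteny blocks and derives a simple lower bound on the number of reversals. This is deliberately not the Hannenhalli-Pevzner algorithm named in the abstract, which computes the exact distance; the function names are hypothetical, and the input is assumed to be a signed permutation of 1..n relative to a reference genome.

```python
import math


def count_breakpoints(signed_perm):
    """Count breakpoints in a signed permutation of 1..n.

    The permutation is framed with 0 and n + 1. Two neighbours (a, b)
    form an adjacency when b - a == 1, which also covers reversed
    runs such as (-3, -2); every other neighbouring pair is a breakpoint.
    """
    n = len(signed_perm)
    framed = [0] + list(signed_perm) + [n + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


def reversal_distance_lower_bound(signed_perm):
    """Lower bound on the reversal distance to the identity.

    A single reversal changes only the two pairs at its ends, so it
    can remove at most two breakpoints.
    """
    return math.ceil(count_breakpoints(signed_perm) / 2)
```

For example, the block order `[1, -3, -2, 4]` has two breakpoints, giving a lower bound of one reversal, and a single reversal of the middle segment does restore the identity order. In general the true reversal distance can exceed this bound.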

  4. Single cell genome analysis of an uncultured heterotrophic stramenopile

    Science.gov (United States)

    Roy, Rajat S.; Price, Dana C.; Schliep, Alexander; Cai, Guohong; Korobeynikov, Anton; Yoon, Hwan Su; Yang, Eun Chan; Bhattacharya, Debashish

    2014-04-01

    A broad swath of eukaryotic microbial biodiversity cannot be cultivated in the lab and is therefore inaccessible to conventional genome-wide comparative methods. One promising approach to study these lineages is single cell genomics (SCG), whereby an individual cell is captured from nature and genome data are produced from the amplified total DNA. Here we tested the efficacy of SCG to generate a draft genome assembly from a single sample, in this case a cell belonging to the broadly distributed MAST-4 uncultured marine stramenopiles. Using de novo gene prediction, we identified 6,996 protein-encoding genes in the MAST-4 genome. This genetic inventory was sufficient to place the cell within the ToL using multigene phylogenetics and provided preliminary insights into the complex evolutionary history of horizontal gene transfer (HGT) in the MAST-4 lineage.

  5. Genomic Islands Prediction and Analysis in Cyanobacteria by Bioinformatics

    Institute of Scientific and Technical Information of China (English)

    Yi Li; Ni-Ni Rao; Feng Yang; Han-Ming Liu

    2014-01-01

    Genomic islands (GIs) are one of the most important components of cyanobacterial genomes. GIs code for many functions, such as symbiosis, pathogenesis, and adaptation. In this article, we predict and analyze the GIs in Synechocystis sp. PCC 6803 by bioinformatics, and the results show that ISL1, ISL8, and ISL16 are homologous with sequences in many other bacteria, are involved in basic reactions, and have evolved conservatively. In contrast, ISL15 has a sequence and function unique to Synechocystis sp. PCC 6803. Most GIs play a role in genome rearrangement because they carry many transposases. Moreover, we find that recombination and horizontal transfer of GIs are important factors affecting the distribution of non-coding RNA. Our work contributes to a comprehensive understanding of genomic islands and their impact on the genomes of cyanobacteria.

  6. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    Science.gov (United States)

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger; Madsen, Lone; Espejo, Romilio

    2016-01-01

    Flavobacterium psychrophilum is a fish pathogen in salmonid aquaculture worldwide that causes cold water disease (CWD) and rainbow trout fry syndrome (RTFS). Comparative genome analyses of 11 F. psychrophilum isolates representing temporally and geographically distant populations were used to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences was complemented with selected phenotypic characteristics of the strains. The pan-genome analysis showed that F. psychrophilum could hold at least 3373 genes, while the core genome contained 1743 genes. On average, 67 new genes were detected for every new genome added to the analysis, indicating that F. psychrophilum possesses an open pan-genome. The putative virulence factors were equally distributed among isolates, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found, which corresponded to the previously described prophage 6H and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which matched only one sequence in the database, the temperate bacteriophage 6H. Genomic islands (GIs) were identified in F. psychrophilum isolates 950106-1/1 and CSF 259–93, associated with toxins and antibiotic resistance. Finally, phenotypic characterization revealed a high degree of similarity among the strains with respect to biofilm formation and secretion of extracellular enzymes. Global scale dispersion of virulence factors in the genomes and the abilities for biofilm formation, hemolytic activity and secretion of extracellular enzymes among the strains suggested that F. psychrophilum isolates have a similar mode of action on adhesion, colonization and destruction of fish tissues across large spatial and temporal scales of occurrence. Overall, the genomic characterization and
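
    The pan-genome, core-genome, and "new genes per added genome" figures in the abstract above can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of gene-family identifiers; the isolate names, gene identifiers, and the function `pan_core_curve` are hypothetical and are not taken from the study.

```python
def pan_core_curve(genomes):
    """For each genome added in order, report pan size, core size, and new genes.

    genomes: list of (name, set_of_gene_family_ids) pairs (toy input format).
    """
    pan, core = set(), None
    curve = []
    for name, genes in genomes:
        new_genes = len(genes - pan)          # families not seen in earlier genomes
        pan |= genes                          # pan-genome: union of all families
        core = set(genes) if core is None else core & genes  # core: intersection
        curve.append((name, len(pan), len(core), new_genes))
    return curve


# Hypothetical presence/absence data for three isolates.
genomes = [
    ("isolate_A", {"g1", "g2", "g3", "g4"}),
    ("isolate_B", {"g1", "g2", "g3", "g5"}),
    ("isolate_C", {"g1", "g2", "g6", "g7"}),
]
curve = pan_core_curve(genomes)
for row in curve:
    print(row)
```

    A pan-genome is usually called open when the count of new genes keeps staying above zero as genomes are added, which is the pattern the last column of this toy curve mimics.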

  7. Genomic organization and sequence analysis of the vomeronasal receptor V2R genes in mouse genome

    Institute of Scientific and Technical Information of China (English)

    YANG Hui; Zhang YaPing

    2007-01-01

    Two multigene superfamilies, named V1R and V2R, encoding seven-transmembrane-domain G-protein coupled receptors (GPCRs) have been identified as pheromone receptors in mammals. Three V2R gene families have been described in mouse and rat. Here we screened the updated mouse genome sequence database and retrieved 63 putative functional V2R genes, including three newly identified genes that form an additional family. We describe the genomic organization of these genes and characterize the conservation of mouse V2R protein sequences. This genomic and sequence information provides evidence for inferring the functional domains of V2Rs and should aid future functional studies.

  8. Genomic analysis by oligonucleotide array Comparative Genomic Hybridization utilizing formalin-fixed, paraffin-embedded tissues.

    Science.gov (United States)

    Savage, Stephanie J; Hostetter, Galen

    2011-01-01

    Formalin fixation has been used to preserve tissues for more than a hundred years, and there are currently more than 300 million archival samples in the United States alone. The application of genomic protocols such as high-density oligonucleotide array Comparative Genomic Hybridization (aCGH) to formalin-fixed, paraffin-embedded (FFPE) tissues, therefore, opens an untapped resource of available tissues for research and facilitates utilization of existing clinical data in a research sample set. However, formalin fixation results in cross-linking of proteins and DNA, typically leading to such a significant degradation of DNA template that little is available for use in molecular applications. Here, we describe a protocol to circumvent formalin fixation artifact by utilizing enzymatic reactions to obtain quality DNA from a wide range of FFPE tissues for successful genome-wide discovery of gene dosage alterations in archival clinical samples.

  9. In silico comparative genomic analysis of GABAA receptor transcriptional regulation

    Directory of Open Access Journals (Sweden)

    Joyce Christopher J

    2007-06-01

    Full Text Available Abstract Background: Subtypes of the GABAA receptor subunit exhibit diverse temporal and spatial expression patterns. In silico comparative analysis was used to predict transcriptional regulatory features in individual mammalian GABAA receptor subunit genes, and to identify potential transcriptional regulatory components involved in the coordinate regulation of the GABAA receptor gene clusters. Results: Previously unreported putative promoters were identified for the β2, γ1, γ3, ε, θ and π subunit genes. Putative core elements and proximal transcriptional factors were identified within these predicted promoters, and within the experimentally determined promoters of other subunit genes. Conserved intergenic regions of sequence in the mammalian GABAA receptor gene cluster comprising the α1, β2, γ2 and α6 subunits were identified as potential long range transcriptional regulatory components involved in the coordinate regulation of these genes. A region of predicted DNase I hypersensitive sites within the cluster may contain transcriptional regulatory features coordinating gene expression. A novel model is proposed for the coordinate control of the gene cluster and parallel expression of the α1 and β2 subunits, based upon the selective action of putative Scaffold/Matrix Attachment Regions (S/MARs). Conclusion: The putative regulatory features identified by genomic analysis of GABAA receptor genes were substantiated by cross-species comparative analysis and now require experimental verification. The proposed model for the coordinate regulation of genes in the cluster accounts for the head-to-head orientation and parallel expression of the α1 and β2 subunit genes, and for the disruption of transcription caused by insertion of a neomycin gene in the close vicinity of the α6 gene, which is proximal to a putative critical S/MAR.

  10. A structural genomics analysis of histidine kinase sensor domains

    Science.gov (United States)

    Cheung, Jonah

    2005-11-01

    Histidine kinase sensors are part of a two-component system of protein signaling in prokaryotes and lower eukaryotes that relays an external environmental signal to an adaptive internal cellular response. Signal transduction occurs via phosphotransfer between a sensor protein and a response regulator, which interact in tandem. The sensor is usually a transmembrane protein that contains a conserved cytoplasmic histidine kinase transmitter domain and a modular periplasmic sensor domain. The response regulator is a cytoplasmic protein that contains a receiver domain that interacts with the histidine kinase, and an output domain that interacts with regulators of transcription or chemotaxis. My work focuses on the X-ray structure determination of a variety of bacterial sensor domains, based on a structural genomics analysis of the entire sensor domain family. Structures of the NarX, DcuS, LisK, and DctB sensor domains have been solved to atomic resolution, some in both ligand-bound and ligand-free states. Two distinct structural folds have been revealed: all-alpha helical and mixed alpha-beta. An analysis of the structures reveals a possible mechanism of transmembrane signaling in histidine kinase sensors as a sliding-piston motion between transmembrane helices. Although there is great diversity in ligand binding, there appears to be a small number of distinct sensor domain folds, for which structural representatives of two have been solved. A final synthesis of the structural information with a comprehensive bioinformatics analysis of all histidine kinase sensor domain sequences allows fold prediction for over 400 sensor domains, in a step towards mapping the entire structural landscape of this protein family.

  11. Analysis of Preneoplasia Associated with Progression to Prostatic Cancer

    Science.gov (United States)

    2005-03-01

    genomic DNA array comparative genomic hybridisation (gaCGH). Recurrent chromosome copy number abnormalities (CNAs) were observed in both HPIN and CaP ... Nesrallah LJ, Nesrallah A, Bevilacqua RG, Darini E, Carvalho CM, Meirelles MI, Santana I, Camara-Lopes LH. 2001. Abnormal expression of MDM2 in ... by statistical normalization and ratio cutoffs of 1.5 and 0.5 for increase/decrease in expression. The meiotic recombination (S. cerevisiae) 11 homolog

  12. A comprehensive 1000 Genomes-based genome-wide association meta-analysis of coronary artery disease

    Science.gov (United States)

    Kyriakou, Theodosios; Nelson, Christopher P; Hopewell, Jemma C; Webb, Thomas R; Zeng, Lingyao; Dehghan, Abbas; Alver, Maris; Armasu, Sebastian M; Auro, Kirsi; Bjonnes, Andrew; Chasman, Daniel I; Chen, Shufeng; Ford, Ian; Franceschini, Nora; Gieger, Christian; Grace, Christopher; Gustafsson, Stefan; Huang, Jie; Hwang, Shih-Jen; Kim, Yun Kyoung; Kleber, Marcus E; Lau, King Wai; Lu, Xiangfeng; Lu, Yingchang; Lyytikäinen, Leo-Pekka; Mihailov, Evelin; Morrison, Alanna C; Pervjakova, Natalia; Qu, Liming; Rose, Lynda M; Salfati, Elias; Saxena, Richa; Scholz, Markus; Smith, Albert V; Tikkanen, Emmi; Uitterlinden, Andre; Yang, Xueli; Zhang, Weihua; Zhao, Wei; de Andrade, Mariza; de Vries, Paul S; van Zuydam, Natalie R; Anand, Sonia S; Bertram, Lars; Beutner, Frank; Dedoussis, George; Frossard, Philippe; Gauguier, Dominique; Goodall, Alison H; Gottesman, Omri; Haber, Marc; Han, Bok-Ghee; Huang, Jianfeng; Jalilzadeh, Shapour; Kessler, Thorsten; König, Inke R; Lannfelt, Lars; Lieb, Wolfgang; Lind, Lars; Lindgren, Cecilia M; Lokki, Marja-Liisa; Magnusson, Patrik K; Mallick, Nadeem H; Mehra, Narinder; Meitinger, Thomas; Memon, Fazal-ur-Rehman; Morris, Andrew P; Nieminen, Markku S; Pedersen, Nancy L; Peters, Annette; Rallidis, Loukianos S; Rasheed, Asif; Samuel, Maria; Shah, Svati H; Sinisalo, Juha; Stirrups, Kathleen E; Trompet, Stella; Wang, Laiyuan; Zaman, Khan S; Ardissino, Diego; Boerwinkle, Eric; Borecki, Ingrid B; Bottinger, Erwin P; Buring, Julie E; Chambers, John C; Collins, Rory; Cupples, L Adrienne; Danesh, John; Demuth, Ilja; Elosua, Roberto; Epstein, Stephen E; Esko, Tõnu; Feitosa, Mary F; Franco, Oscar H; Franzosi, Maria Grazia; Granger, Christopher B; Gu, Dongfeng; Gudnason, Vilmundur; Hall, Alistair S; Hamsten, Anders; Harris, Tamara B; Hazen, Stanley L; Hengstenberg, Christian; Hofman, Albert; Ingelsson, Erik; Iribarren, Carlos; Jukema, J Wouter; Karhunen, Pekka J; Kim, Bong-Jo; Kooner, Jaspal S; Kullo, Iftikhar J; Lehtimäki, Terho; Loos, Ruth J F; Melander, Olle; Metspalu, Andres; März, Winfried; Palmer, Colin N; Perola, Markus; Quertermous, Thomas; Rader, Daniel J; Ridker, Paul M; Ripatti, Samuli; Roberts, Robert; Salomaa, Veikko; Sanghera, Dharambir K; Schwartz, Stephen M; Seedorf, Udo; Stewart, Alexandre F; Stott, David J; Thiery, Joachim; Zalloua, Pierre A; O’Donnell, Christopher J; Reilly, Muredach P; Assimes, Themistocles L; Thompson, John R; Erdmann, Jeanette; Clarke, Robert; Watkins, Hugh; Kathiresan, Sekar; McPherson, Ruth; Deloukas, Panos; Schunkert, Heribert; Samani, Nilesh J; Farrall, Martin

    2015-01-01

    Existing knowledge of genetic variants affecting risk of coronary artery disease (CAD) is largely based on genome-wide association studies (GWAS) analysis of common SNPs. Leveraging phased haplotypes from the 1000 Genomes Project, we report a GWAS meta-analysis of 185 thousand CAD cases and controls, interrogating 6.7 million common (MAF > 0.05) as well as 2.7 million low frequency (0.005 < MAF < 0.05) variants ... analysis provides a comprehensive survey of the fine genetic architecture of CAD showing that genetic susceptibility to this common disease is largely determined by common SNPs of small effect size. PMID:26343387

  13. Diffusion tensor analysis of corpus callosum in progressive supranuclear palsy

    Energy Technology Data Exchange (ETDEWEB)

    Ito, Shoichi; Makino, Takahiro; Shirai, Wakako; Hattori, Takamichi [Department of Neurology, Graduate School of Medicine, Chiba University (Japan)

    2008-11-15

    Progressive supranuclear palsy (PSP) is a neurodegenerative disease featuring parkinsonism, supranuclear ophthalmoplegia, dysphagia, and frontal lobe dysfunction. The corpus callosum, which consists of many commissural fibers, probably reflects cerebral cortical function. Several previous reports showed atrophy or diffusion abnormalities of the anterior corpus callosum in PSP patients, but the partitioning method used in these studies was based on data obtained in nonhuman primates. In this study, we performed a diffusion tensor analysis using a new partitioning method for the human corpus callosum. Seven consecutive patients with PSP were compared with 29 age-matched patients with Parkinson's Disease (PD) and 19 age-matched healthy control subjects. All subjects underwent diffusion tensor magnetic resonance imaging, and the corpus callosum was partitioned into five areas on the mid-sagittal plane according to a recently established topography of the human corpus callosum (CC1-prefrontal area, CC2-premotor and supplementary motor area, CC3-motor area, CC4-sensory area, CC5-parietal, temporal, and occipital area). Fractional anisotropy (FA) and apparent diffusion coefficient (ADC) were measured in each area and differences between groups were analyzed. In the PSP group, FA values were significantly decreased in CC1 and CC2, and ADC values were significantly increased in CC1 and CC2. Receiver operating characteristic analysis showed excellent reliability of FA and ADC analyses of CC1 for differentiating PSP from PD. The anterior corpus callosum corresponding to the prefrontal, premotor, and supplementary motor cortices is affected in PSP patients. This analysis can be an additional test for further confirmation of the diagnosis of PSP.

  14. Multi-omic data integration and analysis using systems genomics approaches

    DEFF Research Database (Denmark)

    Suravajhala, Prashanth; Kogelman, Lisette; Kadarmideen, Haja

    2016-01-01

    In the past years, there has been a remarkable development of high-throughput omics (HTO) technologies such as genomics, epigenomics, transcriptomics, proteomics and metabolomics across all facets of biology. This has spearheaded the progress of the systems biology era, including applications on ...

  15. Multi-omic data integration and analysis using systems genomics approaches: methods and applications in animal production, health and welfare.

    Science.gov (United States)

    Suravajhala, Prashanth; Kogelman, Lisette J A; Kadarmideen, Haja N

    2016-04-29

    In recent years, there has been a remarkable development of high-throughput omics (HTO) technologies such as genomics, epigenomics, transcriptomics, proteomics and metabolomics across all facets of biology. This has spearheaded the progress of the systems biology era, including applications to animal production and health traits. However, notwithstanding these new HTO technologies, there remains an emerging challenge in data analysis. On the one hand, different HTO technologies judged on their own merit are appropriate for the identification of disease-causing genes, biomarkers for prevention and drug targets for the treatment of diseases and for individualized genomic predictions of performance or disease risks. On the other hand, integration of multi-omic data and joint modelling and analyses are powerful and accurate means of understanding the systems biology of healthy and sustainable production of animals. We present an overview of current and emerging HTO technologies, each with a focus on their applications in animal and veterinary sciences, before introducing an integrative systems genomics framework for analysing and integrating multi-omic data towards improved animal production, health and welfare. We conclude that there are big challenges in multi-omic data integration, modelling and systems-level analyses, particularly with the fast emerging HTO technologies. We highlight existing and emerging systems genomics approaches and discuss how they contribute to our understanding of the biology of complex traits or diseases and holistic improvement of production performance, disease resistance and welfare.

  16. Novel technologies applied to the nucleotide sequencing and comparative sequence analysis of the genomes of infectious agents in veterinary medicine.

    Science.gov (United States)

    Granberg, F; Bálint, Á; Belák, S

    2016-04-01

    Next-generation sequencing (NGS), also referred to as deep, high-throughput or massively parallel sequencing, is a powerful new tool that can be used for the complex diagnosis and intensive monitoring of infectious disease in veterinary medicine. NGS technologies are also being increasingly used to study the aetiology, genomics, evolution and epidemiology of infectious disease, as well as host-pathogen interactions and other aspects of infection biology. This review briefly summarises recent progress and achievements in this field by first introducing a range of novel techniques and then presenting examples of NGS applications in veterinary infection biology. Various work steps and processes for sampling and sample preparation, sequence analysis and comparative genomics, and improving the accuracy of genomic prediction are discussed, as are bioinformatics requirements. Examples of sequencing-based applications and comparative genomics in veterinary medicine are then provided. This review is based on novel references selected from the literature and on experiences of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine, Uppsala, Sweden.

  17. Organization and comparative analysis of the mitochondrial genomes of bioluminescent Elateroidea (Coleoptera: Polyphaga).

    Science.gov (United States)

    Amaral, Danilo T; Mitani, Yasuo; Ohmiya, Yoshihiro; Viviani, Vadim R

    2016-07-25

    Mitochondrial genome organization in the Elateroidea superfamily (Coleoptera), which includes the main families of bioluminescent beetles, has been poorly studied, and information about the Phengodidae family is lacking. We sequenced the mitochondrial genomes of Neotropical Lampyridae (Bicellonycha lividipennis), Phengodidae (Brasilocerus sp.2 and Phrixothrix hirtus) and Elateridae (Pyrearinus termitilluminans, Hapsodrilus ignifer and Teslasena femoralis). All species had a typical insect mitochondrial genome except for the following: in the elaterid T. femoralis genome there is a non-coding region between NADH2 and tRNA-Trp; in the phengodid Brasilocerus sp.2 and P. hirtus genomes we did not find tRNA-Ile and tRNA-Gln. The P. hirtus genome showed a ~1.6 kb non-coding region, a rearrangement of tRNA-Tyr, a new tRNA-Leu copy, and several regions with higher AT content. Phylogenetic analysis using Bayesian and ML models indicated that the Phengodidae+Rhagophthalmidae are closely related to the Lampyridae family, and included Drilus flavescens (Drilidae) as an internal clade within Elateridae. This is the first report that compares the mitochondrial genome organization of the three main families of bioluminescent Elateroidea, including the first Neotropical Lampyridae and Phengodidae. The losses of tRNAs, and the translocation and duplication events found in Phengodidae mt genomes, mainly in P. hirtus, may indicate different evolutionary rates in these mitochondrial genomes. The mitophylogenomic analysis indicates the monophyly of the three bioluminescent families and a closer relationship between Lampyridae and Phengodidae/Rhagophthalmidae, in contrast with previous molecular analyses.

  18. A Portfolio Analysis Tool for Measuring NASA's Aeronautics Research Progress toward Planned Strategic Outcomes

    Science.gov (United States)

    Tahmasebi, Farhad; Pearce, Robert

    2016-01-01

    A tool for portfolio analysis of NASA's Aeronautics research progress toward planned community strategic Outcomes is described. The strategic planning process for determining the community Outcomes is also briefly described. Stakeholder buy-in, partnership performance, progress of supporting Technical Challenges, and enablement forecast are used as the criteria for evaluating progress toward Outcomes. A few illustrative examples are also presented.

  19. Genome-wide analysis of TCP family in tobacco.

    Science.gov (United States)

    Chen, L; Chen, Y Q; Ding, A M; Chen, H; Xia, F; Wang, W F; Sun, Y H

    2016-05-23

    The TCP family is a transcription factor family whose members are extensively involved in plant growth and development as well as in signal transduction in response to many physiological and biochemical stimuli. In the present study, 61 TCP genes were identified in the tobacco (Nicotiana tabacum) genome. Bioinformatic methods were employed to predict and analyze the gene structure, gene expression, phylogeny, and conserved domains of TCP proteins in tobacco. The 61 NtTCP genes were divided into three diverse groups, based on the division of TCP genes in tomato and Arabidopsis, and the results of the conserved domain and sequence analyses further confirmed the classification of the NtTCP genes. The expression pattern of NtTCP genes also demonstrated that the majority of these genes play important roles in all tissues, while some genes function only in specific tissues. In brief, the comprehensive and thorough study of the TCP family in other plants provides sufficient resources for studying the structure and functions of TCPs in tobacco.

  20. Comparative Analysis of Fatty Acid Desaturases in Cyanobacterial Genomes

    Directory of Open Access Journals (Sweden)

    Xiaoyuan Chi

    2008-01-01

    Full Text Available Fatty acid desaturases are enzymes that introduce double bonds into the hydrocarbon chains of fatty acids. The fatty acid desaturases from 37 cyanobacterial genomes were identified and classified based upon their conserved histidine-rich motifs and phylogenetic analysis, which helps to determine the amounts and distributions of desaturases in cyanobacterial species. The filamentous or N2-fixing cyanobacteria usually possess more types of fatty acid desaturases than unicellular species. The pathway of acyl-lipid desaturation for the unicellular marine cyanobacteria Synechococcus and Prochlorococcus differs from that of other cyanobacteria, indicating that the two genera have phylogenetic histories different from those of other cyanobacteria isolated from freshwater, soil, or symbionts. Strain Gloeobacter violaceus PCC 7421 was isolated from calcareous rock and lacks thylakoid membranes. The types and amounts of desaturases of this strain are distinct from those of other cyanobacteria, reflecting its earliest divergence from the cyanobacterial line. Three thermophilic unicellular strains, Thermosynechococcus elongatus BP-1 and two Synechococcus Yellowstone species, lack highly unsaturated fatty acids in lipids and contain only one Δ9 desaturase, in contrast with mesophilic strains, which is probably due to their thermal habitats. Thus, the amounts and types of fatty acid desaturases vary among different cyanobacterial species, which may result from adaptation to environments during evolution.

  1. The genome sequence of Blochmannia floridanus: Comparative analysis of reduced genomes

    NARCIS (Netherlands)

    Gil, R.; Silva, F.J.; Zientz, E.; Delmotte, F.; Gonzalez-Candelas, F.; Latorre, A.; Rausell, C.; Kamerbeek, J.; Gadau, J.; Hölldobler, B.; Ham, van R.C.H.J.; Gross, R.; Moya, A.

    2003-01-01

    Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely

  2. Meta-analysis of genome-wide association from genomic prediction models

    Science.gov (United States)

    A limitation of many genome-wide association studies (GWA) in animal breeding is that there are many loci with small effect sizes; thus, larger sample sizes (N) are required to guarantee suitable power of detection. To increase sample size, results from different GWA can be combined in a meta-analys...
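
    The idea of combining results from several GWA studies to gain power can be sketched with a fixed-effect, inverse-variance weighted meta-analysis of one marker. This is a common textbook choice and an assumption here, since the truncated abstract does not say which method the authors use; the per-study effect estimates and standard errors below are made up for illustration.

```python
import math


def fixed_effect_meta(estimates):
    """Pool per-study (beta, standard_error) pairs with inverse-variance weights.

    Returns the pooled effect, its standard error, and the z statistic.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]      # precision of each study
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)                           # pooled SE shrinks as studies are added
    return beta, se, beta / se


# Hypothetical estimates for one SNP from three independent studies.
studies = [(0.10, 0.05), (0.08, 0.04), (0.12, 0.06)]
beta, se, z = fixed_effect_meta(studies)
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}")
```

    Each toy study alone has z = 2, while the pooled z is larger, which is the gain in detection power that motivates meta-analysis of loci with small effects.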

  3. caBIG™ VISDA: Modeling, visualization, and discovery for cluster analysis of genomic data

    Directory of Open Access Journals (Sweden)

    Xuan Jianhua

    2008-09-01

    Full Text Available Abstract Background: The main limitations of most existing clustering methods used in genomic data analysis include heuristic or random algorithm initialization, the potential of finding poor local optima, the lack of cluster number detection, an inability to incorporate prior/expert knowledge, black-box and non-adaptive designs, in addition to the curse of dimensionality and the discernment of uninformative, uninteresting cluster structure associated with confounding variables. Results: In an effort to partially address these limitations, we develop the VIsual Statistical Data Analyzer (VISDA) for cluster modeling, visualization, and discovery in genomic data. VISDA performs progressive, coarse-to-fine (divisive) hierarchical clustering and visualization, supported by hierarchical mixture modeling, supervised/unsupervised informative gene selection, supervised/unsupervised data visualization, and user/prior knowledge guidance, to discover hidden clusters within complex, high-dimensional genomic data. The hierarchical visualization and clustering scheme of VISDA uses multiple local visualization subspaces (one at each node of the hierarchy) and consequent subspace data modeling to reveal both global and local cluster structures in a "divide and conquer" scenario. Multiple projection methods, each sensitive to a distinct type of clustering tendency, are used for data visualization, which increases the likelihood that cluster structures of interest are revealed. Initialization of the full dimensional model is based on first learning models with user/prior knowledge guidance on data projected into the low-dimensional visualization spaces. Model order selection for the high dimensional data is accomplished by Bayesian theoretic criteria and user justification applied via the hierarchy of low-dimensional visualization subspaces. Based on its complementary building blocks and flexible functionality, VISDA is generally applicable for gene clustering, sample

  4. caBIG VISDA: modeling, visualization, and discovery for cluster analysis of genomic data.

    Science.gov (United States)

    Zhu, Yitan; Li, Huai; Miller, David J; Wang, Zuyi; Xuan, Jianhua; Clarke, Robert; Hoffman, Eric P; Wang, Yue

    2008-09-18

    The main limitations of most existing clustering methods used in genomic data analysis include heuristic or random algorithm initialization, the potential of finding poor local optima, the lack of cluster number detection, an inability to incorporate prior/expert knowledge, black-box and non-adaptive designs, in addition to the curse of dimensionality and the discernment of uninformative, uninteresting cluster structure associated with confounding variables. In an effort to partially address these limitations, we develop the VIsual Statistical Data Analyzer (VISDA) for cluster modeling, visualization, and discovery in genomic data. VISDA performs progressive, coarse-to-fine (divisive) hierarchical clustering and visualization, supported by hierarchical mixture modeling, supervised/unsupervised informative gene selection, supervised/unsupervised data visualization, and user/prior knowledge guidance, to discover hidden clusters within complex, high-dimensional genomic data. The hierarchical visualization and clustering scheme of VISDA uses multiple local visualization subspaces (one at each node of the hierarchy) and consequent subspace data modeling to reveal both global and local cluster structures in a "divide and conquer" scenario. Multiple projection methods, each sensitive to a distinct type of clustering tendency, are used for data visualization, which increases the likelihood that cluster structures of interest are revealed. Initialization of the full dimensional model is based on first learning models with user/prior knowledge guidance on data projected into the low-dimensional visualization spaces. Model order selection for the high dimensional data is accomplished by Bayesian theoretic criteria and user justification applied via the hierarchy of low-dimensional visualization subspaces. Based on its complementary building blocks and flexible functionality, VISDA is generally applicable for gene clustering, sample clustering, and phenotype clustering
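
    The coarse-to-fine (divisive) strategy described above can be illustrated with a deliberately simplified sketch. It is not VISDA: it omits mixture modeling, projections, and user guidance, and simply splits one-dimensional values top-down by minimizing the within-cluster sum of squares. The function names and the expression values are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(values):
    """Split sorted 1-D values into two contiguous groups with minimal total SSE."""
    ordered = sorted(values)
    cut = min(range(1, len(ordered)),
              key=lambda i: sse(ordered[:i]) + sse(ordered[i:]))
    return ordered[:cut], ordered[cut:]


def divisive_clusters(values, k):
    """Top-down clustering: keep splitting the cluster with the largest SSE."""
    clusters = [sorted(values)]
    while len(clusters) < k:
        target = max(clusters, key=sse)
        if len(target) < 2:        # nothing left that can be split
            break
        clusters.remove(target)
        clusters.extend(best_split(target))
    return sorted(clusters)


# Hypothetical expression values with three visible groups.
expression = [0.1, 0.2, 0.15, 5.0, 5.2, 9.8, 10.1]
clusters = divisive_clusters(expression, 3)
print(clusters)
```

    The first split separates the low values from the rest and the second refines the remaining group, mirroring the "divide and conquer" ordering in which global structure is resolved before local structure.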

  5. Genomic analysis of the basal lineage fungus Rhizopus oryzae reveals a whole-genome duplication.

    Directory of Open Access Journals (Sweden)

    Li-Jun Ma

    2009-07-01

    Full Text Available Rhizopus oryzae is the primary cause of mucormycosis, an emerging, life-threatening infection characterized by rapid angioinvasive growth with an overall mortality rate that exceeds 50%. As a representative of the paraphyletic basal group of the fungal kingdom called "zygomycetes," R. oryzae is also used as a model to study fungal evolution. Here we report the genome sequence of R. oryzae strain 99-880, isolated from a fatal case of mucormycosis. The highly repetitive 45.3 Mb genome assembly contains abundant transposable elements (TEs), comprising approximately 20% of the genome. We predicted 13,895 protein-coding genes not overlapping TEs, many of which are paralogous gene pairs. The order and genomic arrangement of the duplicated gene pairs and their common phylogenetic origin provide evidence for an ancestral whole-genome duplication (WGD) event. The WGD resulted in the duplication of nearly all subunits of the protein complexes associated with respiratory electron transport chains, the V-ATPase, and the ubiquitin-proteasome systems. The WGD, together with recent gene duplications, resulted in the expansion of multiple gene families related to cell growth and signal transduction, as well as secreted aspartic protease and subtilase protein families, which are known fungal virulence factors. The duplication of the ergosterol biosynthetic pathway, especially the major azole target, lanosterol 14alpha-demethylase (ERG11), could contribute to the variable responses of R. oryzae to different azole drugs, including voriconazole and posaconazole. Expanded families of cell-wall synthesis enzymes, essential for fungal cell integrity but absent in mammalian hosts, reveal potential targets for novel and R. oryzae-specific diagnostic and therapeutic treatments.

  6. Learning Progressions and Teaching Sequences: A Review and Analysis

    Science.gov (United States)

    Duschl, Richard; Maeng, Seungho; Sezen, Asli

    2011-01-01

    Our paper is an analytical review of the design, development and reporting of learning progressions and teaching sequences. Research questions are: (1) what criteria are being used to propose a "hypothetical learning progression/trajectory" and (2) what measurements/evidence are being used to empirically define and refine a "hypothetical learning…

  7. The tumor suppressor SirT2 regulates cell cycle progression and genome stability by modulating the mitotic deposition of H4K20 methylation

    Science.gov (United States)

    The establishment of the epigenetic mark H4K20me1 (monomethylation of H4K20) by PR-Set7 during G2/M directly impacts S-phase progression and genome stability. However, the mechanisms involved in the regulation of this event are not well understood. Here we show that SirT2 regulates H4K20me1 depositi...

  8. Connecting Genomic Alterations to Cancer Biology with Proteomics: The NCI Clinical Proteomic Tumor Analysis Consortium

    Energy Technology Data Exchange (ETDEWEB)

    Ellis, Matthew; Gillette, Michael; Carr, Steven A.; Paulovich, Amanda G.; Smith, Richard D.; Rodland, Karin D.; Townsend, Reid; Kinsinger, Christopher; Mesri, Mehdi; Rodriguez, Henry; Liebler, Daniel

    2013-10-03

    The National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium is applying the latest generation of proteomic technologies to genomically annotated tumors from The Cancer Genome Atlas (TCGA) program, a joint initiative of the NCI and the National Human Genome Research Institute. By providing a fully integrated accounting of DNA, RNA, and protein abnormalities in individual tumors, these datasets will illuminate the complex relationship between genomic abnormalities and cancer phenotypes, thus producing biologic insights as well as a wave of novel candidate biomarkers and therapeutic targets amenable to verification using targeted mass spectrometry methods.

  9. Genome-wide Association Analysis of Kernel Weight in Hard Winter Wheat

    Science.gov (United States)

    Wheat kernel weight is an important and heritable component of wheat grain yield and a key predictor of flour extraction. Genome-wide association analysis was conducted to identify genomic regions associated with kernel weight and kernel weight environmental response in 8 trials of 299 hard winter ...

  10. Meta-Analysis of Genome-Wide Association Studies of Attention-Deficit/Hyperactivity Disorder

    Science.gov (United States)

    Neale, Benjamin M.; Medland, Sarah E.; Ripke, Stephan; Asherson, Philip; Franke, Barbara; Lesch, Klaus-Peter; Faraone, Stephen V.; Nguyen, Thuy Trang; Schafer, Helmut; Holmans, Peter; Daly, Mark; Steinhausen, Hans-Christoph; Freitag, Christine; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Walitza, Susanne; Warnke, Andreas; Meyer, Jobst; Palmason, Haukur; Buitelaar, Jan; Vasquez, Alejandro Arias; Lambregts-Rommelse, Nanda; Gill, Michael; Anney, Richard J. L.; Langely, Kate; O'Donovan, Michael; Williams, Nigel; Owen, Michael; Thapar, Anita; Kent, Lindsey; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph; Doyle, Alysa; Smalley, Susan; Loo, Sandra; Hakonarson, Hakon; Elia, Josephine; Todorov, Alexandre; Miranda, Ana; Mulas, Fernando; Ebstein, Richard P.; Rothenberger, Aribert; Banaschewski, Tobias; Oades, Robert D.; Sonuga-Barke, Edmund; McGough, James; Nisenbaum, Laura; Middleton, Frank; Hu, Xiaolan; Nelson, Stan

    2010-01-01

    Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. As prior genome-wide association studies (GWAS) have not yielded significant results, we conducted a meta-analysis of…

  11. BGI-RIS: an integrated information resource and comparative analysis workbench for rice genomics

    DEFF Research Database (Denmark)

    Zhao, Wenming; Wang, Jing; He, Ximiao

    2004-01-01

    the application of the rice genomic information and to provide a foundation for functional and evolutionary studies of other important cereal crops, we implemented our Rice Information System (BGI-RIS), the most up-to-date integrated information resource as well as a workbench for comparative genomic analysis...

  12. Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index

    DEFF Research Database (Denmark)

    Minster, Ryan L; Sanders, Jason L; Singh, Jatinder;

    2015-01-01

    BACKGROUND: The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. METHODS: We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted...

  13. Genomic analysis of a nontoxigenic, invasive Corynebacterium diphtheriae strain from Brazil

    Directory of Open Access Journals (Sweden)

    Fernando Encinas

    2015-09-01

    Full Text Available We report the complete genome sequence and analysis of an invasive Corynebacterium diphtheriae strain that caused endocarditis in Rio de Janeiro, Brazil. It was selected for sequencing on the basis of the current relevance of nontoxigenic strains for public health. The genomic information was explored in the context of diversity, plasticity and genetic relatedness with other contemporary strains.

  14. Genomic analysis of a nontoxigenic, invasive Corynebacterium diphtheriae strain from Brazil.

    Science.gov (United States)

    Encinas, Fernando; Marin, Michel A; Ramos, Juliana N; Vieira, Verônica V; Mattos-Guaraldi, Ana Luiza; Vicente, Ana Carolina P

    2015-09-01

    We report the complete genome sequence and analysis of an invasive Corynebacterium diphtheriae strain that caused endocarditis in Rio de Janeiro, Brazil. It was selected for sequencing on the basis of the current relevance of nontoxigenic strains for public health. The genomic information was explored in the context of diversity, plasticity and genetic relatedness with other contemporary strains.

  15. Meta-Analysis of Genome-Wide Association Studies of Attention-Deficit/Hyperactivity Disorder

    Science.gov (United States)

    Neale, Benjamin M.; Medland, Sarah E.; Ripke, Stephan; Asherson, Philip; Franke, Barbara; Lesch, Klaus-Peter; Faraone, Stephen V.; Nguyen, Thuy Trang; Schafer, Helmut; Holmans, Peter; Daly, Mark; Steinhausen, Hans-Christoph; Freitag, Christine; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Walitza, Susanne; Warnke, Andreas; Meyer, Jobst; Palmason, Haukur; Buitelaar, Jan; Vasquez, Alejandro Arias; Lambregts-Rommelse, Nanda; Gill, Michael; Anney, Richard J. L.; Langely, Kate; O'Donovan, Michael; Williams, Nigel; Owen, Michael; Thapar, Anita; Kent, Lindsey; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph; Doyle, Alysa; Smalley, Susan; Loo, Sandra; Hakonarson, Hakon; Elia, Josephine; Todorov, Alexandre; Miranda, Ana; Mulas, Fernando; Ebstein, Richard P.; Rothenberger, Aribert; Banaschewski, Tobias; Oades, Robert D.; Sonuga-Barke, Edmund; McGough, James; Nisenbaum, Laura; Middleton, Frank; Hu, Xiaolan; Nelson, Stan

    2010-01-01

    Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. As prior genome-wide association studies (GWAS) have not yielded significant results, we conducted a meta-analysis of…

  16. Dissection of genomic correlation matrices of US Holsteins using multivariate factor analysis

    Science.gov (United States)

    Aim of the study was to compare correlation matrices between direct genomic predictions for 31 production, fitness and conformation traits both at genomic and chromosomal level in US Holstein bulls. Multivariate factor analysis was used to quantify basic features of correlation matrices. Factor extr...

  17. Carotenoid biosynthetic genes in Brassica rapa: comparative genomic analysis, phylogenetic analysis, and expression profiling

    OpenAIRE

    Li, Peirong; Zhang, Shujiang; Zhang, Shifan; Li, Fei; Zhang, Hui; Cheng, Feng; Wu, Jian; Wang, Xiaowu; Sun, Rifei

    2015-01-01

    Background Carotenoids are isoprenoid compounds synthesized by all photosynthetic organisms. Despite much research on carotenoid biosynthesis in the model plant Arabidopsis thaliana, there is a lack of information on the carotenoid pathway in Brassica rapa. To better understand its carotenoid biosynthetic pathway, we performed a systematic analysis of carotenoid biosynthetic genes at the genome level in B. rapa. Results We identified 67 carotenoid biosynthetic genes in B. rapa, which were ort...

  18. Comparative genomics in chicken and Pekin duck using FISH mapping and microarray analysis

    Directory of Open Access Journals (Sweden)

    Fowler Katie E

    2009-08-01

    Full Text Available Abstract Background The availability of the complete chicken (Gallus gallus) genome sequence as well as a large number of chicken probes for fluorescent in-situ hybridization (FISH) and microarray resources facilitate comparative genomic studies between chicken and other bird species. In a previous study, we provided a comprehensive cytogenetic map for the turkey (Meleagris gallopavo) and the first analysis of copy number variants (CNVs) in birds. Here, we extend this approach to the Pekin duck (Anas platyrhynchos), an obvious target for comparative genomic studies due to its agricultural importance and resistance to avian flu. Results We provide a detailed molecular cytogenetic map of the duck genome through FISH assignment of 155 chicken clones. We identified one inter- and six intrachromosomal rearrangements between chicken and duck macrochromosomes and demonstrated conserved synteny among all microchromosomes analysed. Array comparative genomic hybridisation revealed 32 CNVs, of which 5 overlap previously designated "hotspot" regions between chicken and turkey. Conclusion Our results suggest extensive conservation of avian genomes across 90 million years of evolution in both macro- and microchromosomes. The data on CNVs between chicken and duck extends previous analyses in chicken and turkey and supports the hypotheses that avian genomes contain fewer CNVs than mammalian genomes and that genomes of evolutionarily distant species share regions of copy number variation ("CNV hotspots"). Our results will expedite duck genomics, assist marker development and highlight areas of interest for future evolutionary and functional studies.

  19. Genome-wide analysis of intraspecific DNA polymorphism in 'Micro-Tom', a model cultivar of tomato (Solanum lycopersicum).

    Science.gov (United States)

    Kobayashi, Masaaki; Nagasaki, Hideki; Garcia, Virginie; Just, Daniel; Bres, Cécile; Mauxion, Jean-Philippe; Le Paslier, Marie-Christine; Brunel, Dominique; Suda, Kunihiro; Minakuchi, Yohei; Toyoda, Atsushi; Fujiyama, Asao; Toyoshima, Hiromi; Suzuki, Takayuki; Igarashi, Kaori; Rothan, Christophe; Kaminuma, Eli; Nakamura, Yasukazu; Yano, Kentaro; Aoki, Koh

    2014-02-01

    Tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. The genome sequencing of the tomato cultivar 'Heinz 1706' was recently completed. To accelerate the progress of tomato genomics studies, systematic bioresources, such as mutagenized lines and full-length cDNA libraries, have been established for the cultivar 'Micro-Tom'. However, these resources cannot be utilized to their full potential without the completion of the genome sequencing of 'Micro-Tom'. We undertook the genome sequencing of 'Micro-Tom' and here report the identification of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) between 'Micro-Tom' and 'Heinz 1706'. The analysis demonstrated the presence of 1.23 million SNPs and 0.19 million indels between the two cultivars. The density of SNPs and indels was high in chromosomes 2, 5 and 11, but was low in chromosomes 6, 8 and 10. Three known mutations of 'Micro-Tom' were localized on chromosomal regions where the density of SNPs and indels was low, which was consistent with the fact that these mutations were relatively new and introgressed into 'Micro-Tom' during the breeding of this cultivar. We also report SNP analysis for two 'Micro-Tom' varieties that have been maintained independently in Japan and France, both of which have served as standard lines for 'Micro-Tom' mutant collections. Approximately 28,000 SNPs were identified between these two 'Micro-Tom' lines. These results provide high-resolution DNA polymorphic information on 'Micro-Tom' and represent a valuable contribution to the 'Micro-Tom'-based genomics resources.

  20. Molecular Heterogeneity in Primary Breast Carcinomas and Axillary Lymph Node Metastases Assessed by Genomic Fingerprinting Analysis

    Science.gov (United States)

    Ellsworth, Rachel E; Toro, Allyson L; Blackburn, Heather L; Decewicz, Alisha; Deyarmin, Brenda; Mamula, Kimberly A; Costantino, Nicholas S; Hooke, Jeffrey A; Shriver, Craig D; Ellsworth, Darrell L

    2015-01-01

    Molecular heterogeneity within primary breast carcinomas and among axillary lymph node (LN) metastases may impact diagnosis and confound treatment. In this study, we used short tandem repeated sequences to assess genomic heterogeneity and to determine hereditary relationships among primary tumor areas and regional metastases from 30 breast cancer patients. We found that primary carcinomas were genetically heterogeneous and sampling multiple areas was necessary to adequately assess genomic variability. LN metastases appeared to originate at different time periods during disease progression from different sites of the primary tumor and the extent of genomic divergence among regional metastases was associated with a less favorable patient outcome (P = 0.009). In conclusion, metastasis is a complex process influenced by primary tumor heterogeneity and variability in the timing of dissemination. Genomic variation in primary breast tumors and regional metastases may negatively impact clinical diagnostics and contribute to therapeutic resistance. PMID:26279627

  1. Comparative analysis of catfish BAC end sequences with the zebrafish genome

    Directory of Open Access Journals (Sweden)

    Abernathy Jason

    2009-12-01

    Full Text Available Abstract Background Comparative mapping is a powerful tool to transfer genomic information from sequenced genomes to closely related species for which whole genome sequence data are not yet available. However, such an approach is still very limited in catfish, the most important aquaculture species in the United States. This project was initiated to generate additional BAC end sequences and demonstrate their applications in comparative mapping in catfish. Results We reported the generation of 43,000 BAC end sequences and their applications for comparative genome analysis in catfish. Using these and the additional 20,000 existing BAC end sequences as a resource, along with linkage mapping and the existing physical map, conserved syntenic regions were identified between the catfish and zebrafish genomes. A total of 10,943 catfish BAC end sequences (17.3%) had significant BLAST hits to the zebrafish genome (cutoff value ≤ e-5), of which 3,221 were unique gene hits, providing a platform for comparative mapping based on locations of these genes in catfish and zebrafish. Genetic linkage mapping of microsatellites associated with contigs allowed identification of large conserved genomic segments and construction of super scaffolds. Conclusion BAC end sequences and their associated polymorphic markers are great resources for comparative genome analysis in catfish. Highly conserved chromosomal regions were identified between catfish and zebrafish. However, it appears that the level of conservation at local genomic regions is high, while a high level of chromosomal shuffling and rearrangement exists between the catfish and zebrafish genomes. Orthologous regions established through comparative analysis should facilitate both structural and functional genome analysis in catfish.

  2. Construction of a genome-wide human BAC-Unigene resource. Final progress report, 1989--1996

    Energy Technology Data Exchange (ETDEWEB)

    Lim, C.S.; Xu, R.X.; Wang, M. [and others]

    1996-12-31

    Currently, over 30,000 mapped STSs and 27,000 mapped Unigenes (non-redundant unigene sets of cDNA representing EST clusters) are available for human alone. A total of 44,000 Unigene cDNA clones have been supplied by Research Genetics. Unigenes, or cDNAs, are an excellent resource for map building for two reasons. Firstly, they exist in two alternative forms -- as sequence information for PCR primer pairs and as cDNA clones -- thus making library screening by colony hybridization as well as pooled library PCR possible. The authors have developed an efficient and robust procedure to screen genomic libraries with a large number of DNA probes. Secondly, the linkage and order of expressed sequences, or genes, are highly conserved among human, mouse and other mammalian species. Therefore, mapping with cDNA markers rather than random anonymous STSs will greatly facilitate comparative and evolutionary studies as well as physical map building. They have currently deconvoluted over 10,000 Unigene probes against 4X-coverage human BAC clones from the approved library D by the high-density colony hybridization method. 10,000 batches of Unigenes are arrayed in an imaginary 100 x 100 matrix from which 100 row pools and 100 column pools are obtained. Library filters are hybridized with pooled probes, thus reducing the number of hybridizations required for addressing the positives for each Unigene from 10,000 to 200. Details on the experimental scheme as well as a daily progress report are posted on the Web site (http://www.tree.caltech.edu).
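
    The row and column pooling arithmetic in this abstract can be illustrated with a short sketch. It is not the authors' procedure: the probe numbering (0 to 9,999), the row-major placement in the matrix, and all function names are assumptions made only for illustration.

```python
# Hypothetical sketch of row/column probe pooling; layout and names are assumed.
GRID = 100  # a 100 x 100 matrix holds 10,000 probes


def probe_position(probe_id):
    """Map a probe index (0..9999) to its (row, column) in the imaginary matrix."""
    return divmod(probe_id, GRID)


def pools_hit(positive_probes):
    """Row pools and column pools that would give a signal on one clone."""
    rows = {probe_position(p)[0] for p in positive_probes}
    cols = {probe_position(p)[1] for p in positive_probes}
    return rows, cols


def candidate_probes(rows, cols):
    """Probes sitting at the intersections of positive row and column pools."""
    return sorted(r * GRID + c for r in rows for c in cols)


# 100 row pools + 100 column pools, instead of one hybridization per probe.
hybridizations = 2 * GRID

# A clone matched by a single probe is resolved unambiguously.
rows, cols = pools_hit({4217})
print(candidate_probes(rows, cols))
```

    Under these assumptions, a clone matched by two probes in different rows and columns yields four candidate intersections, so such clones would need a follow-up check to separate true positives from false intersections.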

  3. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  4. Comparative bacterial proteomics: analysis of the core genome concept.

    Directory of Open Access Journals (Sweden)

    Stephen J Callister

    Full Text Available While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  5. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    Energy Technology Data Exchange (ETDEWEB)

    Callister, Stephen J.; McCue, Lee Ann; Turse, Josh E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-02-06

    Comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry. Experimental validation of the existence of this core genome requires extensive measurement and is not typically undertaken. Enabled by an extensive proteome database development over a six year period, we experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. While genomic studies establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  6. Comparative genomics and phylogenetic analysis of S. dysenteriae subgroup

    Institute of Scientific and Technical Information of China (English)

    YANG; E; BIN; Wen; PENG; Junping; ZHANG; Xiaobing; WANG; Ji

    2005-01-01

    Genomic compositions of representatives of thirteen S. dysenteriae serotypes were investigated by performing comparative genomic hybridization (CGH) with a microarray containing all genomic open reading frames (ORFs) of E. coli K12 strain MG1655 and specific ORFs of S. dysenteriae A1 strain Sd51197. The CGH results indicated that the genomes of the serotypes contain 2654 conserved ORFs originating from E. coli. However, 219 intrinsic genes of E. coli, including prophage genes, molecular chaperones, and genes for synthesis of specific O antigen, were absent. Moreover, some specific genes, such as type II secretion system associated components and iron transport related genes, were acquired through horizontal transfer. Phylogenetic trees based on genetic composition demonstrated that A1, A2, A8, and A10 were distinct from the other S. dysenteriae serotypes. Our results in this report may provide new insights into the physiological processes, pathogenicity and evolution of S. dysenteriae.

  7. Integrated proteomic and genomic analysis of colorectal cancer

    Science.gov (United States)

    Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro

  8. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    Science.gov (United States)

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.
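
    The phylogenetic profile idea described in this record can be sketched in a few lines. The genome names, protein names, and presence data below are invented, and treating identical profiles as the linking criterion is an assumption for illustration, not a statement of the patented method.

```python
# Illustrative sketch only: genomes, proteins, and presence data are made up.
GENOMES = ["genome_1", "genome_2", "genome_3", "genome_4"]

# For each protein, the genomes in which a homolog was found (assumed input).
presence = {
    "protA": {"genome_1", "genome_3"},
    "protB": {"genome_1", "genome_3"},
    "protC": {"genome_2", "genome_3", "genome_4"},
}


def profile(protein):
    """Presence/absence vector: 1 if a homolog exists in the genome, else 0."""
    return tuple(int(g in presence[protein]) for g in GENOMES)


def hamming(p, q):
    """Number of genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))


def linked_pairs(max_distance=0):
    """Protein pairs whose profiles differ in at most max_distance genomes."""
    names = sorted(presence)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if hamming(profile(a), profile(b)) <= max_distance
    ]


print(profile("protA"))
print(linked_pairs())
```

    Raising `max_distance` (a hypothetical parameter) would tolerate a few mismatches between profiles, at the cost of more spurious pairings.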

  9. Determining protein function and interaction from genome analysis

    Science.gov (United States)

    Eisenberg, David; Marcotte, Edward M.; Thompson, Michael J.; Pellegrini, Matteo; Yeates, Todd O.

    2004-08-03

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.
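
    The Rosetta Stone observation in this record (two separate proteins with homologs fused into one chain in another organism) can be sketched as follows. The input format, the protein names, and the requirement that the two aligned regions must not overlap are assumptions made for this example only.

```python
# Illustrative sketch: hit tuples and the non-overlap rule are assumptions.
def rosetta_links(hits):
    """Find protein pairs that align to separate regions of one target protein.

    hits: list of (query, target, start, end), where start/end give the
    aligned interval on the target sequence.
    """
    by_target = {}
    for query, target, start, end in hits:
        by_target.setdefault(target, []).append((query, start, end))

    links = set()
    for target, regions in by_target.items():
        for i, (q1, s1, e1) in enumerate(regions):
            for q2, s2, e2 in regions[i + 1:]:
                # Two different queries covering disjoint parts of the target
                # suggest the target is a fusion of their homologs.
                if q1 != q2 and (e1 < s2 or e2 < s1):
                    links.add((min(q1, q2), max(q1, q2), target))
    return sorted(links)


example_hits = [
    ("A_prime", "AB", 1, 150),
    ("B_prime", "AB", 160, 320),
    ("C_prime", "XY", 1, 100),
    ("D_prime", "XY", 50, 200),  # overlaps C_prime, so no link is reported
]
print(rosetta_links(example_hits))
```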

  10. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    Science.gov (United States)

    Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-01-01

    While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490

  11. An Alternative Methodological Approach for Cost-Effectiveness Analysis and Decision Making in Genomic Medicine.

    Science.gov (United States)

    Fragoulakis, Vasilios; Mitropoulou, Christina; van Schaik, Ron H; Maniadakis, Nikolaos; Patrinos, George P

    2016-05-01

    Genomic Medicine aims to improve therapeutic interventions and diagnostics, the quality of life of patients, but also to rationalize healthcare costs. To reach this goal, careful assessment and identification of evidence gaps for public health genomics priorities are required so that a more efficient healthcare environment is created. Here, we propose a public health genomics-driven approach to adjust the classical healthcare decision making process with an alternative methodological approach of cost-effectiveness analysis, which is particularly helpful for genomic medicine interventions. By combining classical cost-effectiveness analysis with budget constraints, social preferences, and patient ethics, we demonstrate the application of this model, the Genome Economics Model (GEM), based on a previously reported genome-guided intervention from a developing country environment. The model and the attendant rationale provide a practical guide by which all major healthcare stakeholders could ensure the sustainability of funding for genome-guided interventions, their adoption and coverage by health insurance funds, and prioritization of Genomic Medicine research, development, and innovation, given the restriction of budgets, particularly in developing countries and low-income healthcare settings in developed countries. The implications of the GEM for the policy makers interested in Genomic Medicine and new health technology and innovation assessment are also discussed.

  12. CGUG: in silico proteome and genome parsing tool for the determination of "core" and unique genes in the analysis of genomes up to ca. 1.9 Mb

    Directory of Open Access Journals (Sweden)

    Mahadevan Padmanabhan

    2009-08-01

    Full Text Available Abstract Background Viruses and small-genome bacteria (~2 megabases and smaller) comprise a considerable population in the biosphere and are of interest to many researchers. These genomes are now sequenced at an unprecedented rate and require complementary computational tools to analyze. "CoreGenesUniqueGenes" (CGUG) is an in silico genome data mining tool that determines a "core" set of genes from two to five organisms with genomes in this size range. Core and unique genes may reflect similar niches and needs, and may be used in classifying organisms. Findings CGUG is available at http://binf.gmu.edu/geneorder.html as a web-based on-the-fly tool that performs iterative BLASTP analyses using a reference genome and up to four query genomes to provide a table of genes common to these genomes. The result is an in silico display of genomes and their proteomes, allowing for further analysis. CGUG can be used for "genome annotation by homology", as demonstrated with Chlamydophila and Francisella genomes. Conclusion CGUG is used to reanalyze the ICTV-based classifications of bacteriophages, to reconfirm long-standing relationships and to explore new classifications. These genomes have been problematic in the past, due largely to horizontal gene transfers. CGUG is validated as a tool for reannotating small genome bacteria using more up-to-date annotations by similarity or homology. These serve as an entry point for wet-bench experiments to confirm the functions of these "hypothetical" and "unknown" proteins.
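
    The notion of "core" and unique genes relative to a reference genome can be illustrated with plain set operations. This is not the CGUG implementation: it assumes the BLASTP comparisons have already been run and reduced to, for each query genome, the set of reference genes with a significant hit, and every name below is hypothetical.

```python
# Illustrative sketch: inputs are assumed to be precomputed BLASTP results.
def core_and_unique(reference_genes, hits_by_query):
    """Split reference genes into core and unique sets.

    hits_by_query: {query_genome: set of reference genes with a hit there}
    core   = reference genes with a hit in every query genome
    unique = reference genes with no hit in any query genome
    """
    core = set(reference_genes)
    seen_anywhere = set()
    for matched in hits_by_query.values():
        core &= matched
        seen_anywhere |= matched
    unique = set(reference_genes) - seen_anywhere
    return sorted(core), sorted(unique)


reference = ["g1", "g2", "g3", "g4"]
hits = {
    "query_1": {"g1", "g2"},
    "query_2": {"g1", "g3"},
}
core, unique = core_and_unique(reference, hits)
print(core, unique)
```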

  13. First fungal genome sequence from Africa: A preliminary analysis

    Directory of Open Access Journals (Sweden)

    Rene Sutherland

    2012-01-01

    Full Text Available Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the 'tsunami' of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is fewer than that of the related species, Fusarium oxysporum, but more than that for Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumosin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists

  14. Network Based Prediction Model for Genomics Data Analysis*

    OpenAIRE

    Huang, Ying; Wang, Pei

    2012-01-01

    Biological networks, such as genetic regulatory networks and protein interaction networks, provide important information for studying gene/protein activities. In this paper, we propose a new method, NetBoosting, for incorporating a priori biological network information in analyzing high dimensional genomics data. Specially, we are interested in constructing prediction models for disease phenotypes of interest based on genomics data, and at the same time identifying disease susceptible genes. ...

  15. Genome analysis of E. coli isolated from Crohn's disease patients.

    Science.gov (United States)

    Rakitina, Daria V; Manolov, Alexander I; Kanygina, Alexandra V; Garushyants, Sofya K; Baikova, Julia P; Alexeev, Dmitry G; Ladygina, Valentina G; Kostryukova, Elena S; Larin, Andrei K; Semashko, Tatiana A; Karpova, Irina Y; Babenko, Vladislav V; Ismagilova, Ruzilya K; Malanin, Sergei Y; Gelfand, Mikhail S; Ilina, Elena N; Gorodnichev, Roman B; Lisitsyna, Eugenia S; Aleshkin, Gennady I; Scherbakov, Petr L; Khalif, Igor L; Shapina, Marina V; Maev, Igor V; Andreev, Dmitry N; Govorun, Vadim M

    2017-07-19

    Escherichia coli (E. coli) has been increasingly implicated in the pathogenesis of Crohn's disease (CD). The phylogeny of E. coli isolated from Crohn's disease patients (CDEC) was controversial, and while genotyping results suggested heterogeneity, the sequenced strains of E. coli from CD patients were closely related. We performed the shotgun genome sequencing of 28 E. coli isolates from ten CD patients and compared genomes from these isolates with already published genomes of CD strains and other pathogenic and non-pathogenic strains. CDEC was shown to belong to A, B1, B2 and D phylogenetic groups. The plasmid and several operons from the reference CD-associated E. coli strain LF82 were demonstrated to be more often present in CDEC genomes belonging to different phylogenetic groups than in genomes of commensal strains. The operons include carbon-source induced invasion GimA island, prophage I, iron uptake operons I and II, capsular assembly pathogenetic island IV and propanediol and galactitol utilization operons. Our findings suggest that CDEC are phylogenetically diverse. However, some strains isolated from independent sources possess highly similar chromosome or plasmids. Though no CD-specific genes or functional domains were present in all CD-associated strains, some genes and operons are more often found in the genomes of CDEC than in commensal E. coli. They are principally linked to gut colonization and utilization of propanediol and other sugar alcohols.

  16. Analysis of Human Accelerated DNA Regions Using Archaic Hominin Genomes

    Science.gov (United States)

    Burbano, Hernán A.; Green, Richard E.; Maricic, Tomislav; Lalueza-Fox, Carles; de la Rasilla, Marco; Rosas, Antonio; Kelso, Janet; Pollard, Katherine S.; Lachmann, Michael; Pääbo, Svante

    2012-01-01

    Several previous comparisons of the human genome with other primate and vertebrate genomes identified genomic regions that are highly conserved in vertebrate evolution but fast-evolving on the human lineage. These human accelerated regions (HARs) may be regions of past adaptive evolution in humans. Alternatively, they may be the result of non-adaptive processes, such as biased gene conversion. We captured and sequenced DNA from a collection of previously published HARs using DNA from an Iberian Neandertal. Combining these new data with shotgun sequence from the Neandertal and Denisova draft genomes, we determine at least one archaic hominin allele for 84% of all positions within HARs. We find that 8% of HAR substitutions are not observed in the archaic hominins and are thus recent in the sense that the derived allele had not come to fixation in the common ancestor of modern humans and archaic hominins. Further, we find that recent substitutions in HARs tend to have come to fixation faster than substitutions elsewhere in the genome and that substitutions in HARs tend to cluster in time, consistent with an episodic rather than a clock-like process underlying HAR evolution. Our catalog of sequence changes in HARs will help prioritize them for functional studies of genomic elements potentially responsible for modern human adaptations. PMID:22412940

  17. Assembly, Annotation, and Analysis of Multiple Mycorrhizal Fungal Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Initiative Consortium, Mycorrhizal Genomics; Kuo, Alan; Grigoriev, Igor; Kohler, Annegret; Martin, Francis

    2013-03-08

    Mycorrhizal fungi play critical roles in host plant health, soil community structure and chemistry, and carbon and nutrient cycling, all areas of intense interest to the US Dept. of Energy (DOE) Joint Genome Institute (JGI). To this end we are building on our earlier sequencing of the Laccaria bicolor genome by partnering with INRA-Nancy and the mycorrhizal research community in the MGI to sequence and analyze dozens of mycorrhizal genomes of all Basidiomycota and Ascomycota orders and multiple ecological types (ericoid, orchid, and ectomycorrhizal). JGI has developed and deployed high-throughput sequencing techniques, and Assembly, RNASeq, and Annotation Pipelines. In 2012 alone we sequenced, assembled, and annotated 12 draft or improved genomes of mycorrhizae, and predicted ~232,831 genes and ~15,011 multigene families. All of this data is publicly available on JGI MycoCosm (http://jgi.doe.gov/fungi/), which provides access to both the genome data and tools with which to analyze the data. Preliminary comparisons of the current total of 14 public mycorrhizal genomes suggest that 1) short secreted proteins potentially involved in symbiosis are more enriched in some orders than in others amongst the mycorrhizal Agaricomycetes, 2) there are wide ranges of numbers of genes involved in certain functional categories, such as signal transduction and post-translational modification, and 3) novel gene families are specific to some ecological types.

  18. Analysis of high-identity segmental duplications in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Carelli Francesco N

    2011-08-01

    Full Text Available Abstract Background Segmental duplications (SDs) are blocks of genomic sequence of 1-200 kb that map to different loci in a genome and share a sequence identity > 90%. SDs show at the sequence level the same characteristics as other regions of the human genome: they contain both high-copy repeats and gene sequences. SDs play an important role in genome plasticity by creating new genes and modeling genome structure. Although data is plentiful for mammals, not much was known about the representation of SDs in plant genomes. In this regard, we performed a genome-wide analysis of high-identity SDs on the sequenced grapevine (Vitis vinifera) genome (PN40024). Results We demonstrate that recent SDs (> 94% identity and >= 10 kb in size) are a relevant component of the grapevine genome (85 Mb, 17% of the genome sequence). We detected mitochondrial and plastid DNA and genes (10% of gene annotation) in segmentally duplicated regions of the nuclear genome. In particular, the nine highest copy number genes have a copy in either or both organelle genomes. Further, we showed that several duplicated genes take part in the biosynthesis of compounds involved in plant response to environmental stress. Conclusions These data show the great influence of SDs and organelle DNA transfers in modeling the Vitis vinifera nuclear DNA structure as well as the impact of SDs in contributing to the adaptive capacity of grapevine and the nutritional content of grape products through genome variation. This study represents a step forward in the full characterization of duplicated genes important for grapevine cultural needs and human health.

  19. [Phylogenetic relationships and intraspecific variation of D-genome Aegilops L. as revealed by RAPD analysis].

    Science.gov (United States)

    Goriunova, S V; Kochieva, E Z; Chikida, N N; Pukhal'skiĭ, V A

    2004-05-01

    RAPD analysis was carried out to study the genetic variation and phylogenetic relationships of polyploid Aegilops species, which contain the D genome as a component of the alloploid genome, and diploid Aegilops tauschii, which is a putative donor of the D genome for common wheat. In total, 74 accessions of six D-genome Aegilops species were examined. The highest intraspecific variation (0.03-0.21) was observed for Ae. tauschii. Intraspecific distances between accessions ranged from 0.007 to 0.067 in Ae. cylindrica, from 0.017 to 0.047 in Ae. vavilovii, and from 0.00 to 0.053 in Ae. juvenalis. Likewise, Ae. ventricosa and Ae. crassa showed low intraspecific polymorphism. The among-accession difference in alloploid Ae. ventricosa (genome DvNv) was similar to that of one parental species, Ae. uniaristata (N), and substantially lower than in the other parent, Ae. tauschii (D). The among-accession difference in Ae. cylindrica (CcDc) was considerably lower than in either parent, Ae. tauschii (D) or Ae. caudata (C). With the exception of Ae. cylindrica, all D-genome species, namely Ae. tauschii (D), Ae. ventricosa (DvNv), Ae. crassa (XcrDcr1 and XcrDcr1Dcr2), Ae. juvenalis (XjDjUj), and Ae. vavilovii (XvaDvaSva), formed a single polymorphic cluster, which was distinct from clusters of other species. The only exception, Ae. cylindrica, did not group with the other D-genome species, but clustered with Ae. caudata (C), a donor of the C genome. The cluster of these two species was clearly distinct from the cluster of the other D-genome species and close to a cluster of Ae. umbellulata (genome U) and Ae. ovata (genome UgMg). Thus, RAPD analysis was used for the first time to estimate and compare the interpopulation polymorphism and to establish the phylogenetic relationships of all diploid and alloploid D-genome Aegilops species.
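
The distances quoted in this abstract are computed from RAPD band patterns. The abstract does not say which similarity coefficient was used, so the sketch below assumes the Jaccard coefficient as one common choice for presence/absence band data; the function names and the toy band profiles are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Distance between two band profiles (1 = band present, 0 = absent)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two profiles with no bands at all are treated as identical.
    return 0.0 if either == 0 else 1.0 - shared / either


def distance_range(profiles):
    """Smallest and largest pairwise distance among accessions of one species."""
    distances = [jaccard_distance(p, q) for p, q in combinations(profiles, 2)]
    return min(distances), max(distances)


# Hypothetical band profiles for three accessions scored at six RAPD bands.
accessions = [
    [1, 1, 0, 1, 0, 1],
    [1, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 1],
]
low, high = distance_range(accessions)
print(f"intraspecific distance range: {low:.3f}-{high:.3f}")
```

A range computed this way for each species is the kind of figure the abstract reports as intraspecific variation, although the published values may rest on a different coefficient.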

  20. Genome Sequencing and Comparative Genomics Analysis Revealed Pathogenic Potential in Penicillium capsulatum as a Novel Fungal Pathogen Belonging to Eurotiales

    Science.gov (United States)

    Yang, Ying; Chen, Min; Li, Zongwei; Al-Hatmi, Abdullah M. S.; de Hoog, Sybren; Pan, Weihua; Ye, Qiang; Bo, Xiaochen; Li, Zhen; Wang, Shengqi; Wang, Junzhi; Chen, Huipeng; Liao, Wanqing

    2016-01-01

    Penicillium capsulatum is a rare Penicillium species used in paper manufacturing, but recently it has been reported to cause invasive infection. To research the pathogenicity of the clinical Penicillium strain, we sequenced the genomes and transcriptomes of the clinical and environmental strains of P. capsulatum. Comparative analyses of these two P. capsulatum strains and close related strains belonging to Eurotiales were performed. The assembled genome sizes of P. capsulatum are approximately 34.4 Mbp in length and encode 11,080 predicted genes. The different isolates of P. capsulatum are highly similar, with the exception of several unique genes, INDELs or SNPs in the genes coding for glycosyl hydrolases, amino acid transporters and circumsporozoite protein. A phylogenomic analysis was performed based on the whole genome data of 38 strains belonging to Eurotiales. By comparing the whole genome sequences and the virulence-related genes from 20 important related species, including fungal pathogens and non-human pathogens belonging to Eurotiales, we found meaningful pathogenicity characteristics between P. capsulatum and its closely related species. Our research indicated that P. capsulatum may be a neglected opportunistic pathogen. This study is beneficial for mycologists, geneticists and epidemiologists to achieve a deeper understanding of the genetic basis of the role of P. capsulatum as a newly reported fungal pathogen. PMID:27761131

  1. Progress on retinal image analysis for age related macular degeneration.

    Science.gov (United States)

    Kanagasingam, Yogesan; Bhuiyan, Alauddin; Abràmoff, Michael D; Smith, R Theodore; Goldschmidt, Leonard; Wong, Tien Y

    2014-01-01

    Age-related macular degeneration (AMD) is the leading cause of vision loss in those over the age of 50 years in the developed countries. The number is expected to increase by ∼1.5 fold over the next ten years due to an increase in aging population. One of the main measures of AMD severity is the analysis of drusen, pigmentary abnormalities, geographic atrophy (GA) and choroidal neovascularization (CNV) from imaging based on color fundus photograph, optical coherence tomography (OCT) and other imaging modalities. Each of these imaging modalities has strengths and weaknesses for extracting individual AMD pathology and different imaging techniques are used in combination for capturing and/or quantification of different pathologies. Current dry AMD treatments cannot cure or reverse vision loss. However, the Age-Related Eye Disease Study (AREDS) showed that specific anti-oxidant vitamin supplementation reduces the risk of progression from intermediate stages (defined as the presence of either many medium-sized drusen or one or more large drusen) to late AMD which allows for preventative strategies in properly identified patients. Thus identification of people with early stage AMD is important to design and implement preventative strategies for late AMD, and determine their cost-effectiveness. A mass screening facility with teleophthalmology or telemedicine in combination with computer-aided analysis for large rural-based communities may identify more individuals suitable for early stage AMD prevention. In this review, we discuss different imaging modalities that are currently being considered or used for screening AMD. In addition, we look into various automated and semi-automated computer-aided grading systems and related retinal image analysis techniques for drusen, geographic atrophy and choroidal neovascularization detection and/or quantification for measurement of AMD severity using these imaging modalities. We also review the existing telemedicine studies which

  2. Genome-wide analysis of the homeobox C6 transcriptional network in prostate cancer.

    Science.gov (United States)

    McCabe, Colleen D; Spyropoulos, Demetri D; Martin, David; Moreno, Carlos S

    2008-03-15

    Homeobox transcription factors are developmentally regulated genes that play crucial roles in tissue patterning. Homeobox C6 (HOXC6) is overexpressed in prostate cancers and correlated with cancer progression, but the downstream targets of HOXC6 are largely unknown. We have performed genome-wide localization analysis to identify promoters bound by HOXC6 in prostate cancer cells. This analysis identified 468 reproducibly bound promoters whose associated genes are involved in functions such as cell proliferation and apoptosis. We have complemented these data with expression profiling of prostates from mice with homozygous disruption of the Hoxc6 gene to identify 31 direct regulatory target genes of HOXC6. We show that HOXC6 directly regulates expression of bone morphogenic protein 7, fibroblast growth factor receptor 2, insulin-like growth factor binding protein 3, and platelet-derived growth factor receptor alpha (PDGFRA) in prostate cells and indirectly influences the Notch and Wnt signaling pathways in vivo. We further show that inhibition of PDGFRA reduces proliferation of prostate cancer cells, and that overexpression of HOXC6 can overcome the effects of PDGFRA inhibition. HOXC6 regulates genes with both oncogenic and tumor suppressor activities as well as several genes such as CD44 that are important for prostate branching morphogenesis and metastasis to the bone microenvironment.

  3. Genomic resources for sea lice: analysis of ESTs and mitochondrial genomes.

    Science.gov (United States)

    Yasuike, Motoshige; Leong, Jong; Jantzen, Stuart G; von Schalburg, Kristian R; Nilsen, Frank; Jones, Simon R M; Koop, Ben F

    2012-04-01

    Sea lice are common parasites of both farmed and wild salmon. Salmon farming constitutes an important economic market in North America, South America, and Northern Europe. Infections with sea lice can result in significant production losses. A compilation of genomic information on different genera of sea lice is an important resource for understanding their biology as well as for the study of population genetics and control strategies. We report on over 150,000 expressed sequence tags (ESTs) from five different species (Pacific Lepeophtheirus salmonis (49,672 new ESTs in addition to 14,994 previously reported ESTs), Atlantic L. salmonis (57,349 ESTs), Caligus clemensi (14,821 ESTs), Caligus rogercresseyi (32,135 ESTs), and Lernaeocera branchialis (16,441 ESTs)). For each species, ESTs were assembled into complete or partial genes and annotated by comparisons to known proteins in public databases. In addition, whole mitochondrial (mt) genome sequences of C. clemensi (13,440 bp) and C. rogercresseyi (13,468 bp) were determined and compared to L. salmonis. Both nuclear and mtDNA genes show very high levels of sequence divergence between these ectoparasitic copepods, suggesting that the different species of sea lice have been in existence for 37-113 million years and that parasitic association with salmonids is also quite ancient. Our ESTs and mtDNA data provide a novel resource for the study of sea louse biology, population genetics, and control strategies. This genomic information provides the material basis for the development of a 38K sea louse microarray that can be used in conjunction with our existing 44K salmon microarray to study host-parasite interactions at the molecular level. This report represents the largest genomic resource for any copepod species to date.

  4. Genome sequence of Cronobacter sakazakii BAA-894 and comparative genomic hybridization analysis with other Cronobacter species.

    Directory of Open Access Journals (Sweden)

    Eva Kucerova

    Full Text Available BACKGROUND: The genus Cronobacter (formerly called Enterobacter sakazakii) is composed of five species: C. sakazakii, C. malonaticus, C. turicensis, C. muytjensii, and C. dublinensis. The genus includes opportunistic human pathogens, and the first three species have been associated with neonatal infections. The most severe diseases are caused in neonates and include fatal necrotizing enterocolitis and meningitis. The genetic basis of the diversity within the genus is unknown, and few virulence traits have been identified. METHODOLOGY/PRINCIPAL FINDINGS: We report here the first sequence of a member of this genus, C. sakazakii strain BAA-894. The genome of Cronobacter sakazakii strain BAA-894 comprises a 4.4 Mb chromosome (57% GC content) and two plasmids: 31 kb (51% GC) and 131 kb (56% GC). The genome was used to construct a 387,000 probe oligonucleotide tiling DNA microarray covering the whole genome. Comparative genomic hybridization (CGH) was undertaken on five other C. sakazakii strains, and representatives of the four other Cronobacter species. Among 4,382 annotated genes inspected in this study, about 55% of genes were common to all C. sakazakii strains and 43% were common to all Cronobacter strains, with 10-17% absence of genes. CONCLUSIONS/SIGNIFICANCE: CGH highlighted 15 clusters of genes in C. sakazakii BAA-894 that were divergent or absent in more than half of the tested strains; six of these are of probable prophage origin. Putative virulence factors were identified in these prophages and in other variable regions. A number of genes unique to Cronobacter species associated with neonatal infections (C. sakazakii, C. malonaticus and C. turicensis) were identified. These included a copper and silver resistance system known to be linked to invasion of the blood-brain barrier by neonatal meningitic strains of Escherichia coli. In addition, genes encoding for multidrug efflux pumps and adhesins were identified that were unique to C. sakazakii

  5. Analysis of the genome content of Lactococcus garvieae by genomic interspecies microarray hybridization

    Directory of Open Access Journals (Sweden)

    Gibello Alicia

    2010-03-01

    Full Text Available Abstract Background Lactococcus garvieae is a bacterial pathogen that affects different animal species in addition to humans. Despite the widespread distribution and emerging clinical significance of L. garvieae in both veterinary and human medicine, there is almost a complete lack of knowledge about the genetic content of this microorganism. In the present study, the genomic content of L. garvieae CECT 4531 was analysed using bioinformatics tools and microarray-based comparative genomic hybridization (CGH) experiments. Lactococcus lactis subsp. lactis IL1403 and Streptococcus pneumoniae TIGR4 were used as reference microorganisms. Results The combination and integration of in silico analyses and in vitro CGH experiments, performed in comparison with the reference microorganisms, allowed establishment of an inter-species hybridization framework with a detection threshold based on a sequence similarity of ≥ 70%. With this threshold value, 267 genes were identified as having an analogue in L. garvieae, most of which (n = 258) have been documented for the first time in this pathogen. Most of the genes are related to ribosomal, sugar metabolism or energy conversion systems. Some of the identified genes, such as als and mycA, could be involved in the pathogenesis of L. garvieae infections. Conclusions In this study, we identified 267 genes that were potentially present in L. garvieae CECT 4531. Some of the identified genes could be involved in the pathogenesis of L. garvieae infections. These results provide the first insight into the genome content of L. garvieae.

  6. Genome-association analysis of Korean Holstein milk traits using genomic estimated breeding value

    Directory of Open Access Journals (Sweden)

    Donghyun Shin

    2017-03-01

    Full Text Available Objective Holsteins are known as the world’s highest milk-producing dairy cattle. The purpose of this study was to identify genetic regions strongly associated with milk traits (milk production, fat, and protein) using Korean Holstein data. Methods This study was performed using single nucleotide polymorphism (SNP) chip data (Illumina BovineSNP50 Beadchip) of 911 Korean Holstein individuals. We inferred genomic estimated breeding values for each individual based on best linear unbiased prediction (BLUP) and ridge regression using BLUPF90 and R. We then performed a genome-wide association study and identified genetic regions related to milk traits. Results We identified 9, 6, and 17 significant genetic regions related to milk production, fat and protein, respectively. These genes are newly reported as genetically associated with milk traits in Holsteins. Conclusion This study complements recent Holstein genome-wide association studies that identified other SNPs and genes as the most significant variants. These results will help to expand the knowledge of the polygenic nature of milk production in Holsteins.
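
The ridge-regression route to genomic estimated breeding values mentioned in this abstract can be illustrated with a minimal sketch. It is not the BLUPF90 or R workflow the authors used: the genotype coding (0/1/2 allele counts), the penalty `lam`, the function names, and the toy data are all assumptions made for illustration. SNP effects are estimated as beta = (X'X + lam*I)^-1 X'y on centered data, and each animal's breeding value is the sum of its centered genotypes times those effects.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - rest) / M[i][i]
    return x


def ridge_gebv(genotypes, phenotypes, lam):
    """Return (SNP effects, breeding values) from ridge regression; lam > 0."""
    n, m = len(genotypes), len(genotypes[0])
    # Center genotypes per SNP and phenotypes overall, so no intercept is needed.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    X = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    y_mean = sum(phenotypes) / n
    y = [p - y_mean for p in phenotypes]
    # Normal equations with the ridge penalty on the diagonal.
    XtX = [[sum(X[k][i] * X[k][j] for k in range(n)) + (lam if i == j else 0.0)
            for j in range(m)] for i in range(m)]
    Xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(m)]
    effects = solve(XtX, Xty)
    gebv = [sum(x * e for x, e in zip(row, effects)) for row in X]
    return effects, gebv


# Hypothetical data: 5 animals, 3 SNPs, one milk-yield-like phenotype.
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 2], [0, 0, 1]]
phenotypes = [30.1, 28.4, 33.0, 31.2, 27.9]
effects, gebv = ridge_gebv(genotypes, phenotypes, lam=1.0)
print(effects)
print(gebv)
```

With real 50K chip data the number of SNPs far exceeds the number of animals, which is why a penalty (or the equivalent mixed-model formulation) is needed at all; production tools solve the same system with far more efficient numerical methods.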

  7. Symbolic flux analysis for genome-scale metabolic networks

    Directory of Open Access Journals (Sweden)

    Peterson Pearu

    2011-05-01

    Full Text Available Abstract Background With the advent of genomic technology, the size of metabolic networks that are subject to analysis is growing. A common task when analyzing metabolic networks is to find all possible steady state regimes. There are several technical issues that have to be addressed when analyzing large metabolic networks including accumulation of numerical errors and presentation of the solution to the researcher. One way to resolve those technical issues is to analyze the network using symbolic methods. The aim of this paper is to develop a routine that symbolically finds the steady state solutions of large metabolic networks. Results A symbolic Gauss-Jordan elimination routine was developed for analyzing large metabolic networks. This routine was tested by finding the steady state solutions for a number of curated stoichiometric matrices with the largest having about 4000 reactions. The routine was able to find the solution with a computational time similar to the time used by a numerical singular value decomposition routine. As an advantage of symbolic solution, a set of independent fluxes can be suggested by the researcher leading to the formation of a desired flux basis describing the steady state solution of the network. These independent fluxes can be constrained using experimental data. We demonstrate the application of constraints by calculating a flux distribution for the central metabolic and amino acid biosynthesis pathways of yeast. Conclusions We were able to find symbolic solutions for the steady state flux distribution of large metabolic networks. The ability to choose a flux basis was found to be useful in the constraint process and provides a strong argument for using symbolic Gauss-Jordan elimination in place of singular value decomposition.
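
The idea of eliminating the stoichiometric matrix exactly, while letting the researcher pick which fluxes remain independent, can be sketched as follows. This is an illustration under stated assumptions, not the authors' routine: exact rational arithmetic from Python's `fractions` module stands in for a symbolic engine, and the function names, the `preferred_free` parameter, and the five-reaction toy network are hypothetical.

```python
from fractions import Fraction


def gauss_jordan(matrix, col_order):
    """Reduce `matrix` exactly, choosing pivots in the given column order."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = []  # pivots[k] is the pivot column of reduced row k
    r = 0
    for c in col_order:
        if r == len(rows):
            break
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue  # no pivot available: this flux stays independent
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return rows, pivots


def steady_state_relations(S, preferred_free=()):
    """Solve S v = 0: return the free fluxes and each dependent flux as a
    linear combination {free_flux_index: coefficient} of the free ones."""
    n = len(S[0])
    # Fluxes the researcher wants as the basis are eliminated last.
    col_order = [c for c in range(n) if c not in preferred_free] + list(preferred_free)
    rows, pivots = gauss_jordan(S, col_order)
    free = [c for c in range(n) if c not in pivots]
    relations = {
        p: {f: -rows[k][f] for f in free if rows[k][f] != 0}
        for k, p in enumerate(pivots)
    }
    return free, relations


# Toy network: v0: -> A, v1: A -> B, v2: A -> C, v3: B ->, v4: C ->
S = [
    [1, -1, -1, 0, 0],   # metabolite A
    [0, 1, 0, -1, 0],    # metabolite B
    [0, 0, 1, 0, -1],    # metabolite C
]
free, relations = steady_state_relations(S, preferred_free=[0, 2])
print(free)       # indices of the independent fluxes
print(relations)  # dependent fluxes in terms of the independent ones
```

Choosing v0 and v2 as the basis yields v1 = v0 - v2, v3 = v0 - v2 and v4 = v2, so measuring the uptake and one branch fixes the whole toy network. Because every coefficient is an exact fraction, no rounding error accumulates, which is the property the abstract contrasts with numerical singular value decomposition.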

  8. Symbolic flux analysis for genome-scale metabolic networks.

    Science.gov (United States)

    Schryer, David W; Vendelin, Marko; Peterson, Pearu

    2011-05-23

    With the advent of genomic technology, the size of metabolic networks that are subject to analysis is growing. A common task when analyzing metabolic networks is to find all possible steady state regimes. There are several technical issues that have to be addressed when analyzing large metabolic networks including accumulation of numerical errors and presentation of the solution to the researcher. One way to resolve those technical issues is to analyze the network using symbolic methods. The aim of this paper is to develop a routine that symbolically finds the steady state solutions of large metabolic networks. A symbolic Gauss-Jordan elimination routine was developed for analyzing large metabolic networks. This routine was tested by finding the steady state solutions for a number of curated stoichiometric matrices with the largest having about 4000 reactions. The routine was able to find the solution with a computational time similar to the time used by a numerical singular value decomposition routine. As an advantage of symbolic solution, a set of independent fluxes can be suggested by the researcher leading to the formation of a desired flux basis describing the steady state solution of the network. These independent fluxes can be constrained using experimental data. We demonstrate the application of constraints by calculating a flux distribution for the central metabolic and amino acid biosynthesis pathways of yeast. We were able to find symbolic solutions for the steady state flux distribution of large metabolic networks. The ability to choose a flux basis was found to be useful in the constraint process and provides a strong argument for using symbolic Gauss-Jordan elimination in place of singular value decomposition.

  9. Micro and nanofluidic structures for cell sorting and genomic analysis

    Science.gov (United States)

    Morton, Keith J.

    Microfluidic systems promise rapid analysis of small samples in a compact and inexpensive format. But direct scaling of lab bench protocols on-chip is challenging because laminar flows in typical microfluidic devices are characterized by non-mixing streamlines. Common microfluidic mixers and sorters work by diffusion, limiting application to objects that diffuse slowly such as cells and DNA. Recently Huang et al. developed a passive microfluidic element to continuously separate bio-particles deterministically. In Deterministic Lateral Displacement (DLD), objects are sorted by size as they transit an asymmetric array of microfabricated posts. This thesis further develops DLD arrays with applications in three broad new areas. First, the arrays are used not simply to sort particles, but to move streams of cells through functional flows for chemical treatment, such as on-chip immunofluorescent labeling of blood cells with washing, and on-chip E. coli cell lysis with simultaneous chromosome extraction. Secondly, modular tiling of the basic DLD element is used to construct complex particle handling modes that include beam steering for jets of cells and beads. Thirdly, nanostructured DLD arrays are built using Nanoimprint Lithography (NIL) and continuous-flow separation of 100 nm and 200 nm size particles is demonstrated. Finally, a number of ancillary nanofabrication techniques were developed in support of these overall goals, including methods to interface nanofluidic structures with standard microfluidic components such as inlet channels and reservoirs, precision etching of ultra-high aspect ratio (>50:1) silicon nanostructures, and fabrication of narrow (~35 nm) channels used to stretch genomic length DNA.

  10. Draft genome sequence and detailed analysis of Pantoea eucrina strain Russ and implication for opportunistic pathogenesis

    Directory of Open Access Journals (Sweden)

    Farzaneh Moghadam

    2016-12-01

    Full Text Available The genus Pantoea is a predominant member of host-associated microbiomes. We here report on the genomic analysis of Pantoea eucrina strain Russ that was isolated from a trashcan at Oklahoma State University, Stillwater, OK. The draft genome of Pantoea eucrina strain Russ consists of 3,939,877 bp of DNA with 3704 protein-coding genes and 134 RNA genes. This is the first report of a genome sequence of a member of Pantoea eucrina. Genomic analysis revealed metabolic versatility with genes involved in the metabolism and transport of all amino acids as well as glucose, fructose, mannose, xylose, arabinose and galactose, suggesting the organism is a versatile heterotroph. The genome also encodes an extensive secretory machinery including types I, II, III, IV, and Vb secretion systems, and several genes for pili production including the new usher/chaperone system (pfam 05229). The implications of these systems for opportunistic pathogenesis are discussed.

  11. DEVELOPMENT OF NEW SEQUENCING TECHNOLOGIES AND THEIR APPLICATION IN GENOME ANALYSIS OF DOMESTIC ANIMALS

    Directory of Open Access Journals (Sweden)

    Kristina Gvozdanović

    2015-12-01

    Full Text Available Sequencing and detailed study of the genome of domestic animals began in the middle of the last century. It was primarily linked to the development of the first-generation sequencing methods, i.e. the Sanger sequencing method. Next generation sequencing methods are currently the most common methods in the analysis of the domestic animal genome. The application of these methods gave us up to 100 times more data in comparison with the Sanger method. Analyses including RNA sequencing, genotyping of the whole genome, immunoprecipitation associated with DNA microarrays, detection of mutations and inherited diseases, sequencing of the mitochondrial genome and many others have been conducted with the development and application of new sequencing methods since 2005 until today. Application of new sequencing methods in the analysis of the domestic animal genome provides better understanding of the genetic basis for important production traits, which could help in improving livestock production.

  12. The genome sequence of E. coli W (ATCC 9637): comparative genome analysis and an improved genome-scale reconstruction of E. coli

    Directory of Open Access Journals (Sweden)

    Lee Sang

    2011-01-01

    Full Text Available Abstract Background Escherichia coli is a model prokaryote, an important pathogen, and a key organism for industrial biotechnology. E. coli W (ATCC 9637), one of four strains designated as safe for laboratory purposes, has not been sequenced. E. coli W is a fast-growing strain and is the only safe strain that can utilize sucrose as a carbon source. Lifecycle analysis has demonstrated that sucrose from sugarcane is a preferred carbon source for industrial bioprocesses. Results We have sequenced and annotated the genome of E. coli W. The chromosome is 4,900,968 bp and encodes 4,764 ORFs. Two plasmids, pRK1 (102,536 bp) and pRK2 (5,360 bp), are also present. W has unique features relative to other sequenced laboratory strains (K-12, B and Crooks): it has a larger genome and belongs to phylogroup B1 rather than A. W also grows on a much broader range of carbon sources than does K-12. A genome-scale reconstruction was developed and validated in order to interrogate metabolic properties. Conclusions The genome of W is more similar to commensal and pathogenic B1 strains than phylogroup A strains, and therefore has greater utility for comparative analyses with these strains. W should therefore be the strain of choice, or 'type strain' for group B1 comparative analyses. The genome annotation and tools created here are expected to allow further utilization and development of E. coli W as an industrial organism for sucrose-based bioprocesses. Refinements in our E. coli metabolic reconstruction allow it to more accurately define E. coli metabolism relative to previous models.

  13. Genome sequencing and analysis of the first complete genome of Lactobacillus kunkeei strain MP2, an Apis mellifera gut isolate

    Directory of Open Access Journals (Sweden)

    Freddy Asenjo

    2016-04-01

    Full Text Available Background. The honey bee (Apis mellifera) is the most important pollinator in agriculture worldwide. However, the number of honey bees has fallen significantly since 2006, becoming a huge ecological problem nowadays. The principal cause is CCD, or Colony Collapse Disorder, characterized by the seemingly spontaneous abandonment of hives by their workers. One of the characteristics of CCD in honey bees is the alteration of the bacterial communities in their gastrointestinal tract, mainly due to the decrease of Firmicutes populations, such as the Lactobacilli. At this time, the causes of these alterations remain unknown. We recently isolated a strain of Lactobacillus kunkeei (L. kunkeei strain MP2) from the gut of Chilean honey bees. L. kunkeei is one of the bacteria most commonly isolated from the honey bee gut and is highly versatile in different ecological niches. In this study, we aimed to elucidate in detail the L. kunkeei genetic background and perform a comparative genome analysis with other Lactobacillus species. Methods. L. kunkeei MP2 was originally isolated from the guts of Chilean A. mellifera individuals. Genome sequencing was done using Pacific Biosciences single-molecule real-time sequencing technology. De novo assembly was performed using Celera assembler. The genome was annotated using Prokka, and functional information was added using the EggNOG 3.1 database. In addition, genomic islands were predicted using IslandViewer, and pro-phage sequences using PHAST. Comparisons of L. kunkeei MP2 with other L. kunkeei and Lactobacillus strains were done using Roary. Results. The complete genome of L. kunkeei MP2 comprises one circular chromosome of 1,614,522 nt with a GC content of 36.9%. Pangenome analysis with 16 L. kunkeei strains identified 113 unique genes, most of them related to phage insertions. A large and unique region of the L. kunkeei MP2 genome contains several genes that encode for phage structural protein and

  14. Genome sequencing and analysis of the first complete genome of Lactobacillus kunkeei strain MP2, an Apis mellifera gut isolate.

    Science.gov (United States)

    Asenjo, Freddy; Olmos, Alejandro; Henríquez-Piskulich, Patricia; Polanco, Victor; Aldea, Patricia; Ugalde, Juan A; Trombert, Annette N

    2016-01-01

    Background. The honey bee (Apis mellifera) is the most important pollinator in agriculture worldwide. However, the number of honey bees has fallen significantly since 2006, becoming a huge ecological problem nowadays. The principal cause is CCD, or Colony Collapse Disorder, characterized by the seemingly spontaneous abandonment of hives by their workers. One of the characteristics of CCD in honey bees is the alteration of the bacterial communities in their gastrointestinal tract, mainly due to the decrease of Firmicutes populations, such as the Lactobacilli. At this time, the causes of these alterations remain unknown. We recently isolated a strain of Lactobacillus kunkeei (L. kunkeei strain MP2) from the gut of Chilean honey bees. L. kunkeei is one of the bacteria most commonly isolated from the honey bee gut and is highly versatile in different ecological niches. In this study, we aimed to elucidate in detail the L. kunkeei genetic background and perform a comparative genome analysis with other Lactobacillus species. Methods. L. kunkeei MP2 was originally isolated from the guts of Chilean A. mellifera individuals. Genome sequencing was done using Pacific Biosciences single-molecule real-time sequencing technology. De novo assembly was performed using Celera assembler. The genome was annotated using Prokka, and functional information was added using the EggNOG 3.1 database. In addition, genomic islands were predicted using IslandViewer, and pro-phage sequences using PHAST. Comparisons of L. kunkeei MP2 with other L. kunkeei and Lactobacillus strains were done using Roary. Results. The complete genome of L. kunkeei MP2 comprises one circular chromosome of 1,614,522 nt with a GC content of 36.9%. Pangenome analysis with 16 L. kunkeei strains identified 113 unique genes, most of them related to phage insertions. A large and unique region of the L. kunkeei MP2 genome contains several genes that encode for phage structural protein and replication components

  15. Genome sequencing and analysis of the first complete genome of Lactobacillus kunkeei strain MP2, an Apis mellifera gut isolate

    Science.gov (United States)

    Asenjo, Freddy; Olmos, Alejandro; Henríquez-Piskulich, Patricia; Polanco, Victor; Aldea, Patricia

    2016-01-01

    Background. The honey bee (Apis mellifera) is the most important pollinator in agriculture worldwide. However, the number of honey bees has fallen significantly since 2006, becoming a huge ecological problem nowadays. The principal cause is CCD, or Colony Collapse Disorder, characterized by the seemingly spontaneous abandonment of hives by their workers. One of the characteristics of CCD in honey bees is the alteration of the bacterial communities in their gastrointestinal tract, mainly due to the decrease of Firmicutes populations, such as the Lactobacilli. At this time, the causes of these alterations remain unknown. We recently isolated a strain of Lactobacillus kunkeei (L. kunkeei strain MP2) from the gut of Chilean honey bees. L. kunkeei is one of the bacteria most commonly isolated from the honey bee gut and is highly versatile in different ecological niches. In this study, we aimed to elucidate in detail the L. kunkeei genetic background and perform a comparative genome analysis with other Lactobacillus species. Methods. L. kunkeei MP2 was originally isolated from the guts of Chilean A. mellifera individuals. Genome sequencing was done using Pacific Biosciences single-molecule real-time sequencing technology. De novo assembly was performed using Celera assembler. The genome was annotated using Prokka, and functional information was added using the EggNOG 3.1 database. In addition, genomic islands were predicted using IslandViewer, and pro-phage sequences using PHAST. Comparisons of L. kunkeei MP2 with other L. kunkeei and Lactobacillus strains were done using Roary. Results. The complete genome of L. kunkeei MP2 comprises one circular chromosome of 1,614,522 nt with a GC content of 36.9%. Pangenome analysis with 16 L. kunkeei strains identified 113 unique genes, most of them related to phage insertions. A large and unique region of the L. kunkeei MP2 genome contains several genes that encode for phage structural protein and replication components

  16. Flow cytometric analysis of oil palm: a preliminary analysis for cultivars and genomic DNA alteration

    Directory of Open Access Journals (Sweden)

    Warawut Chuthammathat

    2005-12-01

    Full Text Available DNA contents of oil palm (Elaeis guineensis Jacq.) cultivars were analyzed by flow cytometry using different external reference plant species. Analysis using corn (Zea mays line CE-777) as a reference plant gave the highest DNA content of oil palm (4.72±0.23 pg 2C⁻¹), whereas the DNA content was lower when using soybean (Glycine max cv. Polanka) (3.77±0.09 pg 2C⁻¹) or tomato (Lycopersicon esculentum cv. Stupicke) (4.25±0.09 pg 2C⁻¹) as a reference. The nuclear DNA contents of the Dura (D109), Pisifera (P168) and Tenera (T38) cultivars were 3.46±0.04, 3.24±0.03 and 3.76±0.04 pg 2C⁻¹ nuclei, respectively, using soybean as a reference. One haploid genome of oil palm therefore ranged from 1.56 to 1.81 × 10⁹ base pairs. DNA contents from one-year-old calli and cell suspensions of oil palm were found to be significantly different from those of seedlings, indicating that genomic DNA alteration occurred in these cultured tissues. We therefore confirm that flow cytometric analysis can verify cultivars, DNA content and genomic DNA alteration of oil palm using soybean as an external reference standard.
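
    The step from 2C DNA content to haploid genome size in the abstract above can be sketched as a short calculation. The abstract does not state which conversion factor was used; the value 0.965 × 10^9 bp per pg below is an assumption chosen because it reproduces the reported range (a commonly cited alternative is 0.978 × 10^9 bp per pg). The function and variable names are illustrative only.

```python
# Assumed conversion factor (not stated in the abstract): 1 pg DNA ~ 0.965e9 bp.
BP_PER_PG = 0.965e9


def haploid_genome_bp(dna_2c_pg, bp_per_pg=BP_PER_PG):
    """Convert a 2C nuclear DNA content (pg) into a haploid (1C) genome size in bp."""
    return dna_2c_pg / 2.0 * bp_per_pg


# 2C values reported in the abstract (soybean used as the external reference).
cultivars = {"Dura (D109)": 3.46, "Pisifera (P168)": 3.24, "Tenera (T38)": 3.76}
sizes = {name: haploid_genome_bp(pg) for name, pg in cultivars.items()}

for name, bp in sizes.items():
    print(f"{name}: {bp / 1e9:.2f} x 10^9 bp")
```

    Under this assumed factor, the smallest (Pisifera) and largest (Tenera) values bracket the 1.56 to 1.81 × 10⁹ bp range given in the abstract.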

  17. The Methanosarcina barkeri genome: comparative analysis with Methanosarcina acetivorans and Methanosarcina mazei reveals extensive rearrangement within methanosarcinal genomes

    Energy Technology Data Exchange (ETDEWEB)

    Maeder, Dennis L.; Anderson, Iain; Brettin, Thomas S.; Bruce, David C.; Gilna, Paul; Han, Cliff S.; Lapidus, Alla; Metcalf, William W.; Saunders, Elizabeth; Tapia, Roxanne; Sowers, Kevin R.

    2006-05-19

    We report here a comparative analysis of the genome sequence of Methanosarcina barkeri with those of Methanosarcina acetivorans and Methanosarcina mazei. All three genomes share a conserved double origin of replication and many gene clusters. M. barkeri is distinguished by having an organization that is well conserved with respect to the other Methanosarcinae in the region proximal to the origin of replication, with interspecies gene similarities as high as 95%. However, it is disordered and marked by increased transposase frequency and decreased gene synteny and gene density in the distal semi-genome. Of the 3680 open reading frames in M. barkeri, 678 had paralogs with better than 80% similarity to both M. acetivorans and M. mazei, while 128 nonhypothetical ORFs were unique (non-paralogous) amongst these species, including a complete formate dehydrogenase operon, two genes required for N-acetylmuramic acid synthesis, a 14-gene gas vesicle cluster and a bacterial P450-specific ferredoxin reductase cluster not previously observed or characterized in this genus. A cryptic 36 kbp plasmid sequence was detected in M. barkeri that contains an orc1 gene flanked by a presumptive origin of replication consisting of 38 tandem repeats of a 143 nt motif. Three-way comparison of these genomes reveals differing mechanisms for the accrual of changes. Elongation of the large M. acetivorans genome is the result of multiple gene-scale insertions and duplications uniformly distributed in that genome, while M. barkeri is characterized by localized inversions associated with the loss of gene content. In contrast, the relatively short M. mazei genome most closely approximates the ancestral organizational state.

  18. Complete genome sequence of Nitrobacter hamburgensis X14 and comparative genomic analysis of species within the genus Nitrobacter.

    Energy Technology Data Exchange (ETDEWEB)

    Starkenburg, Shawn R [Oregon State University; Larimer, Frank W [ORNL; Stein, Lisa Y [University of California, Riverside; Klotz, Martin G [University of Louisville, Louisville; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Sayavedra-Soto, LA [Oregon State University; Poret-Peterson, Amisha T. [University of Louisville, Louisville; Gentry, ME [University of Louisville, Louisville; Arp, D J [Oregon State University; Ward, Bess B. [Princeton University; Bottomley, Peter J [Oregon State University

    2008-05-01

    The alphaproteobacterium Nitrobacter hamburgensis X14 is a gram-negative facultative chemolithoautotroph that conserves energy from the oxidation of nitrite to nitrate. Sequencing and analysis of the Nitrobacter hamburgensis X14 genome revealed four replicons comprising one chromosome (4.4 Mbp) and three plasmids (294, 188, and 121 kbp). Over 20% of the genome is composed of pseudogenes and paralogs. Whole-genome comparisons were conducted between N. hamburgensis and the finished and draft genome sequences of Nitrobacter winogradskyi and Nitrobacter sp. strain Nb-311A, respectively. Most of the plasmid-borne genes were unique to N. hamburgensis and encode a variety of functions (central metabolism, energy conservation, conjugation, and heavy metal resistance), yet approximately 21 kb of an approximately 28-kb "autotrophic" island on the largest plasmid was conserved in the chromosomes of Nitrobacter winogradskyi Nb-255 and Nitrobacter sp. strain Nb-311A. The N. hamburgensis chromosome also harbors many unique genes, including those for heme-copper oxidases, cytochrome b561, and putative pathways for the catabolism of aromatic, organic, and one-carbon compounds, which help verify and extend its mixotrophic potential. A Nitrobacter "subcore" genome was also constructed by removing homologs found in strains of the closest evolutionary relatives, Bradyrhizobium japonicum and Rhodopseudomonas palustris. Among the Nitrobacter subcore inventory (116 genes), copies of genes or gene clusters for nitrite oxidoreductase (NXR), cytochromes associated with a dissimilatory nitrite reductase (NirK), PII-like regulators, and polysaccharide formation were identified. Many of the subcore genes have diverged significantly from, or have origins outside, the alphaproteobacterial lineage and may indicate some of the unique genetic requirements for nitrite oxidation in Nitrobacter.

  19. Genome-wide association and admixture analysis of glaucoma in the Women's Health Initiative.

    Science.gov (United States)

    Hoffmann, Thomas J; Tang, Hua; Thornton, Timothy A; Caan, Bette; Haan, Mary; Millen, Amy E; Thomas, Fridtjof; Risch, Neil

    2014-12-15

    We report a genome-wide association study (GWAS) and admixture analysis of glaucoma in 12 008 African-American and Hispanic women (age 50-79 years) from the Women's Health Initiative (WHI). Although GWAS of glaucoma have been conducted on several populations, this is the first to look at glaucoma in individuals of African-American and Hispanic race/ethnicity. Prevalent and incident glaucoma was determined by self-report from study questionnaires administered at baseline (1993-1998) and annually through 2005. For African Americans, there was a total of 658 prevalent cases, 1062 incident cases and 6067 individuals who never progressed to glaucoma. For our replication cohort, we used the WHI Hispanics, including 153 prevalent cases, 336 incident cases and 2685 non-cases. We found an association of African ancestry with glaucoma incidence in African Americans (hazard ratio 1.62, 95% CI 1.023-2.56, P = 0.038) and in Hispanics (hazard ratio 3.21, 95% CI 1.32-7.80, P = 0.011). Although we found that no previously identified glaucoma SNPs replicated in either the WHI African Americans or Hispanics, a risk score combining all previously reported hits was significant in African-American prevalent cases (P = 0.0046), and was in the expected direction in the incident cases, as well as in the Hispanic incident cases. Additionally, after imputing to 1000 Genomes, two less common independent SNPs were suggestive in African Americans, but had too low an allele frequency in Hispanics to test for replication. These results suggest the possibility of a distinct genetic architecture underlying glaucoma in individuals of African ancestry.
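
    As a rough consistency check on the hazard ratios quoted above, a two-sided P value can be recovered from a ratio estimate and its 95% confidence interval by working on the log scale. This is a generic normal-approximation sketch, not the authors' analysis; the function name is illustrative, and small differences from the reported P values are expected because the published figures are rounded.

```python
from math import log
from statistics import NormalDist


def p_from_ratio_ci(ratio, lower, upper, level=0.95):
    """Approximate two-sided P value from a ratio estimate and its confidence interval."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(0.5 + level / 2)               # about 1.96 for a 95% interval
    se = (log(upper) - log(lower)) / (2 * z_crit)      # standard error of log(ratio)
    z = log(ratio) / se
    return 2 * (1 - nd.cdf(abs(z)))


# Values quoted in the abstract.
p_african_american = p_from_ratio_ci(1.62, 1.023, 2.56)
p_hispanic = p_from_ratio_ci(3.21, 1.32, 7.80)

print(f"African Americans: P ~ {p_african_american:.3f} (reported 0.038)")
print(f"Hispanics:         P ~ {p_hispanic:.3f} (reported 0.011)")
```

    Both recovered values land close to the reported ones, which suggests the intervals and P values in the abstract are mutually consistent under this approximation.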

  20. Drawing the line between commensal and pathogenic Gardnerella vaginalis through genome analysis and virulence studies

    Directory of Open Access Journals (Sweden)

    Girerd Philippe H

    2010-06-01

    Full Text Available Abstract Background: Worldwide, bacterial vaginosis (BV) is the most common vaginal disorder. It is associated with risk for preterm birth and HIV infection. The etiology of the condition has been debated for nearly half a century and the lack of knowledge about its cause and progression has stymied efforts to improve therapy and prevention. Gardnerella vaginalis was originally identified as the causative agent, but subsequent findings that it is commonly isolated from seemingly healthy women cast doubt on this claim. Recent studies shedding light on the virulence properties of G. vaginalis, however, have drawn the species back into the spotlight. Results: In this study, we sequenced the genomes of a strain of G. vaginalis from a healthy woman, and one from a woman with bacterial vaginosis. Comparative analysis of the genomes revealed significant divergence and in vitro studies indicated disparities in the virulence potential of the two strains. The commensal isolate exhibited reduced cytotoxicity and yet the cytolysin proteins encoded by the two strains were nearly identical, differing at a single amino acid, and were transcribed at similar levels. The BV-associated strain encoded a different variant of a biofilm associated protein gene and demonstrated greater adherence, aggregation, and biofilm formation. Using filters with different pore sizes, we found that direct contact between the bacteria and epithelial cells is required for cytotoxicity. Conclusions: The results indicated that contact is required for cytotoxicity and suggested that reduced cytotoxicity in the commensal isolate could be due to impaired adherence. This study outlines two distinct genotypic variants of G. vaginalis, one apparently commensal and one pathogenic, and presents evidence for disparate virulence potentials.

  1. Analysis of CR1 Repeats in the Zebra Finch Genome

    Directory of Open Access Journals (Sweden)

    George E. Liu

    2013-06-01

    Full Text Available Most bird species have smaller genomes and fewer repeats than mammals. The Chicken Repeat 1 (CR1) repeat is one of the most abundant families of repeats, ranging from ~133,000 to ~187,000 copies and accounting for ~50 to ~80% of the interspersed repeats in the zebra finch and chicken genomes, respectively. CR1 repeats are believed to have arisen from the retrotransposition of a small number of master elements, which gave rise to multiple CR1 subfamilies in the chicken. In this study, we performed a global assessment of the divergence distributions, phylogenies, and consensus sequences of CR1 repeats in the zebra finch genome. We identified and validated 34 CR1 subfamilies and further analyzed the correlation between these subfamilies. We also discovered 4 novel lineage-specific CR1 subfamilies in the zebra finch when compared to the chicken genome. We built various evolutionary trees of these subfamilies and concluded that CR1 repeats may play an important role in reshaping the structure of bird genomes.

  2. Simultaneous comprehensive multiplex autoantibody analysis for rapidly progressive glomerulonephritis.

    Science.gov (United States)

    Sowa, Mandy; Trezzi, Barbara; Hiemann, Rico; Schierack, Peter; Grossmann, Kai; Scholz, Juliane; Somma, Valentina; Sinico, Renato Alberto; Roggenbuck, Dirk; Radice, Antonella

    2016-11-01

    Rapidly progressive glomerulonephritis (RPGN) is mainly caused by anti-glomerular basement membrane (GBM) antibody-mediated glomerulonephritis, immune-complex or anti-neutrophil cytoplasmic antibody (ANCA)-associated vasculitides and leads to rapid loss of renal function. Detection of ANCA and autoantibodies (autoAbs) to GBM and dsDNA enables early diagnosis and appropriate treatment of RPGN aiding in preventing end-stage renal disease. Determination of ANCA on neutrophils (ANCA) as well as autoAbs to myeloperoxidase (MPO-ANCA), proteinase 3 (PR3-ANCA), GBM, and dsDNA was performed by the novel multiplex CytoBead technology combining cell- and microbead-based autoAb analyses by automated indirect immunofluorescence (IIF). Forty patients with granulomatosis with polyangiitis (GPA), 48 with microscopic polyangiitis (MPA), 2 with eosinophilic GPA, 42 with systemic lupus erythematosus (SLE), 43 with Goodpasture syndrome (GPS), 57 with infectious diseases (INF), and 55 healthy subjects (HS) were analyzed and findings compared with classical single testing. The CytoBead assay revealed for GPA, MPA, GPS, and SLE the following diagnostic sensitivities and for HS and INF the corresponding specificities: PR3-ANCA, 85.0% and 100.0%; MPO-ANCA, 77.1% and 99.1%; anti-GBM autoAb, 88.4% and 96.4%; anti-dsDNA autoAb, 83.3% and 97.3%; ANCA, 91.1% and 99.1%, respectively. Agreement with classical enzyme-linked immunosorbent assay and IIF was very good for anti-GBM autoAb, MPO-ANCA, PR3-ANCA, and ANCA, respectively. Anti-dsDNA autoAb comparative analysis demonstrated fair agreement only and a significant difference (P = 0.0001). The CytoBead technology provides a unique multiplex reaction environment for simultaneous RPGN-specific autoAb testing. CytoBead RPGN assay is a promising alternative to time-consuming single parameter analysis and, thus, is well suited for emergency situations.
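
    The sensitivities and specificities listed above follow the usual definitions: positives among diseased patients, and negatives among controls. The sketch below illustrates the calculation for PR3-ANCA and MPO-ANCA. The cohort sizes come from the abstract, but the positive and negative counts are back-calculated here from the reported percentages and are an assumption, not figures stated by the authors.

```python
def sensitivity(true_positives, n_diseased):
    """Fraction of diseased patients who test positive."""
    return true_positives / n_diseased


def specificity(true_negatives, n_controls):
    """Fraction of control subjects who test negative."""
    return true_negatives / n_controls


# Cohort sizes from the abstract.
n_gpa, n_mpa = 40, 48
n_controls = 57 + 55          # infectious diseases (INF) + healthy subjects (HS)

# Counts below are inferred from the reported percentages (assumption).
pr3_sens = sensitivity(34, n_gpa)          # PR3-ANCA positives among GPA patients
pr3_spec = specificity(112, n_controls)    # PR3-ANCA negatives among controls
mpo_sens = sensitivity(37, n_mpa)          # MPO-ANCA positives among MPA patients
mpo_spec = specificity(111, n_controls)    # MPO-ANCA negatives among controls

print(f"PR3-ANCA: sensitivity {pr3_sens:.1%}, specificity {pr3_spec:.1%}")
print(f"MPO-ANCA: sensitivity {mpo_sens:.1%}, specificity {mpo_spec:.1%}")
```

    With these inferred counts the percentages round to the values quoted in the abstract, which is a plausibility check rather than a reconstruction of the study data.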

  3. Simultaneous comprehensive multiplex autoantibody analysis for rapidly progressive glomerulonephritis

    Science.gov (United States)

    Sowa, Mandy; Trezzi, Barbara; Hiemann, Rico; Schierack, Peter; Grossmann, Kai; Scholz, Juliane; Somma, Valentina; Sinico, Renato Alberto; Roggenbuck, Dirk; Radice, Antonella

    2016-01-01

    Abstract Rapidly progressive glomerulonephritis (RPGN) is mainly caused by anti-glomerular basement membrane (GBM) antibody-mediated glomerulonephritis, immune-complex or anti-neutrophil cytoplasmic antibody (ANCA)-associated vasculitides and leads to rapid loss of renal function. Detection of ANCA and autoantibodies (autoAbs) to GBM and dsDNA enables early diagnosis and appropriate treatment of RPGN aiding in preventing end-stage renal disease. Determination of ANCA on neutrophils (ANCA) as well as autoAbs to myeloperoxidase (MPO-ANCA), proteinase 3 (PR3-ANCA), GBM, and dsDNA was performed by the novel multiplex CytoBead technology combining cell- and microbead-based autoAb analyses by automated indirect immunofluorescence (IIF). Forty patients with granulomatosis with polyangiitis (GPA), 48 with microscopic polyangiitis (MPA), 2 with eosinophilic GPA, 42 with systemic lupus erythematosus (SLE), 43 with Goodpasture syndrome (GPS), 57 with infectious diseases (INF), and 55 healthy subjects (HS) were analyzed and findings compared with classical single testing. The CytoBead assay revealed for GPA, MPA, GPS, and SLE the following diagnostic sensitivities and for HS and INF the corresponding specificities: PR3-ANCA, 85.0% and 100.0%; MPO-ANCA, 77.1% and 99.1%; anti-GBM autoAb, 88.4% and 96.4%; anti-dsDNA autoAb, 83.3% and 97.3%; ANCA, 91.1% and 99.1%, respectively. Agreement with classical enzyme-linked immunosorbent assay and IIF was very good for anti-GBM autoAb, MPO-ANCA, PR3-ANCA, and ANCA, respectively. Anti-dsDNA autoAb comparative analysis demonstrated fair agreement only and a significant difference (P = 0.0001). The CytoBead technology provides a unique multiplex reaction environment for simultaneous RPGN-specific autoAb testing. CytoBead RPGN assay is a promising alternative to time-consuming single parameter analysis and, thus, is well suited for emergency situations. PMID:27858870

  4. The first complete chloroplast genome sequences of Ulmus species by de novo sequencing: Genome comparative and taxonomic position analysis

    Science.gov (United States)

    Zhang, Shuang; Yu, Xiao-Yue; Ren, Ya-Chao; Yang, Min-Sheng; Wang, Jin-Mao

    2017-01-01

    further analysis of their nuclear genomes. This study is the first report on Ulmus chloroplast genomes, which has significance for understanding photosynthesis, evolution, and chloroplast transgenic engineering. PMID:28158318

  5. Analysis of pan-genome to identify the core genes and essential genes of Brucella spp.

    Science.gov (United States)

    Yang, Xiaowen; Li, Yajie; Zang, Juan; Li, Yexia; Bie, Pengfei; Lu, Yanli; Wu, Qingmin

    2016-04-01

    Brucella spp. are facultative intracellular pathogens that cause a contagious zoonotic disease, which can result in outcomes such as abortion or sterility in susceptible animal hosts and grave, debilitating illness in humans. To decipher the survival mechanism of Brucella spp. in vivo, 42 complete Brucella genomes from NCBI were analyzed to define the pan-genome and core genome and to characterize their composition and function. The results showed that the total of 132,143 protein-coding genes in these genomes were divided into 5369 clusters. Among these, 1710 clusters were associated with the core genome, 1182 clusters with strain-specific genes and 2477 clusters with dispensable genomes. COG analysis indicated that 44% of the core genes were devoted to metabolism, mainly energy production and conversion (COG category C) and amino acid transport and metabolism (COG category E). Meanwhile, approximately 35% of the core genes were under positive selection. In addition, 1252 potential essential genes were predicted in the core genome by comparison with a prokaryote database of essential genes. The results suggested that the core genes in Brucella genomes are relatively conserved, and that energy and amino acid metabolism play a more important role in the growth and reproduction of Brucella spp. This study might help us to better understand the mechanisms of Brucella persistent infection and provide some clues for further exploring the gene modules of intracellular survival in Brucella spp.
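
    The split of gene clusters into core, dispensable and strain-specific sets described above can be illustrated with a minimal sketch. It assumes the common convention that a core cluster is present in every genome and a strain-specific cluster in exactly one; the abstract does not state the thresholds or software the authors used. The function name and the toy input are hypothetical.

```python
def partition_pan_genome(clusters, n_genomes):
    """Split gene clusters into core, dispensable and strain-specific groups.

    clusters maps a cluster id to the genomes in which that cluster occurs.
    """
    partition = {"core": [], "dispensable": [], "strain_specific": []}
    for cluster_id, genomes in clusters.items():
        n_present = len(set(genomes))
        if n_present == n_genomes:
            partition["core"].append(cluster_id)             # found in every genome
        elif n_present == 1:
            partition["strain_specific"].append(cluster_id)  # found in a single genome
        else:
            partition["dispensable"].append(cluster_id)      # found in some genomes
    return partition


# Hypothetical toy input with three genomes A, B and C.
toy_clusters = {
    "cluster_1": ["A", "B", "C"],
    "cluster_2": ["A", "B"],
    "cluster_3": ["C"],
    "cluster_4": ["A", "B", "C"],
}
result = partition_pan_genome(toy_clusters, n_genomes=3)
print(result)

# The three categories reported in the abstract add up to the total cluster count.
reported_total = 1710 + 1182 + 2477
print(reported_total)
```

    The last two lines simply confirm that the reported category sizes sum to the 5369 clusters given in the abstract.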

  6. Parallel WGA and WTA for Comparative Genome and Transcriptome NGS Analysis Using Tiny Cell Numbers.

    Science.gov (United States)

    Korfhage, Christian; Fricke, Evelyn; Meier, Andreas

    2015-07-01

    Genomic DNA determines how and when the transcriptome is changed by a trigger or environmental change and how cellular metabolism is influenced. Comparative genome and transcriptome analysis of the same cell sample links a defined genome with all changes in the bases, structure, or numbers of the transcriptome. However, comparative genome and transcriptome analysis using next-generation sequencing (NGS) or real-time PCR is often limited by the small amount of sample available. In mammals, the amount of DNA and RNA in a single cell is ∼10 picograms, but deep analysis of the genome and transcriptome currently requires several hundred nanograms of nucleic acids for library preparation for NGS sequencing. Consequently, accurate whole-genome amplification (WGA) and whole-transcriptome amplification (WTA) is required for such quantitative analysis. This unit describes how the genome and the transcriptome of a tiny number of cells can be amplified in a highly parallel and comparable process. Protocols for quality control of amplified DNA and application of amplified DNA for NGS are included.

  7. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    Full Text Available BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high

  8. GENOME SIZE DETERMINATION AND RAPD ANALYSIS OF FOUR EDIBLE AROIDS OF NORTH EAST INDIA

    Directory of Open Access Journals (Sweden)

    Jyoti P. Saikia, Bolin K. Konwar and Susmita Singh

    2010-10-01

    Full Text Available Four edible aroid species were selected for the study. The genomic DNA of the plants was isolated and estimated. A part of the genomic DNA was used for RAPD analysis using six different primers from Operon Technologies, USA. The genome sizes determined for the aroids are in the order Colocasia esculenta > Xanthosoma caracu > Xanthosoma sagittifolium > Amorphophallus paeonifolius. The Amorphophallus species was found to be 50% similar to both Xanthosoma caracu and Colocasia esculenta. The analysis will provide a basis for exploring the vast, diversified aroid population of the region.

  9. The Challenges of Genome Analysis in the Health Care Setting

    Directory of Open Access Journals (Sweden)

    Anneke Lucassen

    2014-07-01

    Full Text Available Genome sequencing is now a sufficiently mature and affordable technology for clinical use. Its application promises not only to transform clinicians’ diagnostic and predictive ability, but also to improve preventative therapies and surveillance regimes and to tailor patient treatment to an individual’s genetic make-up. However, as with any technological advance, there are associated fresh challenges. While some of the ethical, legal and social aspects resulting from the generation of data from genome sequencing are generic, several nuances are unique. Since the UK government recently announced plans to sequence the genomes of 100,000 Health Service patients, and similar initiatives are being considered elsewhere, a discussion of these nuances is timely and needs to go hand in hand with formulation of guidelines and public engagement activities around implementation of sequencing in clinical practice.

  10. Comparative Genome Analysis of Lolium-Festuca Complex Species

    DEFF Research Database (Denmark)

    Czaban, Adrian; Byrne, Stephen; Sharma, Sapna

    2015-01-01

    The Lolium-Festuca complex incorporates species from the genus Lolium and the broad leaf Fescues. Plants belonging to this complex exhibit significant phenotypic plasticity for agriculturally important traits, such as annuality/perenniality, establishment potential, growth speed, nutritional value, winter hardiness, drought tolerance and resistance to grazing. In this study we have sequenced and assembled the low copy fraction of the genomes of Lolium westerwoldicum, Lolium multiflorum, Festuca pratensis and Lolium temulentum. We have also generated de novo transcriptome assemblies for each species, and these have aided in the annotation of the genomic sequence. Using this data we were able to generate annotated assemblies of the gene rich regions of the four species to complement the already sequenced Lolium perenne genome. Using these gene models we have identified orthologous genes between the species...

  11. Power analysis for genome-wide association studies

    Directory of Open Access Journals (Sweden)

    Klein Robert J

    2007-08-01

    Full Text Available Abstract Background Genome-wide association studies are a promising new tool for deciphering the genetics of complex diseases. To choose the proper sample size and genotyping platform for such studies, power calculations that take into account genetic model, tag SNP selection, and the population of interest are required. Results The power of genome-wide association studies can be computed using a set of tag SNPs and a large number of genotyped SNPs in a representative population, such as available through the HapMap project. As expected, power increases with increasing sample size and effect size. Power also depends on the tag SNPs selected. In some cases, more power is obtained by genotyping more individuals at fewer SNPs than fewer individuals at more SNPs. Conclusion Genome-wide association studies should be designed thoughtfully, with the choice of genotyping platform and sample size being determined from careful power calculations.
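
    The dependence of power on sample size, effect size and tag SNP choice described above can be sketched with a simple normal-approximation calculation for an allelic case-control test. This is an illustrative simplification, not the method of the paper: it assumes a multiplicative (allelic) odds-ratio model, a genome-wide significance level of 5 × 10^-8, and the standard approximation that imperfect tagging scales the effective sample size by the r² between the tag SNP and the causal variant. All function and parameter names are hypothetical.

```python
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by an allelic odds ratio."""
    return odds_ratio * control_freq / (1 + control_freq * (odds_ratio - 1))


def gwas_power(n_cases, n_controls, control_freq, odds_ratio, r2=1.0, alpha=5e-8):
    """Approximate power of a two-sided allelic test at significance level alpha.

    r2 is the linkage disequilibrium between the genotyped tag SNP and the
    causal variant; it scales the effective number of chromosomes.
    """
    nd = NormalDist()
    p1 = case_allele_freq(control_freq, odds_ratio)
    p0 = control_freq
    # Each individual contributes two chromosomes.
    variance = (p1 * (1 - p1) / (2 * n_cases * r2)
                + p0 * (1 - p0) / (2 * n_controls * r2))
    z = abs(p1 - p0) / variance ** 0.5
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(z - z_crit) + nd.cdf(-z - z_crit)


for n in (1000, 2000, 4000):
    perfect = gwas_power(n, n, control_freq=0.3, odds_ratio=1.3)
    tagged = gwas_power(n, n, control_freq=0.3, odds_ratio=1.3, r2=0.8)
    print(f"n = {n} per group: power {perfect:.3f} (r2 = 1.0), {tagged:.3f} (r2 = 0.8)")
```

    Under these assumptions, power rises with sample size and falls as tagging becomes less complete, which is the trade-off between genotyping more individuals and genotyping more SNPs that the abstract points to.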

  12. Ten years of maintaining and expanding a microbial genome and metagenome analysis system.

    Science.gov (United States)

    Markowitz, Victor M; Chen, I-Min A; Chu, Ken; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2015-11-01

    Launched in March 2005, the Integrated Microbial Genomes (IMG) system is a comprehensive data management system that supports multidimensional comparative analysis of genomic data. At the core of the IMG system is a data warehouse that contains genome and metagenome datasets sequenced at the Joint Genome Institute or provided by scientific users, as well as public genome datasets available at the National Center for Biotechnology Information GenBank sequence data archive. Genome and metagenome datasets are processed using IMG's microbial genome and metagenome sequence data processing pipelines and are integrated into the data warehouse using IMG's data integration toolkits. Microbial genome and metagenome application-specific data marts and user interfaces provide access to different subsets of IMG's data and analysis toolkits. This review article revisits IMG's original aims, highlights key milestones reached by the system during the past 10 years, and discusses the main challenges faced by a rapidly expanding system, in particular the complexity of maintaining such a system in an academic setting with limited budgets and computing and data management infrastructure.

  13. Chloroplast genome analysis of Australian eucalypts--Eucalyptus, Corymbia, Angophora, Allosyncarpia and Stockwellia (Myrtaceae).

    Science.gov (United States)

    Bayly, Michael J; Rigault, Philippe; Spokevicius, Antanas; Ladiges, Pauline Y; Ades, Peter K; Anderson, Charlotte; Bossinger, Gerd; Merchant, Andrew; Udovicic, Frank; Woodrow, Ian E; Tibbits, Josquin

    2013-12-01

    We present a phylogenetic analysis and comparison of structural features of chloroplast genomes for 39 species of the eucalypt group (genera Eucalyptus, Corymbia, Angophora, and outgroups Allosyncarpia and Stockwellia). We use 41 complete chloroplast genome sequences, adding 39 finished-quality chloroplast genomes to two previously published genomes. Maximum parsimony and Bayesian analyses, based on >7000 variable nucleotide positions, produced one fully resolved phylogenetic tree (35 supported nodes, 27 with 100% bootstrap support). Eucalyptus and its sister lineage Angophora+Corymbia show a deep divergence. Within Eucalyptus, three lineages are resolved: the 'eudesmid', 'symphyomyrt' and 'monocalypt' groups. Corymbia is paraphyletic with respect to Angophora. Gene content and order do not vary among eucalypt chloroplasts; length mutations, especially frame shifts, are uncommon in protein-coding genes. Some non-synonymous mutations are highly incongruent with the overall phylogenetic signal, notably in rbcL, and may be adaptive. Application of custom informatics pipelines (GYDLE Inc.) enabled direct chloroplast genome assembly, resolving each genome to finished-quality with no need for PCR gap-filling or contig order resolution. Analysis of whole chloroplast genomes resolved major eucalypt clades and revealed variable regions of the genome that will be useful in lower-level genetic studies (including phylogeography and geneflow).

  14. The Genome of Nosema sp. Isolate YNPr: A Comparative Analysis of Genome Evolution within the Nosema/Vairimorpha Clade

    Science.gov (United States)

    Ma, Zhenggang; Li, Tian; Zhang, Xiaoyan; Debrunner-Vossbrinck, Bettina A.; Zhou, Zeyang; Vossbrinck, Charles R.

    2016-01-01

    The microsporidian parasite designated here as Nosema sp. Isolate YNPr was isolated from the cabbage butterfly Pieris rapae collected in Honghe Prefecture, Yunnan Province, China. The genome was sequenced by Illumina sequencing and compared to those of two related members of the Nosema/Vairimorpha clade, Nosema ceranae and Nosema apis. Based upon assembly statistics, the Nosema sp. YNPr genome is 3.36 × 10^6 bp with a G+C content of 23.18% and 2,075 protein coding sequences. An “ACCCTT” motif is present approximately 50 bp upstream of the start codon, as reported from other members of the clade and from Encephalitozoon cuniculi, a sister taxon. Comparative small subunit ribosomal DNA (SSU rDNA) analysis as well as genome-wide phylogenetic analysis confirms a closer relationship between N. ceranae and Nosema sp. YNPr than between the two honeybee parasites N. ceranae and N. apis. The more closely related N. ceranae and Nosema sp. YNPr show similarities in a number of structural characteristics such as gene synteny, gene length, gene number, transposon composition and gene reduction. Based on transposable element content of the assemblies, the transposon content of Nosema sp. YNPr is 4.8%, that of N. ceranae is 3.7%, and that of N. apis is 2.5%, with large differences in the types of transposons present among these 3 species. Gene function annotation indicates that the number of genes participating in most metabolic activities is similar in all three species. However, the number of genes in the transcription, general function, and cysteine protease categories is greater in N. apis than in the other two species. Our studies further characterize the evolution of the Nosema/Vairimorpha clade of microsporidia. These organisms maintain variable but very reduced genomes. We are interested in understanding the effects of genetic drift versus natural selection on genome size in the microsporidia and in developing a testable hypothesis for further studies on the genomic
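
    Assembly statistics such as total length and G+C content, as quoted in the record above, can be derived from contig sequences along the following lines. This is a minimal sketch, not the pipeline used by the authors: the function name is hypothetical, the input is assumed to be a list of plain nucleotide strings, and ambiguous bases (for example N) are assumed to count toward length but not toward the G+C denominator.

```python
def assembly_stats(contigs):
    """Return (total_length_bp, gc_percent) for a list of contig strings.

    Assumption: G+C content is computed over unambiguous A/C/G/T bases only,
    while total length counts every character in every contig.
    """
    total_length = 0
    gc_count = 0
    acgt_count = 0
    for contig in contigs:
        seq = contig.upper()
        total_length += len(seq)
        g_and_c = seq.count("G") + seq.count("C")
        gc_count += g_and_c
        acgt_count += g_and_c + seq.count("A") + seq.count("T")
    # Avoid division by zero for empty or fully ambiguous input.
    gc_percent = 100.0 * gc_count / acgt_count if acgt_count else 0.0
    return total_length, gc_percent


if __name__ == "__main__":
    # Toy contigs for illustration only; not data from the study.
    length, gc = assembly_stats(["ATGC", "GGCC", "ATAT"])
    print(f"{length} bp, G+C = {gc:.2f}%")
```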

  15. Complete genome sequence and comparative genome analysis of a new special Yersinia enterocolitica.

    Science.gov (United States)

    Shi, Guoxiang; Su, Mingming; Liang, Junrong; Duan, Ran; Gu, Wenpeng; Xiao, Yuchun; Zhang, Zhewen; Qiu, Haiyan; Zhang, Zheng; Li, Yi; Zhang, Xiaohe; Ling, Yunchao; Song, Lai; Chen, Meili; Zhao, Yongbing; Wu, Jiayan; Jing, Huaiqi; Xiao, Jingfa; Wang, Xin

    2016-09-01

    Yersinia enterocolitica is the most diverse species in the genus Yersinia and shows extensive polymorphism, especially among the non-pathogenic strains. Individual non-pathogenic Y. enterocolitica strains are misidentified because of atypical phenotypes. In this study, we isolated an unusual Y. enterocolitica strain, LC20, from Rattus norvegicus. The strain did not utilize urea and could not be assigned to a biotype. API 20E identified it as Escherichia coli; however, it grew well at 25 °C, whereas E. coli grows well at 37 °C. We analyzed the genome of LC20 and found that the whole chromosome of LC20 was collinear with that of Y. enterocolitica 8081 and that the urease gene was absent from the genome, which is consistent with the result of API 20E. Also, the 16S and 23S rRNA genes of LC20 lay on a branch of Y. enterocolitica. Furthermore, the core-based and pan-based phylogenetic trees showed that LC20 was classified into the Y. enterocolitica cluster. Two plasmids (80 and 50 kb) from LC20 shared low genetic homology with pYV from the Yersinia genus; one was an ancestral Yersinia plasmid and the other was novel, encoding a number of transposases. Some pathogenic and non-pathogenic Y. enterocolitica-specific genes coexisted in LC20. Thus, although it could not be classified into any Y. enterocolitica biotype due to its special biochemical metabolism, we concluded that LC20 was a Y. enterocolitica strain because its genome was similar to those of other Y. enterocolitica strains, and that it might be a strain in which many mutations and combinations emerged during its evolution.

  16. Genome-wide analysis reveals a complex pattern of genomic imprinting in mice.

    Directory of Open Access Journals (Sweden)

    Jason B Wolf

    2008-06-01

    Full Text Available. Parent-of-origin-dependent gene expression resulting from genomic imprinting plays an important role in modulating complex traits ranging from developmental processes to cognitive abilities and associated disorders. However, while gene-targeting techniques have allowed for the identification of imprinted loci, very little is known about the contribution of imprinting to quantitative variation in complex traits. Most studies, furthermore, assume a simple pattern of imprinting, resulting in either paternal or maternal gene expression; yet, more complex patterns of effects also exist. As a result, the distribution and number of different imprinting patterns across the genome remain largely unexplored. We address these unresolved issues using a genome-wide scan for imprinted quantitative trait loci (iQTL) affecting body weight and growth in mice using a novel three-generation design. We identified ten iQTL that display much more complex and diverse effect patterns than previously assumed, including four loci with effects similar to the callipyge mutation found in sheep. Three loci display a new phenotypic pattern that we refer to as bipolar dominance, where the two heterozygotes are different from each other while the two homozygotes are identical to each other. Our study furthermore detected a paternally expressed iQTL on Chromosome 7 in a region containing a known imprinting cluster with many paternally expressed genes. Surprisingly, the effects of the iQTL were mostly restricted to traits expressed after weaning. Our results imply that the quantitative effects of an imprinted allele at a locus depend both on its parent of origin and the allele it is paired with. Our findings also show that the imprinting pattern of a locus can be variable over ontogenetic time and, in contrast to current views, may often be stronger at later stages in life.