WorldWideScience

Sample records for evolution sequence analysis

  1. Evolution Analysis of Simple Sequence Repeats in Plant Genome.

    Science.gov (United States)

    Qin, Zhen; Wang, Yanping; Wang, Qingmei; Li, Aixian; Hou, Fuyun; Zhang, Liming

    2015-01-01

    Simple sequence repeats (SSRs) are widespread units in genome sequences and play many important roles in plants. To reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analyzing twelve sequenced plant genomes. First, in the twelve studied plant genomes, the main SSRs were those containing repeats of 1-3 nucleotide combinations. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With increasing SSR repeat number, the percentage of A/T in C. reinhardtii showed no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom, and the repeat number increased in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to that of AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T showed a rising trend along with the evolution of the plant kingdom; meanwhile, with increasing SSR repeat number, different species favored different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only followed a general pattern in the evolution of the plant kingdom but also were associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provides new insights for the analysis of plant genome evolution.
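
    As a rough illustration of the kind of survey this abstract describes, the sketch below scans a sequence for perfect SSRs with 1-3 nucleotide units and reports the share of mononucleotide SSRs made of A or T. It is not the authors' pipeline: the function names, the minimum repeat threshold of 4, and the example sequence are all assumptions chosen for illustration.

```python
import re

def find_ssrs(seq, max_unit=3, min_repeats=4):
    """Return (start, unit, repeat_count) for each perfect SSR found in seq.

    max_unit and min_repeats are illustrative defaults, not values from the study.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_repeats - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are repeats of a shorter motif (e.g. "AA" or "AAA"),
            # so a mononucleotide run is not also counted as a di- or trinucleotide SSR.
            if unit in (unit + unit)[1:-1]:
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits

def mono_at_share(hits):
    """Fraction of mononucleotide SSRs whose unit is A or T (None if there are none)."""
    mono = [unit for _, unit, _ in hits if len(unit) == 1]
    if not mono:
        return None
    return sum(unit in "AT" for unit in mono) / len(mono)

# Hypothetical toy sequence: one A run and one AT dinucleotide repeat.
example = "CG" + "A" * 6 + "CG" + "AT" * 5 + "GC"
hits = find_ssrs(example)
```

    Running the same scan over whole genomes and grouping the hits by unit length and repeat count would give the per-species percentages the abstract compares; real surveys typically use unit-specific repeat thresholds rather than a single cutoff.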

  2. Rational engineering of enzyme allosteric regulation through sequence evolution analysis.

    Directory of Open Access Journals (Sweden)

    Jae-Seong Yang

    Control of enzyme allosteric regulation is required to drive metabolic flux toward desired levels. Although the three-dimensional (3D) structures of many enzyme-ligand complexes are available, it is still difficult to rationally engineer an allosterically regulatable enzyme without decreasing its catalytic activity. Here, we describe an effective strategy to deregulate the allosteric inhibition of enzymes based on the molecular evolution and physicochemical characteristics of allosteric ligand-binding sites. We found that allosteric sites are evolutionarily variable and comprised of more hydrophobic residues than catalytic sites. We applied our findings to design mutations in selected target residues that deregulate the allosteric activity of fructose-1,6-bisphosphatase (FBPase). Specifically, charged amino acids at less conserved positions were substituted with hydrophobic or neutral amino acids of similar size. The engineered proteins successfully diminished the allosteric inhibition of E. coli FBPase without affecting its catalytic efficiency. We expect that our method will aid the rational design of enzyme allosteric regulation strategies and facilitate the control of metabolic flux.

  3. Bio++: a set of C++ libraries for sequence analysis, phylogenetics, molecular evolution and population genetics

    Directory of Open Access Journals (Sweden)

    Galtier Nicolas

    2006-04-01

    Background: A large number of bioinformatics applications in the fields of bio-sequence analysis, molecular evolution and population genetics typically share input/output methods, data storage requirements and data analysis algorithms. Such common features may be conveniently bundled into re-usable libraries, which enable the rapid development of new methods and robust applications. Results: We present Bio++, a set of object-oriented libraries written in C++. Available components include classes for data storage and handling (nucleotide/amino-acid/codon sequences, trees, distance matrices, population genetics datasets), various input/output formats, basic sequence manipulation (concatenation, transcription, translation, etc.), phylogenetic analysis (maximum parsimony, Markov models, distance methods, likelihood computation and maximization), population genetics/genomics (diversity statistics, neutrality tests, various multi-locus analyses) and various algorithms for numerical calculus. Conclusion: Implementation of methods aims at being both efficient and user-friendly. Special attention was given to the library design to enable easy extension and new methods development. We defined a general hierarchy of classes that allows developers to implement their own algorithms while remaining compatible with the rest of the libraries. Bio++ source code is distributed free of charge under the CeCILL general public licence from its website http://kimura.univ-montp2.fr/BioPP.

  4. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    Energy Technology Data Exchange (ETDEWEB)

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  5. MBEToolbox: a Matlab toolbox for sequence data analysis in molecular biology and evolution

    Directory of Open Access Journals (Sweden)

    Xia Xuhua

    2005-03-01

    Background: MATLAB is a high-performance language for technical computing, integrating computation, visualization, and programming in an easy-to-use environment. It has been widely used in many areas, such as mathematics and computation, algorithm development, data acquisition, modeling, simulation, and scientific and engineering graphics. However, few functions are freely available in MATLAB to perform the sequence data analyses specifically required for molecular biology and evolution. Results: We have developed a MATLAB toolbox, called MBEToolbox, aimed at filling this gap by offering efficient implementations of the most needed functions in molecular biology and evolution. It can be used to manipulate aligned sequences, calculate evolutionary distances, estimate synonymous and nonsynonymous substitution rates, and infer phylogenetic trees. Moreover, it provides an extensible, functional framework for users with more specialized requirements to explore and analyze aligned nucleotide or protein sequences from an evolutionary perspective. The full functions in the toolbox are accessible through the command line for seasoned MATLAB users. A graphical user interface, which may be especially useful for non-specialist end users, is also provided. Conclusion: MBEToolbox is a useful tool that can aid in the exploration, interpretation and visualization of data in molecular biology and evolution. The software is publicly available at http://web.hku.hk/~jamescai/mbetoolbox/ and http://bioinformatics.org/project/?group_id=454.
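
    To make "calculate evolutionary distances" concrete, here is a minimal sketch of one standard estimator, the Jukes-Cantor distance, computed from two aligned nucleotide sequences. It is written in Python for illustration and is not MBEToolbox code; the function names and the choice to ignore gap or ambiguous columns are assumptions.

```python
import math

def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap or ambiguous base are ignored
    (an illustrative choice, not necessarily what any given toolbox does).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(seq_a.upper(), seq_b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor corrected distance: d = -3/4 * ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined at saturation.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical 10-site alignment with one difference (p = 0.1).
d = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAT")
```

    The corrected distance is slightly larger than the raw proportion of differences because it accounts for multiple substitutions at the same site.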

  6. Repetitive sequence analysis and karyotyping reveal different genome evolution and speciation of diploid and tetraploid Tripsacum dactyloides

    Directory of Open Access Journals (Sweden)

    Qilin Zhu

    2016-08-01

    In the subtribe Maydeae, Tripsacum and Zea are closely related genera. Tripsacum is a horticultural crop widely used as pasture forage. Previous studies suggested that Tripsacum might play an important role in maize origin and evolution. However, our understanding of the genomics and the evolution of Tripsacum remains limited. In this study, two diploids, T. dactyloides var. meridionale (2n = 36, MR) and T. dactyloides (2n = 36, DD), and one tetraploid, T. dactyloides (2n = 72, DL), were sequenced by low-coverage genome sequencing followed by graph-based cluster analysis. The results showed that 63.23%, 59.20%, and 61.57% of the respective genomes of MR, DD, and DL were repetitive DNA sequences. The proportions of different repetitive sequences varied greatly among the three species. Fluorescence in situ hybridization (FISH) analysis of mitotic metaphase chromosomes with satellite repeats as the probes showed that the FISH signal patterns of DL were more similar to those of DD than to those of MR. Comparative analysis of the repeats also showed that DL shared more common repeat families with DD than with MR. Phylogenetic analysis of internal transcribed spacer region sequences further supported the evolutionary relationship among the three species. Comparison of repetitive sequences showed that Tripsacum shared more repeat families with Zea than with Coix and Sorghum. Our study sheds new light on the genomics of Tripsacum and differential speciation in the Poaceae family.

  7. Evolution of DNA sequencing.

    Science.gov (United States)

    Tipu, Hamid Nawaz; Shabbir, Ambreen

    2015-03-01

    Sanger and coworkers introduced DNA sequencing for the first time in the 1970s. It principally relied on termination of the growing nucleotide chain when a dideoxythymidine triphosphate (ddTTP) was inserted in it. Detection of terminated sequences was done radiographically on Polyacrylamide Gel Electrophoresis (PAGE). Improvements that have evolved over time in original Sanger sequencing include replacement of radiography with fluorescence, use of separate fluorescent markers for each nucleotide, use of capillary electrophoresis instead of polyacrylamide gel electrophoresis and then introduction of capillary array electrophoresis. However, this technique suffered from a few inherent limitations, such as decreased sensitivity for low-level mutant alleles, complexities in analyzing highly polymorphic regions like the Major Histocompatibility Complex (MHC), and the high DNA concentrations required. Several Next Generation Sequencing (NGS) technologies that tend to overcome the limitations of Sanger sequencing have been introduced by Roche, Illumina and other commercial manufacturers and are reviewed here. Introduction of NGS in clinical research and medical diagnostics is expected to change the entire diagnostic approach. Applications include the study of cancer variants, detection of minimal residual disease, exome sequencing, detection of Single Nucleotide Polymorphisms (SNPs) and their disease association, epigenetic regulation of gene expression and sequencing of microorganism genomes.

  8. Expressed sequence tags as a tool for phylogenetic analysis of placental mammal evolution.

    Directory of Open Access Journals (Sweden)

    Morgan Kullberg

    BACKGROUND: We investigate the usefulness of expressed sequence tags, ESTs, for establishing divergences within the tree of placental mammals. This is done on the example of the established relationships among primates (human), lagomorphs (rabbit), rodents (rat and mouse), artiodactyls (cow), carnivorans (dog) and proboscideans (elephant). METHODOLOGY/PRINCIPAL FINDINGS: We have produced 2000 ESTs (1.2 megabases) from a marsupial mouse and characterized the data for their use in phylogenetic analysis. The sequences were used to identify putative orthologous sequences from whole genome projects. Although most ESTs stem from single sequence reads, the frequency of potential sequencing errors was found to be lower than allelic variation. Most of the sequences represented slowly evolving housekeeping-type genes, with an average amino acid distance of 6.6% between human and mouse. Positive Darwinian selection was identified at only a few single sites. Phylogenetic analyses of the EST data yielded trees that were consistent with those established from whole genome projects. CONCLUSIONS: The general quality of EST sequences and the general absence of positive selection in these sequences make ESTs an attractive tool for phylogenetic analysis. The EST approach allows, at reasonable costs, a fast extension of data sampling from species outside the genome projects.

  9. Complete Chloroplast Genome Sequence of Aquilaria sinensis (Lour.) Gilg and the Evolution Analysis within the Malvales order

    Directory of Open Access Journals (Sweden)

    Ying Wang

    2016-03-01

    Aquilaria sinensis (Lour.) Gilg is an important medicinal woody plant producing agarwood, which is widely used in traditional Chinese medicine. High-throughput sequencing of chloroplast (cp) genomes has enhanced the understanding of evolutionary relationships within plant families. In this study, we determined the complete cp genome sequence of A. sinensis. The size of the A. sinensis cp genome was 159,565 bp. This genome included a large single-copy region of 87,482 bp, a small single-copy region of 19,857 bp, and a pair of inverted repeats (IRa and IRb) of 26,113 bp each. The GC content of the genome was 37.11%. The A. sinensis cp genome encoded 113 functional genes, including 82 protein-coding genes, 27 tRNA genes, and 4 rRNA genes. Seven genes were duplicated in the protein-coding genes, whereas 11 genes were duplicated in the RNA genes. A total of 45 polymorphic simple-sequence repeat loci and 60 pairs of large repeats were identified. Most simple-sequence repeats were located in the noncoding sections of the large single-copy/small single-copy region and exhibited high A/T content. Moreover, 33 pairs of large repeat sequences were located in the protein-coding genes, whereas 27 pairs were located in the intergenic regions. Codon usage in the A. sinensis cp genome was biased toward codons ending in A/T. The distribution of codon usage in the A. sinensis cp genome was most similar to that in the Gonystylus bancanus cp genome. Comparative results of 82 protein-coding genes from cp genomes of 29 species demonstrated that A. sinensis was a sister species to G. bancanus within the Malvales order. The A. sinensis cp genome presented the highest sequence similarity (>90%) with the G. bancanus cp genome when compared using the CGView Comparison Tool. This finding strongly supports the placement of A. sinensis as a sister to G. bancanus within the Malvales order. The complete A. sinensis cp genome information will be highly beneficial for further studies on this traditional
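
    The region sizes reported in this abstract can be cross-checked with simple arithmetic: a quadripartite chloroplast genome is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The short sketch below performs that check and shows how a GC percentage is computed from a sequence; the function names and the toy sequence are illustrative assumptions.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a chloroplast genome with LSC, SSC and two IR copies."""
    return lsc + ssc + 2 * ir

def gc_content(seq):
    """GC percentage of a nucleotide sequence, counting only A, C, G and T."""
    seq = seq.upper()
    gc = sum(base in "GC" for base in seq)
    total = sum(base in "ACGT" for base in seq)
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / total

# Region sizes as reported in the abstract (bp).
total_bp = quadripartite_length(lsc=87_482, ssc=19_857, ir=26_113)

# Hypothetical toy sequence, only to show the GC calculation.
toy_gc = gc_content("GGCCAATT")
```

    The sum of the reported regions equals the reported genome size of 159,565 bp, so the figures in the abstract are internally consistent.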

  10. Chromosome-scale comparative sequence analysis unravels molecular mechanisms of genome evolution between two wheat cultivars

    KAUST Repository

    Thind, Anupriya Kaur

    2018-02-08

    Background: Recent improvements in DNA sequencing and genome scaffolding have paved the way to generate high-quality de novo assemblies of pseudomolecules representing complete chromosomes of wheat and its wild relatives. These assemblies form the basis to compare the evolutionary dynamics of wheat genomes on a megabase-scale. Results: Here, we provide a comparative sequence analysis of the 700-megabase chromosome 2D between two bread wheat genotypes, the old landrace Chinese Spring and the elite Swiss spring wheat line CH Campala Lr22a. There was a high degree of sequence conservation between the two chromosomes. Analysis of large structural variations revealed four large insertions/deletions (InDels) of >100 kb. Based on the molecular signatures at the breakpoints, unequal crossing over and double-strand break repair were identified as the evolutionary mechanisms that caused these InDels. Three of the large InDels affected copy number of NLRs, a gene family involved in plant immunity. Analysis of single nucleotide polymorphism (SNP) density revealed three haploblocks of 8 Mb, 9 Mb and 48 Mb with a 35-fold increased SNP density compared to the rest of the chromosome. Conclusions: This comparative analysis of two high-quality chromosome assemblies enabled a comprehensive assessment of large structural variations. The insight obtained from this analysis will form the basis of future wheat pan-genome studies.
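
    The haploblocks in this abstract are defined by SNP density, which is usually computed by counting SNPs in fixed-size windows along the chromosome. The sketch below shows one simple way to do that; the function name, the 0-based coordinates, the window size and the SNP positions are all hypothetical, and the authors' actual procedure may differ.

```python
def snp_density_per_mb(positions, chrom_length, window):
    """Return SNPs per megabase for consecutive windows along a chromosome.

    positions are 0-based SNP coordinates; the last window may be shorter
    than the others and is normalized by its actual width.
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if not 0 <= pos < chrom_length:
            raise ValueError("SNP position %d is outside the chromosome" % pos)
        counts[pos // window] += 1
    densities = []
    for index, count in enumerate(counts):
        width = min(window, chrom_length - index * window)
        densities.append(count * 1_000_000 / width)
    return densities

# Hypothetical example: a 2 Mb chromosome scanned in 1 Mb windows.
densities = snp_density_per_mb([10, 20, 1_500_000], 2_000_000, 1_000_000)
```

    Windows whose density is many times the chromosome-wide background, merged when adjacent, would correspond to candidate haploblocks of the kind described above.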

  11. Complete genomic sequence analysis of infectious bronchitis virus Ark DPI strain and its evolution by recombination

    Directory of Open Access Journals (Sweden)

    Gelb Jack

    2008-12-01

    An infectious bronchitis virus Arkansas DPI (Ark DPI) virulent strain was sequenced, analyzed and compared with many different IBV strains and coronaviruses. The genome of Ark DPI consists of 27,620 nucleotides, excluding the poly(A) tail, and comprises ten open reading frames. Comparative sequence analysis of Ark DPI with other IBV strains shows striking similarity to the Conn, Gray, JMK, and Ark 99 strains, which were circulating during that time period. Furthermore, comparison of the Ark genome with other coronaviruses demonstrates a close relationship to turkey coronavirus. Among non-structural genes, the 5' untranslated region (UTR), 3C-like proteinase (3CLpro) and the polymerase (RdRp) sequences are 100% identical to the Gray strain. Among structural genes, S1 has 97% identity with Ark 99; S2 has 100% identity with JMK and 96% to Conn; 3b 99%, and 3C to N is 100% identical to the Conn strain. Possible recombination sites were found at the intergenic region of the spike gene, the 3' end of S1 and the 3a gene. Independent recombination events may have occurred in the entire genome of Ark DPI, involving four different IBV strains, suggesting that genomic RNA recombination may occur in any part of the genome at a number of sites. Hence, we speculate that the Ark DPI strain originated from the Conn strain, but diverged and evolved independently by point mutations and recombination between field strains.

  12. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution

    NARCIS (Netherlands)

    Hillier, L.W.; Miller, W.; Birney, E.; Groenen, M.A.M.; Crooijmans, R.P.M.A.; Aerts, J.; Poel, van der J.J.

    2004-01-01

    We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome—composed of approximately one billion base pairs of sequence

  13. BAC-end sequences analysis provides first insights into coffee (Coffea canephora P.) genome composition and evolution.

    Science.gov (United States)

    Dereeper, Alexis; Guyot, Romain; Tranchant-Dubreuil, Christine; Anthony, François; Argout, Xavier; de Bellis, Fabien; Combes, Marie-Christine; Gavory, Frederick; de Kochko, Alexandre; Kudrna, Dave; Leroy, Thierry; Poulain, Julie; Rondeau, Myriam; Song, Xiang; Wing, Rod; Lashermes, Philippe

    2013-10-01

    Coffee is one of the world's most important agricultural commodities. Coffee belongs to the Rubiaceae family in the euasterid I clade of dicotyledonous plants, to which the Solanaceae family also belongs. Two bacterial artificial chromosome (BAC) libraries of a homozygous doubled haploid plant of Coffea canephora were constructed using two enzymes, HindIII and BstYI. A total of 134,827 high quality BAC-end sequences (BESs) were generated from the 73,728 clones of the two libraries, and 131,412 BESs were conserved for further analysis after elimination of chloroplast and mitochondrial sequences. This corresponded to almost 13 % of the estimated size of the C. canephora genome. 6.7 % of BESs contained simple sequence repeats, the most abundant (47.8 %) being mononucleotide motifs. These sequences allow the development of numerous useful marker sites. Potential transposable elements (TEs) represented 11.9 % of the full length BESs. A difference was observed between the BstYI and HindIII libraries (14.9 vs. 8.8 %). Analysis of BESs against known coding sequences of TEs indicated that 11.9 % of the genome corresponded to known repeat sequences, as in other flowering plants. The number of genes in the coffee genome was estimated at 41,973, which is probably an overestimate. Comparative genome mapping revealed that microsynteny was higher between coffee and grapevine than between coffee and tomato or Arabidopsis. BESs constitute valuable resources for the first genome wide survey of coffee and provide new insights into the composition and evolution of the coffee genome.

  14. Insights into the evolution of the ErbB receptor family and their ligands from sequence analysis

    Directory of Open Access Journals (Sweden)

    Staros James V

    2006-10-01

    Background: In the time since we presented the first molecular evolutionary study of the ErbB family of receptors and the EGF family of ligands, there has been a dramatic increase in genomic sequences available. We have utilized this greatly expanded data set in this study of the ErbB family of receptors and their ligands. Results: In our previous analysis we postulated that EGF family ligands could be characterized by the presence of a splice site in the coding region between the fourth and fifth cysteines of the EGF module and the placement of that module near the transmembrane domain. The recent identification of several new ligands for the ErbB receptors supports this characterization of an ErbB ligand; further, applying this characterization to available sequences suggests additional potential ligands for these receptors, the EGF modules from previously identified proteins: interphotoreceptor matrix proteoglycan-2, the alpha and beta subunit of meprin A, and mucins 3, 4, 12, and 17. The newly available sequences have caused some reorganizations of relationships among the ErbB ligand family, but they add support to the previous conclusion that three gene duplication events gave rise to the present family of four ErbB receptors among the tetrapods. Conclusion: This study provides strong support for the hypothesis that the presence of an easily identifiable sequence motif can distinguish EGF family ligands from other EGF-like modules and reveals several potential new EGF family ligands. It also raises interesting questions about the evolution of ErbB2 and ErbB3: Does ErbB2 in teleosts function differently from ErbB2 in tetrapods in terms of ligand binding and intramolecular tethering? When did ErbB3 lose kinase activity, and what is the functional significance of the divergence of its kinase domain among teleosts?

  15. Comparative analysis of secreted protein evolution using expressed sequence tags from four poplar leaf rusts (Melampsora spp.)

    Directory of Open Access Journals (Sweden)

    Tanguay Philippe

    2010-07-01

    Background: Obligate biotrophs such as rust fungi are believed to establish long-term relationships by modulating plant defenses through a plethora of effector proteins, whose most recognizable feature is the presence of a signal peptide for secretion. Since the phenotypes of these effectors extend to host cells, their genes are expected to be under accelerated evolution stimulated by host-pathogen coevolutionary arms races. Recently, whole genome sequence data has allowed the prediction of secretomes, facilitating the identification of putative effectors. Results: We generated cDNA libraries from four poplar leaf rust pathogens (Melampsora spp.) and used computational approaches to identify and annotate putative secreted proteins with the aim of uncovering new knowledge about the nature and evolution of the rust secretome. While more than half of the predicted secretome members encoded lineage-specific proteins, similarities with experimentally characterized fungal effectors were also identified. A SAGE analysis indicated a strong stage-specific regulation of transcripts encoding secreted proteins. The average sequence identity of putative secreted proteins to their closest orthologs in the wheat stem rust Puccinia graminis f. sp. tritici was dramatically reduced compared with non-secreted ones. A comparative genomics approach based on homologous gene groups unravelled positive selection in putative members of the secretome. Conclusion: We uncovered robust evidence that different evolutionary constraints are acting on the rust secretome when compared to the rest of the genome. These results are consistent with the view that these genes are more likely to exhibit an effector activity and be involved in coevolutionary arms races with host factors.

  16. DNA sequence analysis tells the truth of the origin, propagation, and evolution of chili (red pepper)

    Directory of Open Access Journals (Sweden)

    Hye Jeong Yang

    2017-09-01

    The theory that Korean chili (kochu) comes from Japan is the biggest culprit responsible for distorting the history and value of Korean chili and kimchi. It has also spawned a number of other idle theories. This paper aims to correct this misconception through scientific analysis and ultimately restore the truth about the history and culture of Korean fermented foods.

  17. Comparative analysis of function and interaction of transcription factors in nematodes: Extensive conservation of orthology coupled to rapid sequence evolution

    Directory of Open Access Journals (Sweden)

    Singh Rama S

    2008-08-01

    Background: Much of the morphological diversity in eukaryotes results from differential regulation of gene expression in which transcription factors (TFs) play a central role. The nematode Caenorhabditis elegans is an established model organism for the study of the roles of TFs in controlling the spatiotemporal pattern of gene expression. Using the fully sequenced genomes of three Caenorhabditid nematode species as well as genome information from additional more distantly related organisms (fruit fly, mouse, and human) we sought to identify orthologous TFs and characterized their patterns of evolution. Results: We identified 988 TF genes in C. elegans, and inferred corresponding sets in C. briggsae and C. remanei, containing 995 and 1093 TF genes, respectively. Analysis of the three gene sets revealed 652 3-way reciprocal 'best hit' orthologs (nematode TF set), approximately half of which are zinc finger (ZF-C2H2 and ZF-C4/NHR types) and HOX family members. Examination of the TF genes in C. elegans and C. briggsae identified the presence of significant tandem clustering on chromosome V, the majority of which belong to the ZF-C4/NHR family. We also found evidence for lineage-specific duplications and rapid evolution of many of the TF genes in the two species. A search of the TFs conserved among nematodes in Drosophila melanogaster, Mus musculus and Homo sapiens revealed 150 reciprocal orthologs, many of which are associated with important biological processes and human diseases. Finally, a comparison of the sequence, gene interactions and function indicates that nematode TFs conserved across phyla exhibit significantly more interactions and are enriched in genes with annotated mutant phenotypes compared to those that lack orthologs in other species. Conclusion: Our study represents the first comprehensive genome-wide analysis of TFs across three nematode species and other organisms. The findings indicate substantial conservation of transcription
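
    The "3-way reciprocal 'best hit'" criterion mentioned in this abstract can be sketched as follows. The code assumes the best hit of each gene in each other species has already been computed (for example from similarity searches) and is supplied as plain dictionaries; the function names, the requirement that all three pairs be mutually reciprocal, and the gene identifiers are assumptions for illustration, not the authors' implementation.

```python
def reciprocal_best_hits(best_ab, best_ba):
    """Pairs (a, b) where b is a's best hit in species B and a is b's best hit in species A."""
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)

def three_way_reciprocal_best_hits(best_ab, best_ba, best_ac, best_ca, best_bc, best_cb):
    """Triples (a, b, c) in which every pair of genes is a reciprocal best hit."""
    triples = []
    for a, b in reciprocal_best_hits(best_ab, best_ba):
        c = best_ac.get(a)
        if c is None:
            continue
        if best_ca.get(c) == a and best_bc.get(b) == c and best_cb.get(c) == b:
            triples.append((a, b, c))
    return triples

# Hypothetical best-hit tables for three species A, B and C.
best_ab = {"a1": "b1", "a2": "b2", "a3": "b1"}
best_ba = {"b1": "a1", "b2": "a2"}
best_ac = {"a1": "c1", "a2": "c2"}
best_ca = {"c1": "a1", "c2": "a9"}
best_bc = {"b1": "c1", "b2": "c2"}
best_cb = {"c1": "b1", "c2": "b2"}

pairs = reciprocal_best_hits(best_ab, best_ba)
triples = three_way_reciprocal_best_hits(best_ab, best_ba, best_ac, best_ca, best_bc, best_cb)
```

    In this toy data the gene a2 pairs reciprocally with b2 but not with c2, so only one triple survives the 3-way filter.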

  18. Myocardial tagging by Cardiovascular Magnetic Resonance: evolution of techniques--pulse sequences, analysis algorithms, and applications

    Science.gov (United States)

    2011-01-01

    Cardiovascular magnetic resonance (CMR) tagging has been established as an essential technique for measuring regional myocardial function. It allows quantification of local intramyocardial motion measures, e.g. strain and strain rate. The invention of CMR tagging came in the late eighties, where the technique allowed for the first time for visualizing transmural myocardial movement without having to implant physical markers. This new idea opened the door for a series of developments and improvements that continue up to the present time. Different tagging techniques are currently available that are more extensive, improved, and sophisticated than they were twenty years ago. Each of these techniques has different versions for improved resolution, signal-to-noise ratio (SNR), scan time, anatomical coverage, three-dimensional capability, and image quality. The tagging techniques covered in this article can be broadly divided into two main categories: 1) Basic techniques, which include magnetization saturation, spatial modulation of magnetization (SPAMM), delay alternating with nutations for tailored excitation (DANTE), and complementary SPAMM (CSPAMM); and 2) Advanced techniques, which include harmonic phase (HARP), displacement encoding with stimulated echoes (DENSE), and strain encoding (SENC). Although most of these techniques were developed by separate groups and evolved from different backgrounds, they are in fact closely related to each other, and they can be interpreted from more than one perspective. Some of these techniques even followed parallel paths of developments, as illustrated in the article. As each technique has its own advantages, some efforts have been made to combine different techniques together for improved image quality or composite information acquisition. In this review, different developments in pulse sequences and related image processing techniques are described along with the necessities that led to their invention, which makes this

  19. Myocardial tagging by Cardiovascular Magnetic Resonance: evolution of techniques--pulse sequences, analysis algorithms, and applications

    Directory of Open Access Journals (Sweden)

    Ibrahim El-Sayed H

    2011-07-01

    Cardiovascular magnetic resonance (CMR) tagging has been established as an essential technique for measuring regional myocardial function. It allows quantification of local intramyocardial motion measures, e.g. strain and strain rate. The invention of CMR tagging came in the late eighties, where the technique allowed for the first time for visualizing transmural myocardial movement without having to implant physical markers. This new idea opened the door for a series of developments and improvements that continue up to the present time. Different tagging techniques are currently available that are more extensive, improved, and sophisticated than they were twenty years ago. Each of these techniques has different versions for improved resolution, signal-to-noise ratio (SNR), scan time, anatomical coverage, three-dimensional capability, and image quality. The tagging techniques covered in this article can be broadly divided into two main categories: 1) Basic techniques, which include magnetization saturation, spatial modulation of magnetization (SPAMM), delay alternating with nutations for tailored excitation (DANTE), and complementary SPAMM (CSPAMM); and 2) Advanced techniques, which include harmonic phase (HARP), displacement encoding with stimulated echoes (DENSE), and strain encoding (SENC). Although most of these techniques were developed by separate groups and evolved from different backgrounds, they are in fact closely related to each other, and they can be interpreted from more than one perspective. Some of these techniques even followed parallel paths of developments, as illustrated in the article. As each technique has its own advantages, some efforts have been made to combine different techniques together for improved image quality or composite information acquisition. In this review, different developments in pulse sequences and related image processing techniques are described along with the necessities that led to their invention

  20. Intratumor Heterogeneity and Branched Evolution Revealed by Multiregion Sequencing

    DEFF Research Database (Denmark)

    Gerlinger, Marco; Rowan, Andrew J.; Horswell, Stuart

    2012-01-01

    BACKGROUND: Intratumor heterogeneity may foster tumor evolution and adaptation and hinder personalized-medicine strategies that depend on results from single tumor-biopsy samples. METHODS: To examine intratumor heterogeneity, we performed exome sequencing, chromosome aberration analysis, and ploidy ...

  1. Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis

    Directory of Open Access Journals (Sweden)

    John P. Jakupciak

    2013-01-01

    Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as well as differences caused by sample preparation protocols including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness; and also to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations.

  2. Evolution of the M gene of the influenza A virus in different host species: large-scale sequence analysis

    Directory of Open Access Journals (Sweden)

    Kamigaki Taro

    2009-05-01

    Background: Influenza A virus infects not only humans, but also other species including avian and swine. If a novel influenza A subtype acquires the ability to spread between humans efficiently, it could cause the next pandemic. Therefore it is necessary to understand the evolutionary processes of influenza A viruses in various hosts in order to gain better knowledge about the emergence of pandemic virus. The virus has segmented RNA genome and 7th segment, M gene, encodes 2 proteins. M1 is a matrix protein and M2 is a membrane protein. The M gene may be involved in determining host tropism. Besides, novel vaccines targeting M1 or M2 protein to confer cross subtype protection have been under development. We conducted the present study to investigate the evolution of the M gene by analyzing its sequence in different species. Results: Phylogenetic tree revealed host-specific lineages and evolution rates were different among species. Selective pressure on M2 was stronger than that on M1. Selective pressure on M1 for human influenza was stronger than that for avian influenza, as well as M2. Site-by-site analyses identified one site (amino acid position 219) in M1 as positively selected in human. Positions 115 and 121 in M1, at which consensus amino acids were different between human and avian, were under negative selection in both hosts. As to M2, 10 sites were under positive selection in human. Seven sites locate in extracellular domain. That might be due to host's immune pressure. One site (position 27) positively selected in transmembrane domain is known to be associated with drug resistance. And, two sites (positions 57 and 89) locate in cytoplasmic domain. The sites are involved in several functions. Conclusion: The M gene of influenza A virus has evolved independently, under different selective pressure on M1 and M2 among different hosts. We found potentially important sites that may be related to host tropism and immune responses. These

  3. Evolution of the M gene of the influenza A virus in different host species: large-scale sequence analysis.

    Science.gov (United States)

    Furuse, Yuki; Suzuki, Akira; Kamigaki, Taro; Oshitani, Hitoshi

    2009-05-29

    Influenza A virus infects not only humans, but also other species including avian and swine. If a novel influenza A subtype acquires the ability to spread between humans efficiently, it could cause the next pandemic. Therefore it is necessary to understand the evolutionary processes of influenza A viruses in various hosts in order to gain better knowledge about the emergence of pandemic virus. The virus has segmented RNA genome and 7th segment, M gene, encodes 2 proteins. M1 is a matrix protein and M2 is a membrane protein. The M gene may be involved in determining host tropism. Besides, novel vaccines targeting M1 or M2 protein to confer cross subtype protection have been under development. We conducted the present study to investigate the evolution of the M gene by analyzing its sequence in different species. Phylogenetic tree revealed host-specific lineages and evolution rates were different among species. Selective pressure on M2 was stronger than that on M1. Selective pressure on M1 for human influenza was stronger than that for avian influenza, as well as M2. Site-by-site analyses identified one site (amino acid position 219) in M1 as positively selected in human. Positions 115 and 121 in M1, at which consensus amino acids were different between human and avian, were under negative selection in both hosts. As to M2, 10 sites were under positive selection in human. Seven sites locate in extracellular domain. That might be due to host's immune pressure. One site (position 27) positively selected in transmembrane domain is known to be associated with drug resistance. And, two sites (positions 57 and 89) locate in cytoplasmic domain. The sites are involved in several functions. The M gene of influenza A virus has evolved independently, under different selective pressure on M1 and M2 among different hosts. We found potentially important sites that may be related to host tropism and immune responses. 
    These sites may be important for evolutional process in different

  4. Differential evolution-simulated annealing for multiple sequence alignment

    Science.gov (United States)

    Addawe, R. C.; Addawe, J. M.; Sueño, M. R. K.; Magadia, J. C.

    2017-10-01

    Multiple sequence alignments (MSA) are used in the analysis of molecular evolution and sequence structure relationships. In this paper, a hybrid algorithm, Differential Evolution - Simulated Annealing (DESA) is applied in optimizing multiple sequence alignments (MSAs) based on structural information, non-gaps percentage and totally conserved columns. DESA is a robust algorithm characterized by self-organization, mutation, crossover, and SA-like selection scheme of the strategy parameters. Here, the MSA problem is treated as a multi-objective optimization problem of the hybrid evolutionary algorithm, DESA. Thus, we name the algorithm as DESA-MSA. Simulated sequences and alignments were generated to evaluate the accuracy and efficiency of DESA-MSA using different indel sizes, sequence lengths, deletion rates and insertion rates. The proposed hybrid algorithm obtained acceptable solutions particularly for the MSA problem evaluated based on the three objectives.
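
    Two of the objectives named in this abstract, non-gaps percentage and totally conserved columns, are simple to compute for a given alignment. The following is a minimal sketch of those two scores only, not the DESA-MSA algorithm itself; the function names, the gap symbol "-", and the toy alignment are assumptions made for illustration.

```python
def non_gap_percentage(alignment, gap="-"):
    """Percentage of alignment cells that hold a residue rather than a gap."""
    total = sum(len(row) for row in alignment)
    non_gaps = sum(ch != gap for row in alignment for ch in row)
    return 100.0 * non_gaps / total


def totally_conserved_columns(alignment, gap="-"):
    """Number of columns where every row carries the same non-gap residue."""
    count = 0
    for column in zip(*alignment):  # iterate over columns, not rows
        if gap not in column and len(set(column)) == 1:
            count += 1
    return count


# Hypothetical toy alignment: three rows of equal length.
toy_alignment = [
    "AC-GT",
    "ACAGT",
    "AC-GA",
]

print(non_gap_percentage(toy_alignment))
print(totally_conserved_columns(toy_alignment))
```

    An optimizer of the kind described would presumably treat scores like these, together with a structure-based score, as separate objectives to maximize; how DESA-MSA combines them is not stated in the abstract.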

  5. Protein sequence comparison and protein evolution

    Energy Technology Data Exchange (ETDEWEB)

    Pearson, W.R. [Univ. of Virginia, Charlottesville, VA (United States). Dept. of Biochemistry]

    1995-12-31

    This tutorial was one of eight tutorials selected to be presented at the Third International Conference on Intelligent Systems for Molecular Biology which was held in the United Kingdom from July 16 to 19, 1995. This tutorial examines how the information conserved during the evolution of a protein molecule can be used to infer reliably homology, and thus a shared protein fold and possibly a shared active site or function. The authors start by reviewing a geological/evolutionary time scale. Next they look at the evolution of several protein families. During the tutorial, these families will be used to demonstrate that homologous protein ancestry can be inferred with confidence. They also examine different modes of protein evolution and consider some hypotheses that have been presented to explain the very earliest events in protein evolution. The next part of the tutorial will examine the technical aspects of protein sequence comparison. Both optimal and heuristic algorithms and their associated parameters that are used to characterize protein sequence similarities are discussed. Perhaps more importantly, they survey the statistics of local similarity scores, and how these statistics can be used both to improve the selectivity of a search and to evaluate the significance of a match. They then examine distantly related members of three protein families, the serine proteases, the glutathione transferases, and the G-protein-coupled receptors (GCRs). Finally, they discuss how sequence similarity can be used to examine internal repeated or mosaic structures in proteins.
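
    The "optimal" algorithms and "local similarity scores" mentioned above can be illustrated with a Smith-Waterman local alignment score. This is a minimal sketch under simplifying assumptions: a flat match/mismatch score and a linear gap penalty stand in for the substitution matrices and gap models normally used for proteins, and the function name and default parameter values are chosen only for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Uses a flat match/mismatch score and a linear gap penalty
    (illustrative assumptions, not a protein scoring scheme).
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

    Judging whether such a score is significant requires the score statistics the tutorial surveys; the raw score alone does not establish homology.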

  6. The Role of the Y-Chromosome in the Establishment of Murine Hybrid Dysgenesis and in the Analysis of the Nucleotide Sequence Organization, Genetic Transmission and Evolution of Repeated Sequences.

    Science.gov (United States)

    Nallaseth, Ferez Soli

    The Y-chromosome presents a unique cytogenetic framework for the evolution of nucleotide sequences. Alignment of nine Y-chromosomal fragments in their increasing Y-specific/non Y-specific (male/female) sequence divergence ratios was directly and inversely related to their interspersion on these two respective genomic fractions. Sequence analysis confirmed a direct relationship between divergence ratios and the Alu, LINE-1, Satellite and their derivative oligonucleotide contents. Thus their relocation on the Y-chromosome is followed by sequence divergence rather than the well documented concerted evolution of these non-coding progenitor repeated sequences. Five of the nine Y-chromosomal fragments are non-pseudoautosomal and transcribed into heterogeneous PolyA+ RNA and thus can be retrotransposed. Evolutionary and computer analysis identified homologous oligonucleotide tracts in several human loci suggesting common and random mechanistic origins. Dysgenic genomes represent the accelerated evolution driving sequence divergence (McClintock, 1984). Sex reversal and sterility characterizing dysgenesis occurs in C57BL/6JY^Pos but not in 129/SvY^Pos derivative strains. High frequency, random, multi-locus deletion products of the feral Y^Pos-chromosome are generated in the germlines of F1(C57BL/6J X 129/SvY^Pos)(male) and C57BL/6JY^Pos(male) but not in 129/SvY^Pos(male). Equal, 10^-1, 10^-2, and 0 copies (relative to males) of Y^Pos-specific deletion products respectively characterize C57BL/6JY^Pos (HC), (LC), (T) and (F) females. The testes determining loci of inactive Y^Pos-chromosomes in C57BL/6JY^Pos HC females are the preferentially deleted/rearranged Y^Pos-sequences. Disruption of regulation of plasma testosterone and hepatic MUP-A mRNA levels, TRD of a 4.7 Kbp EcoR1 fragment suggest disruption of autosomal/X-chromosomal sequences. These data and the highly repeated progenitor (Alu, GATA, LINE-1

  7. A Topical Trajectory on Survival: an Analysis of Link-Making in a Sequence of Lessons on Evolution

    Science.gov (United States)

    Rocksén, Miranda; Olander, Clas

    2017-04-01

    This study explores the concept of link-making in relation to communicative strategies applied in the teaching and studying of biological evolution. The analysis focused on video recordings of 11 lessons on biological evolution conducted in a Swedish 9th grade class of students aged 15 years. It reveals how the teacher and students connected classroom conversations, the frequency of references to conversations in whole-class settings, and the development of a theme focusing on species survival and extinction. Detailed examples from the data illustrate how this theme developed from its initiation during the first lesson, through discussion and clarification, to its wrapping up during the last lesson. They further illustrate how students made sense of what the teacher said and wrote, and how the teacher postponed issues, explained and developed topics, provided opportunities for link-making, organised the class, motivated students, and checked their understanding. The study's methodological approach offers a way of including several time dimensions within research. Based on our findings, we conclude that the excerpts examined here did succeed in building 'islands of coherence' in the co-construction of curricular content. Moreover, the topical trajectory in relation to species survival provided opportunities for constructing a 'scientific story' in the classroom.

  8. Nucleotide sequence of the Xdh region in Drosophila pseudoobscura and an analysis of the evolution of synonymous codons.

    Science.gov (United States)

    Riley, M A

    1989-01-01

    The nucleotide sequence of the Xdh region of Drosophila pseudoobscura is presented. The Xdh gene structure and organization are compared with the homologous region in D. melanogaster. This locus is shown to have similar organization in the two species, although an additional intron and three insertion/deletion events are described for the D. pseudoobscura coding region. The encoded proteins are predicted to have very similar charges and hydrophobic/hydrophilic domains even though 11% of the amino acids are different. A gene 5' to Xdh, putative l(3)s12, is suggested from sequence similarity between the species. Synonymous differences at the Xdh locus between the two species are analyzed using a new method described in the preceding paper by Lewontin. This analysis shows that synonymous positions within the Xdh locus are evolving at very different rates, being dependent on level of codon redundancy. A comparison of synonymous divergence between D. melanogaster and D. pseudoobscura in five additional genes reveals variation in the level of synonymous substitution.

  9. Complete genome sequence and evolution analysis of a columbid herpesvirus type 1 from feral pigeon in China.

    Science.gov (United States)

    Guo, Ying; Li, Siwen; Sun, Xiao; He, Ying; Zhao, Hongjing; Wang, Yu; Zhao, Panpan; Xing, Mingwei

    2017-07-01

    Here, we report the genome sequence of a feral pigeon alphaherpesvirus (columbid herpesvirus type 1, CoHV-1), strain HLJ, and compare it with other avian alphaherpesviruses. The CoHV-1 strain HLJ genome is 204,237 bp in length and encodes approximately 130 putative protein-coding genes. Phylogenetically, CoHV-1 complete genome resides in a monophyletic group with the falconid herpesvirus type 1 (FaHV-1) genome, distant from other alphaherpesviruses. Interestingly, the evolutionary analysis of partial genes of CoHV-1 isolated from different organisms and areas (currently accessible on GenBank) indicates that the CoHV-1 HLJ strain isolated from pigeon (Columba livia) is closely related to the strains isolated from peregrine falcon (Falco peregrinus) in Poland and owl (Bubo virginianus) in USA. These results may suggest possible transmission of the virus between different organisms and different geographic areas.

  10. VDJ-Seq: Deep Sequencing Analysis of Rearranged Immunoglobulin Heavy Chain Gene to Reveal Clonal Evolution Patterns of B Cell Lymphoma.

    Science.gov (United States)

    Jiang, Yanwen; Nie, Kui; Redmond, David; Melnick, Ari M; Tam, Wayne; Elemento, Olivier

    2015-12-28

    Understanding tumor clonality is critical to understanding the mechanisms involved in tumorigenesis and disease progression. In addition, understanding the clonal composition changes that occur within a tumor in response to certain microenvironments or treatments may lead to the design of more sophisticated and effective approaches to eradicate tumor cells. However, tracking tumor clonal sub-populations has been challenging due to the lack of distinguishable markers. To address this problem, a VDJ-seq protocol was created to trace the clonal evolution patterns of diffuse large B cell lymphoma (DLBCL) relapse by exploiting VDJ recombination and somatic hypermutation (SHM), two unique features of B cell lymphomas. In this protocol, Next-Generation sequencing (NGS) libraries with indexing potential were constructed from the amplified rearranged immunoglobulin heavy chain (IgH) VDJ region from pairs of primary diagnosis and relapse DLBCL samples. On average, more than half a million VDJ sequences per sample were obtained after sequencing, which contain both VDJ rearrangement and SHM information. In addition, customized bioinformatics pipelines were developed to fully utilize sequence information for the characterization of the IgH-VDJ repertoire within these samples. Furthermore, the pipeline allows the reconstruction and comparison of the clonal architecture of individual tumors, which enables the examination of the clonal heterogeneity within the diagnosis tumors and deduction of clonal evolution patterns between diagnosis and relapse tumor pairs. When applying this analysis to several diagnosis-relapse pairs, we uncovered key evidence that multiple distinctive tumor evolutionary patterns could lead to DLBCL relapse. Additionally, this approach can be expanded into other clinical aspects, such as identification of minimal residual disease, monitoring relapse progress and treatment response, and investigation of immune repertoires in non-lymphoma contexts.

  11. Differential genome evolution and speciation of Coix lacryma-jobi L. and Coix aquatica Roxb. hybrid guangxi revealed by repetitive sequence analysis and fine karyotyping.

    Science.gov (United States)

    Cai, Zexi; Liu, Huijun; He, Qunyan; Pu, Mingwei; Chen, Jian; Lai, Jinsheng; Li, Xuexian; Jin, Weiwei

    2014-11-25

    Coix, Sorghum and Zea are closely related plant genera in the subtribe Maydeae. Coix comprises 9-11 species with different ploidy levels (2n = 10, 20, 30, and 40). The exclusively cultivated C. lacryma-jobi L. (2n = 20) is widely used in East and Southeast Asia for food and medicinal applications. Three fertile cytotypes (2n = 10, 20, and 40) have been reported for C. aquatica Roxb. One sterile cytotype (2n = 30) closely related to C. aquatica has been recently found in Guangxi of China. This putative hybrid has been named C. aquatica HG (Hybrid Guangxi). The genome composition and the evolutionary history of C. lacryma-jobi and C. aquatica HG are largely unclear. About 76% of the genome of C. lacryma-jobi and 73% of the genome of C. aquatica HG are repetitive DNA sequences as shown by low coverage genome sequencing followed by similarity-based cluster analysis. In addition, long terminal repeat (LTR) retrotransposable elements are dominant repetitive sequences in these two genomes, and the proportions of many repetitive sequences in whole genome varied greatly between the two species, indicating evolutionary divergence of them. We also found that a novel 102 bp variant of centromeric satellite repeat CentX and two other satellites only appeared in C. aquatica HG. The results from FISH analysis with repeat probe cocktails and the data from chromosomes pairing in meiosis metaphase showed that C. lacryma-jobi is likely a diploidized paleotetraploid species and C. aquatica HG is possibly a recently formed hybrid. Furthermore, C. lacryma-jobi and C. aquatica HG shared more co-existing repeat families and higher sequence similarity with Sorghum than with Zea. The composition and abundance of repetitive sequences are divergent between the genomes of C. lacryma-jobi and C. aquatica HG. The results from fine karyotyping analysis and chromosome pairing suggested diploidization of C. lacryma-jobi during evolution and C. aquatica HG is a recently formed hybrid. The genome

  12. Biological sequence analysis

    DEFF Research Database (Denmark)

    Durbin, Richard; Eddy, Sean; Krogh, Anders Stærmose

    This book provides an up-to-date and tutorial-level overview of sequence analysis methods, with particular emphasis on probabilistic modelling. Discussed methods include pairwise alignment, hidden Markov models, multiple alignment, profile searches, RNA secondary structure analysis, and phylogene...

  13. A Model of Proteostatic Energy Cost and Its Use in Analysis of Proteome Trends and Sequence Evolution

    DEFF Research Database (Denmark)

    Kepp, Kasper Planeta

    2014-01-01

    . The proteostatic selection pressure is stronger at low metabolic rates (i.e. scarce environments) and in hot habitats, explaining proteome adaptations towards rough environments as a question of energy. The model may also explain several trade-offs observed in protein evolution and suggests how protein properties...

  14. An epistemological analysis of the evolution of didactical activities in teaching-learning sequences: the case of fluids

    Science.gov (United States)

    Psillos, D.

    2004-05-01

    In the present paper we propose a theoretical framework for an epistemological modelling of teaching-learning (didactical) activities, which draws on recent studies of scientific practice. We present and analyse the framework, which includes three categories: namely, Cosmos- Evidence-Ideas (CEI). We also apply this framework in order to model a posteriori the didactical activities included in three successive teaching-learning sequences in the field of fluids, developed gradually by the same researchers over several years under evolving dominant approaches to science teaching and learning (transmission, discovery, constructivist). For each sequence we analyse the planned activities included in student and teacher documents in terms of the CEI model. We deduce the suggested links (or lack of them) between the three categories and discuss the opportunities that students would have during science teaching to link in each sequence the world of theories with real things.

  15. Mitochondrial DNA sequence evolution in shorebird populations

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons

  16. Insights into hominid evolution from the gorilla genome sequence

    Science.gov (United States)

    Scally, Aylwyn; Dutheil, Julien Y.; Hillier, LaDeana W.; Jordan, Greg E.; Goodhead, Ian; Herrero, Javier; Hobolth, Asger; Lappalainen, Tuuli; Mailund, Thomas; Marques-Bonet, Tomas; McCarthy, Shane; Montgomery, Stephen H.; Schwalie, Petra C.; Tang, Y. Amy; Ward, Michelle C.; Xue, Yali; Yngvadottir, Bryndis; Alkan, Can; Andersen, Lars N.; Ayub, Qasim; Ball, Edward V.; Beal, Kathryn; Bradley, Brenda J.; Chen, Yuan; Clee, Chris M.; Fitzgerald, Stephen; Graves, Tina A.; Gu, Yong; Heath, Paul; Heger, Andreas; Karakoc, Emre; Kolb-Kokocinski, Anja; Laird, Gavin K.; Lunter, Gerton; Meader, Stephen; Mort, Matthew; Mullikin, James C.; Munch, Kasper; O’Connor, Timothy D.; Phillips, Andrew D.; Prado-Martinez, Javier; Rogers, Anthony S.; Sajjadian, Saba; Schmidt, Dominic; Shaw, Katy; Simpson, Jared T.; Stenson, Peter D.; Turner, Daniel J.; Vigilant, Linda; Vilella, Albert J.; Whitener, Weldon; Zhu, Baoli; Cooper, David N.; de Jong, Pieter; Dermitzakis, Emmanouil T.; Eichler, Evan E.; Flicek, Paul; Goldman, Nick; Mundy, Nicholas I.; Ning, Zemin; Odom, Duncan T.; Ponting, Chris P.; Quail, Michael A.; Ryder, Oliver A.; Searle, Stephen M.; Warren, Wesley C.; Wilson, Richard K.; Schierup, Mikkel H.; Rogers, Jane; Tyler-Smith, Chris; Durbin, Richard

    2012-01-01

    Summary Gorillas are humans’ closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago (Mya). In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution. PMID:22398555

  17. The genome sequence of Pseudoplusia includens single nucleopolyhedrovirus and an analysis of p26 gene evolution in the baculoviruses.

    Science.gov (United States)

    Craveiro, Saluana R; Inglis, Peter W; Togawa, Roberto C; Grynberg, Priscila; Melo, Fernando L; Ribeiro, Zilda Maria A; Ribeiro, Bergmann M; Báo, Sônia N; Castro, Maria Elita B

    2015-02-25

    Pseudoplusia includens single nucleopolyhedrovirus (PsinSNPV-IE) is a baculovirus recently identified in our laboratory, with high pathogenicity to the soybean looper, Chrysodeixis includens (Lepidoptera: Noctuidae) (Walker, 1858). In Brazil, the C. includens caterpillar is an emerging pest and has caused significant losses in soybean and cotton crops. The PsinSNPV genome was determined and the phylogeny of the p26 gene within the family Baculoviridae was investigated. The complete genome of PsinSNPV was sequenced (Roche 454 GS FLX - Titanium platform), annotated and compared with other Alphabaculoviruses, displaying a genome apparently different from other baculoviruses so far sequenced. The circular double-stranded DNA genome is 139,132 bp in length, with a GC content of 39.3 % and contains 141 open reading frames (ORFs). PsinSNPV possesses the 37 conserved baculovirus core genes, 102 genes found in other baculoviruses and 2 unique ORFs. Two baculovirus repeat ORFs (bro) homologs, bro-a (Psin33) and bro-b (Psin69), were identified and compared with Chrysodeixis chalcites nucleopolyhedrovirus (ChchNPV) and Trichoplusia ni single nucleopolyhedrovirus (TnSNPV) bro genes and showed high similarity, suggesting that these genes may be derived from an ancestor common to these viruses. The homologous repeats (hrs) are absent from the PsinSNPV genome, which is also the case in ChchNPV and TnSNPV. Two p26 gene homologs (p26a and p26b) were found in the PsinSNPV genome. P26 is thought to be required for optimal virion occlusion in the occlusion bodies (OBs), but its function is not well characterized. The P26 phylogenetic tree suggests that this gene was obtained from three independent acquisition events within the Baculoviridae family. The presence of a signal peptide only in the PsinSNPV p26a/ORF-20 homolog indicates distinct function between the two P26 proteins. PsinSNPV has a genomic sequence apparently different from other baculoviruses sequenced so far. 
    The complete

  18. A Markov chain Monte Carlo Expectation Maximization Algorithm for Statistical Analysis of DNA Sequence Evolution with Neighbor-Dependent Substitution Rates

    DEFF Research Database (Denmark)

    Hobolth, Asger

    2008-01-01

    The evolution of DNA sequences can be described by discrete state continuous time Markov processes on a phylogenetic tree. We consider neighbor-dependent evolutionary models where the instantaneous rate of substitution at a site depends on the states of the neighboring sites. Neighbor...
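
    As background for the phrase "discrete state continuous time Markov processes", the simplest such model for DNA is the Jukes-Cantor (JC69) model, in which every site evolves independently and all substitutions occur at the same rate. The sketch below uses its closed-form transition probabilities. It is an illustration of the general idea only: it is not the neighbor-dependent model of this record, and the function name and the parameterization by a total per-site substitution rate are assumptions.

```python
import math


def jc69_transition_probabilities(rate, t):
    """Transition probabilities under the Jukes-Cantor (JC69) model.

    rate: total substitution rate per site (each of the three other
          bases is reached at rate / 3); an assumed parameterization.
    t:    elapsed time along a branch.
    Returns (p_same, p_diff): the probability of observing the same base,
    and the probability of observing one specific different base.
    """
    decay = math.exp(-4.0 * rate * t / 3.0)
    p_same = 0.25 + 0.75 * decay
    p_diff = 0.25 - 0.25 * decay
    return p_same, p_diff


p_same, p_diff = jc69_transition_probabilities(rate=1.0, t=0.1)
print(p_same, p_diff)
```

    In a neighbor-dependent model the sites no longer evolve independently, so no per-site closed form like this is available, which is consistent with the record's use of Markov chain Monte Carlo within an EM algorithm.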

  19. New insights into flavivirus evolution, taxonomy and biogeographic history, extended by analysis of canonical and alternative coding sequences.

    Directory of Open Access Journals (Sweden)

    Gregory Moureau

    To generate the most diverse phylogenetic dataset for the flaviviruses to date, we determined the genomic sequences and phylogenetic relationships of 14 flaviviruses, of which 10 are primarily associated with Culex spp. mosquitoes. We analyze these data, in conjunction with a comprehensive collection of flavivirus genomes, to characterize flavivirus evolutionary and biogeographic history in unprecedented detail and breadth. Based on the presumed introduction of yellow fever virus into the Americas via the transatlantic slave trade, we extrapolated a timescale for a relevant subset of flaviviruses whose evolutionary history shows that different Culex spp.-associated flaviviruses have been introduced from the Old World to the New World on at least five separate occasions, with two different sets of factors likely to have contributed to the dispersal of the different viruses. We also discuss the significance of programmed ribosomal frameshifting in a central region of the polyprotein open reading frame in some mosquito-associated flaviviruses.

  20. Image sequence analysis

    CERN Document Server

    1981-01-01

    The processing of image sequences has a broad spectrum of important applications including target tracking, robot navigation, bandwidth compression of TV conferencing video signals, studying the motion of biological cells using microcinematography, cloud tracking, and highway traffic monitoring. Image sequence processing involves a large amount of data. However, because of the progress in computer, LSI, and VLSI technologies, we have now reached a stage when many useful processing tasks can be done in a reasonable amount of time. As a result, research and development activities in image sequence analysis have recently been growing at a rapid pace. An IEEE Computer Society Workshop on Computer Analysis of Time-Varying Imagery was held in Philadelphia, April 5-6, 1979. A related special issue of the IEEE Transactions on Pattern Analysis and Machine Intelligence was published in November 1980. The IEEE Computer magazine has also published a special issue on the subject in 1981. The purpose of this book ...

  1. Diversification and adaptive sequence evolution of Caenorhabditis lysozymes (Nematoda: Rhabditidae).

    Science.gov (United States)

    Schulenburg, Hinrich; Boehnisch, Claudia

    2008-04-19

    Lysozymes are important model enzymes in biomedical research with a ubiquitous taxonomic distribution ranging from phages up to plants and animals. Their main function appears to be defence against pathogens, although some of them have also been implicated in digestion. Whereas most organisms have only few lysozyme genes, nematodes of the genus Caenorhabditis possess a surprisingly large repertoire of up to 15 genes. We used phylogenetic inference and sequence analysis tools to assess the evolution of lysozymes from three congeneric nematode species, Caenorhabditis elegans, C. briggsae, and C. remanei. Their lysozymes fall into three distinct clades, one belonging to the invertebrate-type and the other two to the protist-type lysozymes. Their diversification is characterised by (i) ancestral gene duplications preceding species separation followed by maintenance of genes, (ii) ancestral duplications followed by gene loss in some of the species, and (iii) recent duplications after divergence of species. Both ancestral and recent gene duplications are associated in several cases with signatures of adaptive sequence evolution, indicating that diversifying selection contributed to lysozyme differentiation. Current data strongly suggests that genetic diversity translates into functional diversity. Gene duplications are a major source of evolutionary innovation. Our analysis provides an evolutionary framework for understanding the diversification of lysozymes through gene duplication and subsequent differentiation. This information is expected to be of major value in future analysis of lysozyme function and in studies of the dynamics of evolution by gene duplication.

  2. Diversification and adaptive sequence evolution of Caenorhabditis lysozymes (Nematoda: Rhabditidae)

    Directory of Open Access Journals (Sweden)

    Boehnisch Claudia

    2008-04-01

    Background: Lysozymes are important model enzymes in biomedical research with a ubiquitous taxonomic distribution ranging from phages up to plants and animals. Their main function appears to be defence against pathogens, although some of them have also been implicated in digestion. Whereas most organisms have only few lysozyme genes, nematodes of the genus Caenorhabditis possess a surprisingly large repertoire of up to 15 genes. Results: We used phylogenetic inference and sequence analysis tools to assess the evolution of lysozymes from three congeneric nematode species, Caenorhabditis elegans, C. briggsae, and C. remanei. Their lysozymes fall into three distinct clades, one belonging to the invertebrate-type and the other two to the protist-type lysozymes. Their diversification is characterised by (i) ancestral gene duplications preceding species separation followed by maintenance of genes, (ii) ancestral duplications followed by gene loss in some of the species, and (iii) recent duplications after divergence of species. Both ancestral and recent gene duplications are associated in several cases with signatures of adaptive sequence evolution, indicating that diversifying selection contributed to lysozyme differentiation. Current data strongly suggests that genetic diversity translates into functional diversity. Conclusion: Gene duplications are a major source of evolutionary innovation. Our analysis provides an evolutionary framework for understanding the diversification of lysozymes through gene duplication and subsequent differentiation. This information is expected to be of major value in future analysis of lysozyme function and in studies of the dynamics of evolution by gene duplication.

  3. Exploring the correlations between sequence evolution rate and ...

    Indian Academy of Sciences (India)

    2012-10-15

    Oct 15, 2012 ... lecular evolution in hummingbirds. Mol. Biol. Evol. 15 481–491. Britten RJ 1986 Rates of DNA-sequence evolution differ between taxonomic groups. Science 231 1393–1398. Britten RJ and Davidson EH 1971 Repetitive and non-repetitive. DNA sequences and a speculation on origins of evolutionary.

  4. Chloroplast DNA analysis of Tunisian cork oak populations (Quercus suber L.): sequence variations and molecular evolution of the trnL (UAA)-trnF (GAA) region.

    Science.gov (United States)

    Abdessamad, A; Baraket, G; Sakka, H; Ammari, Y; Ksontini, M; Hannachi, A Salhi

    2016-10-24

    Sequences of the trnL-trnF spacer and combined trnL-trnF region in chloroplast DNA of cork oak (Quercus suber L.) were analyzed to detect polymorphisms and to elucidate molecular evolution and demographic history. The aligned sequences varied in length and nucleotide composition. The overall transition/transversion (ti/tv) ratio was estimated at 0.724 for the intergenic spacer and 0.258 for the pooled sequences, indicating that transversions are more frequent than transitions. The molecular evolution and demographic history of Q. suber were investigated. Neutrality tests (Tajima's D and Fu and Li) ruled out the null hypothesis of a strictly neutral model, and Fu's Fs and Ramos-Onsins and Rozas' R2 confirmed the recent expansion of cork oak trees, validating its persistence in North Africa since the last glaciation during the Quaternary. The observed uni-modal mismatch distribution and the Harpending's raggedness index confirmed the demographic history model for cork oak. A phylogenetic dendrogram showed that the distribution of Q. suber trees occurs independently of geographical origin, the relief of the population site, and the bioclimatic stages. The molecular history and cytoplasmic diversity suggest that in situ and ex situ conservation strategies can be recommended for preserving landscape value and facing predictable future climatic changes.
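
    As a rough illustration of the ti/tv statistic mentioned in this abstract, the sketch below counts transitions (purine to purine or pyrimidine to pyrimidine) and transversions (purine to pyrimidine or the reverse) between two aligned sequences. It is only a raw pairwise count and is not the estimation procedure used in the study, which is not described here; the function names and the example sequences are hypothetical.

```python
# Minimal sketch: raw transition/transversion counts for one aligned pair.
# Function names and example data are illustrative assumptions.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ti_tv(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned DNA sequences.

    Columns containing gaps or ambiguous symbols are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical column, gap, or ambiguity code
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # change within the same chemical class
        else:
            transversions += 1  # change between purine and pyrimidine
    return transitions, transversions


def ti_tv_ratio(seq_a, seq_b):
    """Return transitions divided by transversions for one aligned pair."""
    ti, tv = count_ti_tv(seq_a, seq_b)
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ti / tv


if __name__ == "__main__":
    a = "AAGCTTGA-C"
    b = "AGGTTAGC-A"
    print(count_ti_tv(a, b), ti_tv_ratio(a, b))
```

    A ratio below 1, as reported above, means that more transversions than transitions were observed.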

  5. An Evolution Based Biosensor Receptor DNA Sequence Generation Algorithm

    Directory of Open Access Journals (Sweden)

    Yupeng Zang

    2009-12-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.

  6. An evolution based biosensor receptor DNA sequence generation algorithm.

    Science.gov (United States)

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.

  7. Analysis of rotavirus species diversity and evolution including the newly determined full-length genome sequences of rotavirus F and G.

    Science.gov (United States)

    Kindler, Eveline; Trojnar, Eva; Heckel, Gerald; Otto, Peter H; Johne, Reimar

    2013-03-01

    Rotaviruses are a leading cause of viral acute gastroenteritis in humans and animals. Eight different rotavirus species (A-H) have been defined based on antigenicity and nucleotide sequence identities of the VP6 gene. Here, the first complete genome sequences of rotavirus F (strain 03V0568) and G (strain 03V0567) with lengths of 18,341 and 18,186 bp, respectively, are described. Both viruses have open reading frames for rotavirus proteins VP1 to VP7 and NSP1 to NSP5 located at the 11 genome segments. Nucleotide sequence identities to other rotaviruses ranged between 29.8% (NSP1 gene) and 61.7% (VP1 gene) for rotavirus F and between 29.3% (NSP1-2 gene) and 65.9% (NSP2 gene) for rotavirus G, thus confirming their classification as separate virus species. Encoded proteins revealed remarkable sequence differences among the rotavirus species. In contrast, the non-coding 5'-terminal sequences of the genome segments are highly conserved among all rotavirus species. Different 3'-terminal consensus sequences are found between rotavirus A/D/F, rotavirus C and rotavirus B/G/H. Phylogenetic analyses indicated a separation of rotaviruses in two major clades consisting of rotavirus A/C/D/F and rotavirus B/G/H. Within these clades, rotavirus F mainly clustered with rotavirus D and rotavirus G with rotavirus B. In addition, differentiation among mammalian and avian rotavirus A strains, host-specific evolution of rotavirus B and C as well as an ancient reassortment event between avian rotavirus A and D are indicated by the phylogenetic data. These results underline the high diversity of rotaviruses as a result of a complex evolutionary history.
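
    The nucleotide sequence identities quoted in this abstract are percentages of matching positions between aligned genes. The sketch below shows one common way such a figure can be computed from a pairwise alignment; it assumes that columns with a gap in either sequence are excluded from the denominator, which is only one of several conventions and is not stated in the abstract. The function name and example sequences are hypothetical.

```python
# Minimal sketch: percent identity of two aligned nucleotide sequences.
# Assumption: columns with a gap ("-") in either sequence are ignored.


def percent_identity(seq_a, seq_b):
    """Return the percentage of identical positions over gap-free columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACGT-A", "ACGA-A"))
```

    Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give different values for the same alignment.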

  8. Sequence analysis of two alleles reveals that intra-and intergenic recombination played a role in the evolution of the radish fertility restorer (Rfo)

    Directory of Open Access Journals (Sweden)

    Budar Françoise

    2010-02-01

    Background: Land plant genomes contain multiple members of a eukaryote-specific gene family encoding proteins with pentatricopeptide repeat (PPR) motifs. Some PPR proteins were shown to participate in post-transcriptional events involved in organellar gene expression, and this type of function is now thought to be their main biological role. Among PPR genes, restorers of fertility (Rf) of cytoplasmic male sterility systems constitute a peculiar subgroup that is thought to evolve in response to the presence of mitochondrial sterility-inducing genes. Rf genes encoding PPR proteins are associated with very close relatives on complex loci. Results: We sequenced a non-restoring allele (L7rfo) of the Rfo radish locus whose restoring allele (D81Rfo) was previously described, and compared the two alleles and their PPR genes. We identified a ca 13 kb long fragment, likely originating from another part of the radish genome, inserted into the L7rfo sequence. The L7rfo allele carries two genes (PPR-1 and PPR-2) closely related to the three previously described PPR genes of the restorer D81Rfo allele (PPR-A, PPR-B, and PPR-C). Our results indicate that alleles of the Rfo locus have experienced complex evolutionary events, including recombination and insertion of extra-locus sequences, since they diverged. Our analyses strongly suggest that present coding sequences of Rfo PPR genes result from intragenic recombination. We found that the 10 C-terminal PPR repeats in Rfo PPR gene encoded proteins result from the tandem duplication of a 5 PPR repeat block. Conclusions: The Rfo locus appears to experience more complex evolution than its flanking sequences. The Rfo locus and PPR genes therein are likely to evolve as a result of intergenic and intragenic recombination. It is therefore not possible to determine which genes on the two alleles are direct orthologs. Our observations recall some previously reported data on pathogen resistance complex loci.

  9. Probing the sequence space available for HIV-1 evolution

    NARCIS (Netherlands)

    ter Brake, Olivier; Von Eije, Karin J.; Berkhout, Ben

    2008-01-01

    We designed a novel experimental approach to probe the sequence space available for HIV-1 evolution. Selective pressure was put on conserved HIV-1 genomic sequences by means of RNA interference (RNAi). Virus escape was monitored in many parallel cultures, and we scored the mutations selected in the

  10. Experimental evolution, genetic analysis and genome re-sequencing reveal the mutation conferring artemisinin resistance in an isogenic lineage of malaria parasites

    KAUST Repository

    Hunt, Paul

    2010-09-16

    Background: Classical and quantitative linkage analyses of genetic crosses have traditionally been used to map genes of interest, such as those conferring chloroquine or quinine resistance in malaria parasites. Next-generation sequencing technologies now present the possibility of determining genome-wide genetic variation at single base-pair resolution. Here, we combine in vivo experimental evolution, a rapid genetic strategy and whole genome re-sequencing to identify the precise genetic basis of artemisinin resistance in a lineage of the rodent malaria parasite, Plasmodium chabaudi. Such genetic markers will further the investigation of resistance and its control in natural infections of the human malaria, P. falciparum. Results: A lineage of isogenic in vivo drug-selected mutant P. chabaudi parasites was investigated. By measuring the artemisinin responses of these clones, the appearance of an in vivo artemisinin resistance phenotype within the lineage was defined. The underlying genetic locus was mapped to a region of chromosome 2 by Linkage Group Selection in two different genetic crosses. Whole-genome deep coverage short-read re-sequencing (IlluminaSolexa) defined the point mutations, insertions, deletions and copy-number variations arising in the lineage. Eight point mutations arise within the mutant lineage, only one of which appears on chromosome 2. This missense mutation arises contemporaneously with artemisinin resistance and maps to a gene encoding a de-ubiquitinating enzyme. Conclusions: This integrated approach facilitates the rapid identification of mutations conferring selectable phenotypes, without prior knowledge of biological and molecular mechanisms. For malaria, this model can identify candidate genes before resistant parasites are commonly observed in natural human malaria populations. © 2010 Hunt et al; licensee BioMed Central Ltd.

  11. Marsupial Genome Sequences: Providing Insight into Evolution and Disease

    Directory of Open Access Journals (Sweden)

    Janine E. Deakin

    2012-01-01

    Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be sequenced in the near future, making this a particularly exciting time in marsupial genomics. The emergence of a transmissible cancer, which is obliterating the Tasmanian devil population, has increased the importance of obtaining and analysing marsupial genome sequence for understanding such diseases as well as for conservation efforts. In addition, these genome sequences have facilitated studies aimed at answering questions regarding gene and genome evolution and provided insight into the evolution of epigenetic mechanisms. Here I highlight the major advances in our understanding of evolution and disease, facilitated by marsupial genome projects, and speculate on the future contributions to be made by such sequences.

  12. Whole genome sequence analysis of the first Australian OXA-48-producing outbreak-associated Klebsiella pneumoniae isolates: the resistome and in vivo evolution.

    Directory of Open Access Journals (Sweden)

    Björn A Espedido

    Whole genome sequencing was used to characterize the resistome of intensive care unit (ICU) outbreak-associated carbapenem-resistant K. pneumoniae isolates. Importantly, and of particular concern, the carbapenem-hydrolyzing β-lactamase gene bla(OXA-48) and the extended-spectrum β-lactamase gene bla(CTX-M-14) were identified on a single broad host-range conjugative plasmid. This represents the first report of bla(OXA-48) in Australia and highlights the importance of resistance gene surveillance, as such plasmids can silently spread amongst enterobacterial populations and have the potential to drastically limit treatment options. Furthermore, the in vivo evolution of these isolates was also examined after 18 months of intra-abdominal carriage in a patient that transited through the ICU during the outbreak period. Reflecting the clonality of K. pneumoniae, only 11 single nucleotide polymorphisms (SNPs) were accumulated during this time-period and many of these were associated with genes involved in tolerance/resistance to antibiotics, metals or organic solvents, and transcriptional regulation. Collectively, these SNPs are likely to be associated with changes in virulence (at least to some extent) that have refined the in vivo colonization capacity of the original outbreak isolate.

  13. Simulating DNA coding sequence evolution with EvolveAGene 3.

    Science.gov (United States)

    Hall, Barry G

    2008-04-01

    Phylogenetic reconstruction based upon multiple alignments of molecular sequences is important to most branches of modern biology and is central to molecular evolution. Understanding the historical relationships among macromolecules depends upon computer programs that implement a variety of analytical methods. Because it is impossible to know those historical relationships with certainty, assessment of the accuracy of methods and the programs that implement them requires the use of programs that realistically simulate the evolution of DNA sequences. EvolveAGene 3 is a realistic coding sequence simulation program that separates mutation from selection and allows the user to set selection conditions, including variable regions of selection intensity within the sequence and variation in intensity of selection over branches. Variation includes base substitutions, insertions, and deletions. To the best of my knowledge, it is the only program available that simulates the evolution of intact coding sequences. Output includes the true tree and true alignments of the resulting coding sequence and corresponding protein sequences. A log file reports the frequencies of each kind of base substitution, the ratio of transition to transversion substitutions, the ratio of indel to base substitution mutations, and the numbers of silent and amino acid replacement mutations. The realism of the data sets has been assessed by comparing the d(N)/d(S) ratio, the ratio of transition to transversion substitutions, and the ratio of indel to base substitution mutations of the simulated data sets with those parameters of real data sets from the "gold standard" BaliBase collection of structural alignments. Results show that the data sets produced by EvolveAGene 3 are very similar to real data sets, and EvolveAGene 3 is therefore a realistic simulation program that can be used to evaluate a variety of programs and methods in molecular evolution.

  14. Understanding Cancer Genome and Its Evolution by Next Generation Sequencing

    DEFF Research Database (Denmark)

    Hou, Yong

    evolution by NGS, we first developed a high throughput single cell sequencing (SCS) pipeline on whole exome and transcriptome and updated the pipeline after systematically reviewing the existing single cell whole genome amplification (WGA) and whole transcriptome amplification methods. Using SCS pipeline we...

  15. Determinants of the rate of protein sequence evolution

    Science.gov (United States)

    Zhang, Jianzhi; Yang, Jian-Rong

    2015-01-01

    The rate and mechanism of protein sequence evolution have been central questions in evolutionary biology since the 1960s. Although the rate of protein sequence evolution depends primarily on the level of functional constraint, exactly what constitutes functional constraint has remained unclear. The increasing availability of genomic data has allowed for much needed empirical examinations on the nature of functional constraint. These studies found that the evolutionary rate of a protein is predominantly influenced by its expression level rather than functional importance. A combination of theoretical and empirical analyses have identified multiple mechanisms behind these observations and demonstrated a prominent role that selection against errors in molecular and cellular processes plays in protein evolution. PMID:26055156

  16. Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome.

    Science.gov (United States)

    Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

    2016-10-01

    Accelerated evolution of regulatory sequences can alter the expression pattern of target genes, and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes.

  17. Mutation of miRNA target sequences during human evolution

    DEFF Research Database (Denmark)

    Gardner, Paul P; Vinther, Jeppe

    2008-01-01

    It has long been hypothesized that changes in non-protein-coding genes and the regulatory sequences controlling expression could undergo positive selection. Here we identify 402 putative microRNA (miRNA) target sequences that have been mutated specifically in the human lineage and show that genes...... containing such deletions are more highly expressed than their mouse orthologs. Our findings indicate that some miRNA target mutations are fixed by positive selection and might have been involved in the evolution of human-specific traits....

  18. Structural Approaches to Sequence Evolution Molecules, Networks, Populations

    CERN Document Server

    Bastolla, Ugo; Roman, H. Eduardo; Vendruscolo, Michele

    2007-01-01

    Structural requirements constrain the evolution of biological entities at all levels, from macromolecules to their networks, right up to populations of biological organisms. Classical models of molecular evolution, however, are focused at the level of the symbols - the biological sequence - rather than that of their resulting structure. Now recent advances in understanding the thermodynamics of macromolecules, the topological properties of gene networks, the organization and mutation capabilities of genomes, and the structure of populations make it possible to incorporate these key elements into a broader and deeply interdisciplinary view of molecular evolution. This book gives an account of such a new approach, through clear tutorial contributions by leading scientists specializing in the different fields involved.

  19. Sequence Evolution Under Constraints: Lessons Learned from Sudoku.

    Science.gov (United States)

    Hu, Yucheng; Yang, Gongrong

    2016-10-01

    The complex structures of all proteins in nature are outcomes of a random walk driven by mutation and selection. Reconstructing the fitness landscape staging this process based on first-principle physical rules or experimental measurements is difficult. In this article we turn the popular Sudoku game into an artificial fitness landscape and use it as a model system to study sequence evolution under constraints. The Sudoku rules, which are human-mind friendly, intertwine a rugged landscape for sequences composed of digits, mimicking the functional constraints felt by a tightly folded protein. Simulated evolution reveals interesting properties of the valley-crossing dynamics on this complex landscape. It is found that (i) the mutation accumulation rate during valley-crossing is constant among different evolutionary pathways and depends on the ruggedness of the landscape; (ii) genetic drift and neutral networks play constructive roles during the process of searching for novel functions; and (iii) under strong selection, gene duplication can speed up the evolution by relaxing, but not completely liberating, the redundant copy from selective pressure. Insights gained from this prototype model may help us understand the evolution of real proteins.
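
    To make the idea of a Sudoku-based fitness landscape concrete, the sketch below scores an 81-digit sequence by the number of duplicated digits in its rows, columns and 3x3 boxes, and runs a simple mutation-selection walk that accepts point mutations which do not increase that score. The scoring function, the acceptance rule and all names are assumptions made for illustration; the abstract does not specify the authors' fitness definition, and this strict rule cannot cross fitness valleys as their simulations do.

```python
# Minimal sketch of sequence evolution on a Sudoku-like landscape.
# The violation score and the acceptance rule are illustrative assumptions.
import random


def sudoku_violations(seq):
    """Count duplicated digits over all rows, columns and 3x3 boxes.

    `seq` is a sequence of 81 digits (1-9) read row by row; 0 means solved.
    """
    if len(seq) != 81:
        raise ValueError("a Sudoku sequence must contain 81 digits")
    units = []
    for i in range(9):
        units.append([seq[9 * i + j] for j in range(9)])  # row i
        units.append([seq[9 * j + i] for j in range(9)])  # column i
    for br in range(3):
        for bc in range(3):
            units.append([seq[9 * (3 * br + r) + 3 * bc + c]
                          for r in range(3) for c in range(3)])  # 3x3 box
    # Each unit contributes the number of digits beyond the first occurrence.
    return sum(9 - len(set(unit)) for unit in units)


def evolve(seq, steps, rng):
    """Apply random point mutations, keeping neutral or beneficial ones."""
    seq = list(seq)
    current = sudoku_violations(seq)
    for _ in range(steps):
        pos = rng.randrange(81)
        old = seq[pos]
        seq[pos] = rng.randint(1, 9)  # random point mutation
        new = sudoku_violations(seq)
        if new <= current:
            current = new  # accept neutral or beneficial change
        else:
            seq[pos] = old  # reject deleterious change
    return seq, current


if __name__ == "__main__":
    start = [1] * 81
    final_seq, final_score = evolve(start, 5000, random.Random(0))
    print(sudoku_violations(start), final_score)
```

    Accepting neutral mutations lets the walk drift along plateaus of equal score, a loose analogue of the neutral networks mentioned in the abstract.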

  20. Genome sequence of the brown Norway rat yields insights into mammalian evolution

    Energy Technology Data Exchange (ETDEWEB)

    Gibbs, Richard A.; Weinstock, George M.; Metzker, Michael L.; Muzny, Donna M.; Sodergren, Erica J.; Scherer, Steven; Scott, Graham; Steffen, David; Worley, Kim C.; Burch, Paula E.; Okwuonu, Geoffrey; Hines, Sandra; Lewis, Lora; DeRamo, Christine; Delgado, Oliver; Dugan-Rocha, Shannon; Miner, George; Morgan, Margaret; Hawes, Alicia; Gill, Rachel; Holt, Robert A.; Adams, Mark D.; Amanatides, Peter G.; Baden-Tillson, Holly; Barnstead, Mary; Chin, Soo; Evans, Cheryl A.; Ferriera, Steven; Fosler, Carl; Glodek, Anna; Gu, Zhiping; Jennings, Don; Kraft, Cheryl L.; Nguyen, Trixie; Pfannkoch, Cynthia M.; Sitter, Cynthia; Sutton, Granger G.; Venter, J. Craig; Woodage, Trevor; Smith, Douglas; Lee, Hong-Maei; Gustafson, Erik; Cahill, Patrick; Kana, Arnold; Doucette-Stamm, Lynn; Weinstock, Keith; Fechtel, Kim; Weiss, Robert B.; Dunn, Diane M.; Green, Eric D.; Blakesley, Robert W.; Bouffard, Gerard G.; de Jong, Pieter J.; Osoegawa, Kazutoyo; Zhu, Baoli; Marra, Marco; Schein, Jacqueline; Bosdet, Ian; Fjell, Chris; Jones, Steven; Krzywinski, Martin; Mathewson, Carrie; Siddiqui, Asim; Wye, Natasja; McPherson, John; Zhao, Shaying; Fraser, Claire M.; Shetty, Jyoti; Shatsman, Sofiya; Geer, Keita; Chen, Yixin; Abramzon, Sofyia; Nierman, William C.; Havlak, Paul H.; Chen, Rui; Durbin, K. James; Egan, Amy; Ren, Yanru; Song, Xing-Zhi; Li, Bingshan; Liu, Yue; Qin, Xiang; Cawley, Simon; Cooney, A.J.; D' Souza, Lisa M.; Martin, Kirt; Wu, Jia Qian; Gonzalez-Garay, Manuel L.; Jackson, Andrew R.; Kalafus, Kenneth J.; McLeod, Michael P.; Milosavljevic, Aleksandar; Virk, Davinder; Volkov, Andrei; Wheeler, David A.; Zhang, Zhengdong; Bailey, Jeffrey A.; Eichler, Evan E.; Tuzun, Eray; Birney, Ewan; Mongin, Emmanuel; Ureta-Vidal, Abel; Woodwark, Cara; Zdobnov, Evgeny; Bork, Peer; Suyama, Mikita; Torrents, David; Alexandersson, Marina; Trask, Barbara J.; Young, Janet M.; et al.

    2004-02-02

    The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90 percent of the genome. The BN rat sequence is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution. This first comprehensive analysis includes genes and proteins and their relation to human disease, repeated sequences, comparative genome-wide studies of mammalian orthologous chromosomal regions and rearrangement breakpoints, reconstruction of ancestral karyotypes and the events leading to existing species, rates of variation, and lineage-specific and lineage-independent evolutionary events such as expansion of gene families, orthology relations and protein evolution.

  1. Genome sequence of the Brown Norway rat yields insights into mammalian evolution.

    Science.gov (United States)

    Gibbs, Richard A; Weinstock, George M; Metzker, Michael L; Muzny, Donna M; Sodergren, Erica J; Scherer, Steven; Scott, Graham; Steffen, David; Worley, Kim C; Burch, Paula E; Okwuonu, Geoffrey; Hines, Sandra; Lewis, Lora; DeRamo, Christine; Delgado, Oliver; Dugan-Rocha, Shannon; Miner, George; Morgan, Margaret; Hawes, Alicia; Gill, Rachel; Celera; Holt, Robert A; Adams, Mark D; Amanatides, Peter G; Baden-Tillson, Holly; Barnstead, Mary; Chin, Soo; Evans, Cheryl A; Ferriera, Steve; Fosler, Carl; Glodek, Anna; Gu, Zhiping; Jennings, Don; Kraft, Cheryl L; Nguyen, Trixie; Pfannkoch, Cynthia M; Sitter, Cynthia; Sutton, Granger G; Venter, J Craig; Woodage, Trevor; Smith, Douglas; Lee, Hong-Mei; Gustafson, Erik; Cahill, Patrick; Kana, Arnold; Doucette-Stamm, Lynn; Weinstock, Keith; Fechtel, Kim; Weiss, Robert B; Dunn, Diane M; Green, Eric D; Blakesley, Robert W; Bouffard, Gerard G; De Jong, Pieter J; Osoegawa, Kazutoyo; Zhu, Baoli; Marra, Marco; Schein, Jacqueline; Bosdet, Ian; Fjell, Chris; Jones, Steven; Krzywinski, Martin; Mathewson, Carrie; Siddiqui, Asim; Wye, Natasja; McPherson, John; Zhao, Shaying; Fraser, Claire M; Shetty, Jyoti; Shatsman, Sofiya; Geer, Keita; Chen, Yixin; Abramzon, Sofyia; Nierman, William C; Havlak, Paul H; Chen, Rui; Durbin, K James; Egan, Amy; Ren, Yanru; Song, Xing-Zhi; Li, Bingshan; Liu, Yue; Qin, Xiang; Cawley, Simon; Worley, Kim C; Cooney, A J; D'Souza, Lisa M; Martin, Kirt; Wu, Jia Qian; Gonzalez-Garay, Manuel L; Jackson, Andrew R; Kalafus, Kenneth J; McLeod, Michael P; Milosavljevic, Aleksandar; Virk, Davinder; Volkov, Andrei; Wheeler, David A; Zhang, Zhengdong; Bailey, Jeffrey A; Eichler, Evan E; Tuzun, Eray; Birney, Ewan; Mongin, Emmanuel; Ureta-Vidal, Abel; Woodwark, Cara; Zdobnov, Evgeny; Bork, Peer; Suyama, Mikita; Torrents, David; Alexandersson, Marina; Trask, Barbara J; Young, Janet M; Huang, Hui; Wang, Huajun; Xing, Heming; Daniels, Sue; Gietzen, Darryl; Schmidt, Jeanette; Stevens, Kristian; Vitt, Ursula; Wingrove, Jim; 
Camara, Francisco; Mar Albà, M; Abril, Josep F; Guigo, Roderic; Smit, Arian; Dubchak, Inna; Rubin, Edward M; Couronne, Olivier; Poliakov, Alexander; Hübner, Norbert; Ganten, Detlev; Goesele, Claudia; Hummel, Oliver; Kreitler, Thomas; Lee, Young-Ae; Monti, Jan; Schulz, Herbert; Zimdahl, Heike; Himmelbauer, Heinz; Lehrach, Hans; Jacob, Howard J; Bromberg, Susan; Gullings-Handley, Jo; Jensen-Seaman, Michael I; Kwitek, Anne E; Lazar, Jozef; Pasko, Dean; Tonellato, Peter J; Twigger, Simon; Ponting, Chris P; Duarte, Jose M; Rice, Stephen; Goodstadt, Leo; Beatson, Scott A; Emes, Richard D; Winter, Eitan E; Webber, Caleb; Brandt, Petra; Nyakatura, Gerald; Adetobi, Margaret; Chiaromonte, Francesca; Elnitski, Laura; Eswara, Pallavi; Hardison, Ross C; Hou, Minmei; Kolbe, Diana; Makova, Kateryna; Miller, Webb; Nekrutenko, Anton; Riemer, Cathy; Schwartz, Scott; Taylor, James; Yang, Shan; Zhang, Yi; Lindpaintner, Klaus; Andrews, T Dan; Caccamo, Mario; Clamp, Michele; Clarke, Laura; Curwen, Valerie; Durbin, Richard; Eyras, Eduardo; Searle, Stephen M; Cooper, Gregory M; Batzoglou, Serafim; Brudno, Michael; Sidow, Arend; Stone, Eric A; Venter, J Craig; Payseur, Bret A; Bourque, Guillaume; López-Otín, Carlos; Puente, Xose S; Chakrabarti, Kushal; Chatterji, Sourav; Dewey, Colin; Pachter, Lior; Bray, Nicolas; Yap, Von Bing; Caspi, Anat; Tesler, Glenn; Pevzner, Pavel A; Haussler, David; Roskin, Krishna M; Baertsch, Robert; Clawson, Hiram; Furey, Terrence S; Hinrichs, Angie S; Karolchik, Donna; Kent, William J; Rosenbloom, Kate R; Trumbower, Heather; Weirauch, Matt; Cooper, David N; Stenson, Peter D; Ma, Bin; Brent, Michael; Arumugam, Manimozhiyan; Shteynberg, David; Copley, Richard R; Taylor, Martin S; Riethman, Harold; Mudunuri, Uma; Peterson, Jane; Guyer, Mark; Felsenfeld, Adam; Old, Susan; Mockrin, Stephen; Collins, Francis

    2004-04-01

    The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome. The BN rat sequence is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution. This first comprehensive analysis includes genes and proteins and their relation to human disease, repeated sequences, comparative genome-wide studies of mammalian orthologous chromosomal regions and rearrangement breakpoints, reconstruction of ancestral karyotypes and the events leading to existing species, rates of variation, and lineage-specific and lineage-independent evolutionary events such as expansion of gene families, orthology relations and protein evolution.

  2. Integrated sequence analysis. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Andersson, K.; Pyy, P

    1998-02-01

    The NKS/RAK subproject 3 'integrated sequence analysis' (ISA) was formulated with the overall objective to develop and to test integrated methodologies in order to evaluate event sequences with significant human action contribution. The term 'methodology' denotes not only technical tools but also methods for integration of different scientific disciplines. In this report, we first discuss the background of ISA and the surveys made to map methods in different application fields, such as man machine system simulation software, human reliability analysis (HRA) and expert judgement. Specific event sequences were, after the surveys, selected for application and testing of a number of ISA methods. The event sequences discussed in the report were cold overpressure of BWR, shutdown LOCA of BWR, steam generator tube rupture of a PWR and BWR disturbed signal view in the control room after an external event. Different teams analysed these sequences by using different ISA and HRA methods. Two kinds of results were obtained from the ISA project: sequence specific and more general findings. The sequence specific results are discussed together with each sequence description. The general lessons are discussed under a separate chapter by using comparisons of different case studies. These lessons include areas ranging from plant safety management (design, procedures, instrumentation, operations, maintenance and safety practices) to methodological findings (ISA methodology, PSA, HRA, physical analyses, behavioural analyses and uncertainty assessment). Finally, the project is discussed and conclusions are presented. An interdisciplinary study of complex phenomena is a natural way to produce valuable and innovative results. This project came up with structured ways to perform ISA and managed to apply them in practice. The project also highlighted some areas where more work is needed. In the HRA work, development is required for the use of simulators and expert judgement as

  3. Biophysical and structural considerations for protein sequence evolution

    Directory of Open Access Journals (Sweden)

    Grahnen Johan A

    2011-12-01

    Background: Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolution do not take this into account. Recent attempts to integrate the influence of structure and biophysics into phylogenetic models via statistical/informational approaches have not resulted in expected improvements in model performance. This suggests that further innovations are needed for progress in this field. Results: Here we develop a coarse-grained physics-based model of protein folding and binding function, and compare it to a popular informational model. We find that both models violate the assumption of the native sequence being close to a thermodynamic optimum, causing directional selection away from the native state. Sampling and simulation show that the physics-based model is more specific for fold-defining interactions that vary less among residue types. The informational model diffuses further in sequence space with fewer barriers and tends to provide less support for an invariant sites model, although amino acid substitutions are generally conservative. Both approaches produce sequences with natural features like dN/dS ... Conclusions: Simple coarse-grained models of protein folding can describe some natural features of evolving proteins but are currently not accurate enough to use in evolutionary inference. This is partly due to improper packing of the hydrophobic core. We suggest possible improvements on the representation of structure, folding energy, and binding function, as regards both native and non-native conformations, and describe a large number of possible applications for such a model.

  4. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Energy Technology Data Exchange (ETDEWEB)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  5. The pig X and Y Chromosomes: structure, sequence, and evolution.

    Science.gov (United States)

    Skinner, Benjamin M; Sargent, Carole A; Churcher, Carol; Hunt, Toby; Herrero, Javier; Loveland, Jane E; Dunn, Matt; Louzada, Sandra; Fu, Beiyuan; Chow, William; Gilbert, James; Austin-Guest, Siobhan; Beal, Kathryn; Carvalho-Silva, Denise; Cheng, William; Gordon, Daria; Grafham, Darren; Hardy, Matt; Harley, Jo; Hauser, Heidi; Howden, Philip; Howe, Kerstin; Lachani, Kim; Ellis, Peter J I; Kelly, Daniel; Kerry, Giselle; Kerwin, James; Ng, Bee Ling; Threadgold, Glen; Wileman, Thomas; Wood, Jonathan M D; Yang, Fengtang; Harrow, Jen; Affara, Nabeel A; Tyler-Smith, Chris

    2016-01-01

    We have generated an improved assembly and gene annotation of the pig X Chromosome, and a first draft assembly of the pig Y Chromosome, by sequencing BAC and fosmid clones from Duroc animals and incorporating information from optical mapping and fiber-FISH. The X Chromosome carries 1033 annotated genes, 690 of which are protein coding. Gene order closely matches that found in primates (including humans) and carnivores (including cats and dogs), which is inferred to be ancestral. Nevertheless, several protein-coding genes present on the human X Chromosome were absent from the pig, and 38 pig-specific X-chromosomal genes were annotated, 22 of which were olfactory receptors. The pig Y-specific Chromosome sequence generated here comprises 30 megabases (Mb). A 15-Mb subset of this sequence was assembled, revealing two clusters of male-specific low copy number genes, separated by an ampliconic region including the HSFY gene family, which together make up most of the short arm. Both clusters contain palindromes with high sequence identity, presumably maintained by gene conversion. Many of the ancestral X-related genes previously reported in at least one mammalian Y Chromosome are represented either as active genes or partial sequences. This sequencing project has allowed us to identify genes--both single copy and amplified--on the pig Y Chromosome, to compare the pig X and Y Chromosomes for homologous sequences, and thereby to reveal mechanisms underlying pig X and Y Chromosome evolution. © 2016 Skinner et al.; Published by Cold Spring Harbor Laboratory Press.

  6. DNA Sequence Evolution and Rare Homoeologous Conversion in Tetraploid Cotton.

    Directory of Open Access Journals (Sweden)

    Justin T Page

    2016-05-01

    Allotetraploid cotton species are a vital source of spinnable fiber for textiles. The polyploid nature of the cotton genome raises many evolutionary questions as to the relationships between duplicated genomes. We describe the evolution of the cotton genome (SNPs and structural variants) with the greatly improved resolution of 34 deeply re-sequenced genomes. We also explore the evolution of homoeologous regions in the AT- and DT-genomes and especially the phenomenon of conversion between genomes. We did not find any compelling evidence for homoeologous conversion between genomes. These findings are very different from other recent reports of frequent conversion events between genomes. We also identified several distinct regions of the genome that have been introgressed between G. hirsutum and G. barbadense, which presumably resulted from breeding efforts targeting associated beneficial alleles. Finally, the genotypic data resulting from this study provide access to a wealth of diversity sorely needed in the narrow germplasm of cotton cultivars.

  7. A branch-heterogeneous model of protein evolution for efficient inference of ancestral sequences.

    Science.gov (United States)

    Groussin, M; Boussau, B; Gouy, M

    2013-07-01

    Most models of nucleotide or amino acid substitution used in phylogenetic studies assume that the evolutionary process has been homogeneous across lineages and that composition of nucleotides or amino acids has remained the same throughout the tree. These oversimplified assumptions are refuted by the observation that compositional variability characterizes extant biological sequences. Branch-heterogeneous models of protein evolution that account for compositional variability have been developed, but are not yet in common use because of the large number of parameters required, leading to high computational costs and potential overparameterization. Here, we present a new branch-nonhomogeneous and nonstationary model of protein evolution that captures more accurately the high complexity of sequence evolution. This model, henceforth called Correspondence and likelihood analysis (COaLA), makes use of a correspondence analysis to reduce the number of parameters to be optimized through maximum likelihood, focusing on most of the compositional variation observed in the data. The model was thoroughly tested on both simulated and biological data sets to show its high performance in terms of data fitting and CPU time. COaLA efficiently estimates ancestral amino acid frequencies and sequences, making it relevant for studies aiming at reconstructing and resurrecting ancestral amino acid sequences. Finally, we applied COaLA on a concatenate of universal amino acid sequences to confirm previous results obtained with a nonhomogeneous Bayesian model regarding the early pattern of adaptation to optimal growth temperature, supporting the mesophilic nature of the Last Universal Common Ancestor.

  8. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
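
    The idea of a "constrained column" can be illustrated with a minimal Python sketch. This is not the Prolog toolkit described in the record: the property classes, the function name constrained_columns, and the toy alignment are all assumptions made for illustration. A column is reported when its residues differ (mutation has occurred) yet all fall within one property class.

```python
# Hypothetical property classes; the record does not list the properties
# the original framework used, so these groupings are assumptions.
PROPERTY_CLASSES = {
    "hydrophobic": set("AVLIMFW"),
    "positive": set("KR"),
    "negative": set("DE"),
    "polar": set("STNQ"),
}


def constrained_columns(alignment):
    """Return (column_index, property) pairs for columns whose residues
    vary but all share one property class. Columns containing gaps or
    unclassified residues are never reported."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    hits = []
    for i in range(length):
        column = {row[i] for row in alignment}
        if len(column) < 2:
            continue  # invariant column: no mutation to be constrained
        for name, members in PROPERTY_CLASSES.items():
            if column <= members:  # every residue belongs to this class
                hits.append((i, name))
                break
    return hits


if __name__ == "__main__":
    toy_alignment = ["MKVLD", "MRILE", "MKLLD"]  # made-up example data
    print(constrained_columns(toy_alignment))
```

    Treating each column as a set keeps the check independent of the number of sequences, which loosely mirrors the record's point that the alignment is handled through abstract operations rather than raw strings.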

  9. Sequence Handling by Sequence Analysis Toolbox v1.0

    DEFF Research Database (Denmark)

    Ingrell, Christian Ravnsborg; Matthiesen, Rune; Jensen, Ole Nørregaard

    2006-01-01

    The fact that mass spectrometry has become a high-throughput method calls for bioinformatic tools for automated sequence handling and prediction. For efficient use of bioinformatic tools, it is important that these tools are integrated or interfaced with each other. The purpose of sequence...... analysis toolbox v1.0 was to have a general purpose sequence analyzing tool that can import sequences obtained by high-throughput sequencing methods. The program includes algorithms for calculation or prediction of isoelectric point, hydropathicity index, transmembrane segments, and glycosylphosphatidyl...
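
    As an illustration of one of the listed calculations, the sketch below computes a hydropathicity index as the mean Kyte-Doolittle hydropathy of a peptide (often called GRAVY). The record does not say which scale or formula the toolbox uses, so the choice of the Kyte-Doolittle scale and the function name gravy are assumptions.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
# Using this particular scale is an assumption; the toolbox's own
# algorithm is not described in the record.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean hydropathy of a protein sequence (positive = more hydrophobic)."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("empty sequence")
    try:
        total = sum(KYTE_DOOLITTLE[aa] for aa in residues)
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None
    return total / len(residues)


if __name__ == "__main__":
    for peptide in ("AAAA", "AR", "MKVLD"):  # made-up example peptides
        print(peptide, round(gravy(peptide), 3))
```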

  10. Sequence diversity and evolution of antimicrobial peptides in invertebrates.

    Science.gov (United States)

    Tassanakajon, Anchalee; Somboonwiwat, Kunlaya; Amparyup, Piti

    2015-02-01

    Antimicrobial peptides (AMPs) are evolutionarily ancient molecules that act as the key components in the invertebrate innate immunity against invading pathogens. Several AMPs have been identified and characterized in invertebrates, and found to display considerable diversity in their amino acid sequence, structure and biological activity. AMP genes appear to have rapidly evolved, which might have arisen from the co-evolutionary arms race between host and pathogens, and enabled organisms to survive in different microbial environments. Here, the sequence diversity of invertebrate AMPs (defensins, cecropins, crustins and anti-lipopolysaccharide factors) is presented to provide a better understanding of the evolution pattern of these peptides that play a major role in host defense mechanisms. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. Convergent sequence evolution between echolocating bats and dolphins.

    Science.gov (United States)

    Liu, Yang; Cotton, James A; Shen, Bin; Han, Xiuqun; Rossiter, Stephen J; Zhang, Shuyi

    2010-01-26

    Cases of convergent evolution - where different lineages have evolved similar traits independently - are common and have proven central to our understanding of selection. Yet convincing examples of adaptive convergence at the sequence level are exceptionally rare [1]. The motor protein Prestin is expressed in mammalian outer hair cells (OHCs) and is thought to confer high frequency sensitivity and selectivity in the mammalian auditory system [2]. We previously reported that the Prestin gene has undergone sequence convergence among unrelated lineages of echolocating bat [3]. Here we report that this gene has also undergone convergent amino acid substitutions in echolocating dolphins, which group with echolocating bats in a phylogenetic tree of Prestin. Furthermore, we find evidence that these changes were driven by natural selection. Copyright 2010 Elsevier Ltd. All rights reserved.

  12. Whole genome sequence analysis of Mycobacterium suricattae

    KAUST Repository

    Dippenaar, Anzaan

    2015-10-21

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  13. Genetic analysis of loop sequences in the let-7 gene family reveal a relationship between loop evolution and multiple isomiRs.

    Directory of Open Access Journals (Sweden)

    Tingming Liang

    While mature miRNAs have been widely studied, the terminal loop sequences are rarely examined despite regulating both primary and mature miRNA functions. Herein, we attempted to understand the evolutionary pattern of loop sequences by analyzing loops in the let-7 gene family. Compared to the stable miRNA length distributions seen in most metazoans, higher metazoan species exhibit a longer length distribution. Examination of these loop sequence length distributions, in addition to phylogenetic tree construction, implicated loop sequences as the main evolutionary drivers in miRNA genes. Moreover, loops from relevant clustered miRNA gene families showed varying length distributions and higher levels of nucleotide divergence, even between homologous pre-miRNA loops. Furthermore, we found that specific nucleotides were dominantly distributed in the 5' and 3' terminal loop ends, which may contribute to the relatively precise cleavage that leads to a stable isomiR expression profile. Overall, this study provides further insight into miRNA processing and maturation and further enriches our understanding of miRNA biogenesis.

  14. Viral sequence analysis from HIV-infected mothers and infants: molecular evolution, diversity, and risk factors for mother-to-child transmission.

    Science.gov (United States)

    Bulterys, Philip L; Dalai, Sudeb C; Katzenstein, David A

    2010-12-01

    Great progress has been made in understanding the pathogenesis, treatment, and transmission of HIV and the factors influencing the risk of mother-to-child transmission (MTCT). Many questions regarding the molecular evolution and genetic diversity of HIV in the context of MTCT remain unanswered. Further research to identify the selective factors governing which variants are transmitted, how the compartmentalization of HIV in different cells and tissues contributes to transmission, and the influence of host immunity, viral diversity, and recombination on MTCT may provide insight into new prevention strategies and the development of an effective HIV vaccine. Copyright © 2010 Elsevier Inc. All rights reserved.

  15. Evolution of networks and sequences in eukaryotic cell cycle control.

    Science.gov (United States)

    Cross, Frederick R; Buchler, Nicolas E; Skotheim, Jan M

    2011-12-27

    The molecular networks regulating the G1-S transition in budding yeast and mammals are strikingly similar in network structure. However, many of the individual proteins performing similar network roles appear to have unrelated amino acid sequences, suggesting either extremely rapid sequence evolution, or true polyphyly of proteins carrying out identical network roles. A yeast/mammal comparison suggests that network topology, and its associated dynamic properties, rather than regulatory proteins themselves may be the most important elements conserved through evolution. However, recent deep phylogenetic studies show that fungal and animal lineages are relatively closely related in the opisthokont branch of eukaryotes. The presence in plants of cell cycle regulators such as Rb, E2F and cyclins A and D, that appear lost in yeast, suggests cell cycle control in the last common ancestor of the eukaryotes was implemented with this set of regulatory proteins. Forward genetics in non-opisthokonts, such as plants or their green algal relatives, will provide direct information on cell cycle control in these organisms, and may elucidate the potentially more complex cell cycle control network of the last common eukaryotic ancestor.

  16. Complete Mitochondrial Genome Sequences of Chinese Indigenous Sheep with Different Tail Types and an Analysis of Phylogenetic Evolution in Domestic Sheep.

    Science.gov (United States)

    Fan, Hongying; Zhao, Fuping; Zhu, Caiye; Li, Fadi; Liu, Jidong; Zhang, Li; Wei, Caihong; Du, Lixin

    2016-05-01

    China has a long history of sheep (Ovis aries [O. aries]) breeding and an abundance of sheep genetic resources. Knowledge of the complete O. aries mitogenome should facilitate the study of the evolutionary history of the species. Therefore, the complete mitogenome of O. aries was sequenced and annotated. In order to characterize the mitogenomes of 3 Chinese sheep breeds (Altay sheep [AL], Shandong large-tailed sheep [SD], and small-tailed Hulun Buir sheep [sHL]), 19 sets of primers were employed to amplify contiguous, overlapping segments of the complete mitochondrial DNA (mtDNA) sequence of each breed. The sizes of the complete mitochondrial genomes of the sHL, AL, and SD breeds were 16,617 bp, 16,613 bp, and 16,613 bp, respectively. The mitochondrial genomes were deposited in the GenBank database with accession numbers KP702285 (AL sheep), KP981378 (SD sheep), and KP981380 (sHL sheep) respectively. The organization of the 3 analyzed sheep mitochondrial genomes was similar, with each consisting of 22 tRNA genes, 2 rRNA genes (12S rRNA and 16S rRNA), 13 protein-coding genes, and 1 control region (D-loop). The NADH dehydrogenase subunit 6 (ND6) and 8 tRNA genes were encoded on the light strand, whereas the rest of the mitochondrial genes were encoded on the heavy strand. The nucleotide skewness of the coding strands of the 3 analyzed mitogenomes was biased toward A and T. We constructed a phylogenetic tree using the complete mitogenomes of each type of sheep to allow us to understand the genetic relationships between Chinese breeds of O. aries and those developed and utilized in other countries. Our findings provide important information regarding the O. aries mitogenome and the evolutionary history of O. aries inside and outside China. In addition, our results provide a foundation for further exploration of the taxonomic status of O. aries.
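
    The composition bias mentioned above can be quantified with a short Python sketch. The record does not state which formulas were used; the definitions below, A+T content, AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C), are commonly used ones and are assumed here, as is the function name composition_bias.

```python
from collections import Counter


def composition_bias(sequence):
    """Return A+T content, AT skew and GC skew for a DNA sequence.

    Only A, T, G and C are counted; ambiguous bases such as N are ignored.
    """
    counts = Counter(sequence.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c
    if total == 0:
        raise ValueError("sequence contains no A, T, G or C")

    def skew(x, y):
        # Defined as 0.0 when neither base occurs, to avoid dividing by zero.
        return (x - y) / (x + y) if (x + y) else 0.0

    return {
        "at_content": (a + t) / total,
        "at_skew": skew(a, t),
        "gc_skew": skew(g, c),
    }


if __name__ == "__main__":
    # Made-up fragment, not taken from the deposited mitogenomes.
    print(composition_bias("AAATGGC"))
```

    Under these definitions, an A+T content above 0.5 indicates a bias toward A and T overall, while the sign of each skew shows which base of the pair is in excess on the strand analysed.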

  17. Genome sequence diversity and clues to the evolution of variola (smallpox) virus.

    Science.gov (United States)

    Esposito, Joseph J; Sammons, Scott A; Frace, A Michael; Osborne, John D; Olsen-Rasmussen, Melissa; Zhang, Ming; Govil, Dhwani; Damon, Inger K; Kline, Richard; Laker, Miriam; Li, Yu; Smith, Geoffrey L; Meyer, Hermann; Leduc, James W; Wohlhueter, Robert M

    2006-08-11

    Comparative genomics of 45 epidemiologically varied variola virus isolates from the past 30 years of the smallpox era indicate low sequence diversity, suggesting that there is probably little difference in the isolates' functional gene content. Phylogenetic clustering inferred three clades coincident with their geographical origin and case-fatality rate; the latter implicated putative proteins that mediate viral virulence differences. Analysis of the viral linear DNA genome suggests that its evolution involved direct descent and DNA end-region recombination events. Knowing the sequences will help understand the viral proteome and improve diagnostic test precision, therapeutics, and systems for their assessment.

  18. Clinical sequencing uncovers origins and evolution of Lassa virus

    Science.gov (United States)

    Andersen, Kristian G.; Shapiro, B. Jesse; Matranga, Christian B.; Sealfon, Rachel; Lin, Aaron E.; Moses, Lina M.; Folarin, Onikepe A.; Goba, Augustine; Odia, Ikponmwonsa; Ehiane, Philomena E.; Momoh, Mambu; England, Eleina M.; Winnicki, Sarah; Branco, Luis M.; Gire, Stephen K.; Phelan, Eric; Tariyal, Ridhi; Tewhey, Ryan; Omoniwa, Omowunmi; Fullah, Mohammed; Fonnie, Richard; Fonnie, Mbalu; Kanneh, Lansana; Jalloh, Simbirie; Gbakie, Michael; Saffa, Sidiki; Karbo, Kandeh; Gladden, Adrianne D.; Qu, James; Stremlau, Matthew; Nekoui, Mahan; Finucane, Hilary K.; Tabrizi, Shervin; Vitti, Joseph J.; Birren, Bruce; Fitzgerald, Michael; McCowan, Caryn; Ireland, Andrea; Berlin, Aaron M.; Bochicchio, James; Tazon-Vega, Barbara; Lennon, Niall J.; Ryan, Elizabeth M.; Bjornson, Zach; Milner, Danny A.; Lukens, Amanda K.; Broodie, Nisha; Rowland, Megan; Heinrich, Megan; Akdag, Marjan; Schieffelin, John S.; Levy, Danielle; Akpan, Henry; Bausch, Daniel G.; Rubins, Kathleen; McCormick, Joseph B.; Lander, Eric S.; Günther, Stephan; Hensley, Lisa; Okogbenin, Sylvanus; Schaffner, Stephen F.; Okokhere, Peter O.; Khan, S. Humarr; Grant, Donald S.; Akpede, George O.; Asogun, Danny A.; Gnirke, Andreas; Levin, Joshua Z.; Happi, Christian T.; Garry, Robert F.; Sabeti, Pardis C.

    2015-01-01

    Summary The 2013-2015 West African epidemic of Ebola virus disease (EVD) reminds us how little is known about biosafety level-4 viruses. Like Ebola virus, Lassa virus (LASV) can cause hemorrhagic fever with high case fatality rates. We generated a genomic catalog of almost 200 LASV sequences from clinical and rodent reservoir samples. We show that whereas the 2013-2015 EVD epidemic is fueled by human-to-human transmissions, LASV infections mainly result from reservoir-to-human infections. We elucidated the spread of LASV across West Africa and show that this migration was accompanied by changes in LASV genome abundance, fatality rates, codon adaptation, and translational efficiency. By investigating intrahost evolution, we found that mutations accumulate in epitopes of viral surface proteins, suggesting selection for immune escape. This catalog will serve as a foundation for the development of vaccines and diagnostics. PMID:26276630

  19. Clinical Sequencing Uncovers Origins and Evolution of Lassa Virus.

    Science.gov (United States)

    Andersen, Kristian G; Shapiro, B Jesse; Matranga, Christian B; Sealfon, Rachel; Lin, Aaron E; Moses, Lina M; Folarin, Onikepe A; Goba, Augustine; Odia, Ikponmwonsa; Ehiane, Philomena E; Momoh, Mambu; England, Eleina M; Winnicki, Sarah; Branco, Luis M; Gire, Stephen K; Phelan, Eric; Tariyal, Ridhi; Tewhey, Ryan; Omoniwa, Omowunmi; Fullah, Mohammed; Fonnie, Richard; Fonnie, Mbalu; Kanneh, Lansana; Jalloh, Simbirie; Gbakie, Michael; Saffa, Sidiki; Karbo, Kandeh; Gladden, Adrianne D; Qu, James; Stremlau, Matthew; Nekoui, Mahan; Finucane, Hilary K; Tabrizi, Shervin; Vitti, Joseph J; Birren, Bruce; Fitzgerald, Michael; McCowan, Caryn; Ireland, Andrea; Berlin, Aaron M; Bochicchio, James; Tazon-Vega, Barbara; Lennon, Niall J; Ryan, Elizabeth M; Bjornson, Zach; Milner, Danny A; Lukens, Amanda K; Broodie, Nisha; Rowland, Megan; Heinrich, Megan; Akdag, Marjan; Schieffelin, John S; Levy, Danielle; Akpan, Henry; Bausch, Daniel G; Rubins, Kathleen; McCormick, Joseph B; Lander, Eric S; Günther, Stephan; Hensley, Lisa; Okogbenin, Sylvanus; Schaffner, Stephen F; Okokhere, Peter O; Khan, S Humarr; Grant, Donald S; Akpede, George O; Asogun, Danny A; Gnirke, Andreas; Levin, Joshua Z; Happi, Christian T; Garry, Robert F; Sabeti, Pardis C

    2015-08-13

    The 2013-2015 West African epidemic of Ebola virus disease (EVD) reminds us of how little is known about biosafety level 4 viruses. Like Ebola virus, Lassa virus (LASV) can cause hemorrhagic fever with high case fatality rates. We generated a genomic catalog of almost 200 LASV sequences from clinical and rodent reservoir samples. We show that whereas the 2013-2015 EVD epidemic is fueled by human-to-human transmissions, LASV infections mainly result from reservoir-to-human infections. We elucidated the spread of LASV across West Africa and show that this migration was accompanied by changes in LASV genome abundance, fatality rates, codon adaptation, and translational efficiency. By investigating intrahost evolution, we found that mutations accumulate in epitopes of viral surface proteins, suggesting selection for immune escape. This catalog will serve as a foundation for the development of vaccines and diagnostics. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. Implications of the plastid genome sequence of Typha (Typhaceae, Poales) for understanding genome evolution in Poaceae.

    Science.gov (United States)

    Guisinger, Mary M; Chumley, Timothy W; Kuehl, Jennifer V; Boore, Jeffrey L; Jansen, Robert K

    2010-02-01

    Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the first non-grass Poales sequenced to date, and we present comparisons of genome organization and sequence evolution within Poales. Our results confirm that grass plastid genomes exhibit acceleration in both genomic rearrangements and nucleotide substitutions. Poaceae have multiple structural rearrangements, including three inversions, three gene losses (accD, ycf1, ycf2), intron losses in two genes (clpP, rpoC1), and expansion of the inverted repeat (IR) into both large and small single-copy regions. These rearrangements are restricted to the Poaceae, and IR expansion into the small single-copy region correlates with the phylogeny of the family. Comparisons of 73 protein-coding genes for 47 angiosperms including nine Poaceae genera confirm that the branch leading to Poaceae has significantly accelerated rates of change relative to other monocots and angiosperms. Furthermore, rates of sequence evolution within grasses are lower, indicating a deceleration during diversification of the family. Overall there is a strong correlation between accelerated rates of genomic rearrangements and nucleotide substitutions in Poaceae, a phenomenon that has been noted recently throughout angiosperms. The cause of the correlation is unknown, but faulty DNA repair has been suggested in other systems including bacterial and animal mitochondrial genomes.

  1. Conservation of tubulin-binding sequences in TRPV1 throughout evolution.

    Directory of Open Access Journals (Sweden)

    Puspendu Sardar

    Transient Receptor Potential Vanilloid subtype 1 (TRPV1), commonly known as the capsaicin receptor, can detect multiple stimuli, ranging from noxious compounds and low pH to temperature and electromagnetic waves at different ranges. In addition, this receptor is involved in multiple physiological and sensory processes. Therefore, functions of TRPV1 also have direct influences on adaptation and further evolution. The availability of various eukaryotic genomic sequences in the public domain facilitates the study of the molecular evolution of the TRPV1 protein and the respective conservation of certain domains, motifs and interacting regions that are functionally important. Using statistical and bioinformatics tools, our analysis reveals that TRPV1 evolved ∼420 million years ago (MYA). Our analysis reveals that specific regions, domains and motifs of TRPV1 have gone through different selection pressures and thus have different levels of conservation. We found that, among all of these, the TRP box is the most conserved and thus has functional significance. Our results also indicate that the tubulin binding sequences (TBS) have evolutionary significance, as these stretch sequences are more conserved than many other essential regions of TRPV1. The overall distribution of positively charged residues within the TBS motifs is conserved throughout evolution. In silico analysis reveals that TBS-1 and TBS-2 of TRPV1 can form helical structures and may play an important role in TRPV1 function. Our analysis identifies the regions of TRPV1 that are important for the structure-function relationship. This analysis indicates that tubulin binding sequence-1 (TBS-1) near the TRP box forms a potential helix and that the tubulin interactions with TRPV1 via TBS-1 have evolutionary significance. This interaction may be required for proper channel function and regulation and may also have significance in the context of Taxol®-induced neuropathy.

  2. Nonlinear analysis of biological sequences

    Energy Technology Data Exchange (ETDEWEB)

    Torney, D.C.; Bruno, W.; Detours, V. [and others]

    1998-11-01

    This is the final report of a three-year, Laboratory Directed Research and Development (LDRD) project at the Los Alamos National Laboratory (LANL). The main objectives of this project involved deriving new capabilities for analyzing biological sequences. The authors focused on tabulating the statistical properties exhibited by human coding DNA sequences and on techniques for inferring the phylogenetic relationships among protein sequences related by descent.

  3. Initial sequencing and comparative analysis of the mouse genome

    Energy Technology Data Exchange (ETDEWEB)

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  4. Analysis of the complete DNA sequence of the temperate bacteriophage TP901-1: Evolution, structure, and genome organization of lactococcal bacteriophages

    DEFF Research Database (Denmark)

    Brøndsted, Lone; Østergaard, Solvej; Pedersen, Margit

    2001-01-01

    A complete analysis of the entire genome of the temperate lactococcal bacteriophage TP901-1 has been performed and the function of 21 of 56 TP901-1-encoded ORFs has been assigned. This knowledge has been used to propose 10 functional modules each responsible for specific functions during bacteriophage TP901-1 proliferation. Short regions of microhomology in intergenic regions present in several lactococcal bacteriophages and chromosomal fragments of Lactococcus lactis are suggested to be points of exchange of genetic material through homologous recombination. Our results indicate that TP901... region of the TP901-1 genome were more homologous to proteins encoded by phages infecting gram-positive hosts other than L. lactis. This protein homology argues for the occurrence of horizontal genetic exchange among these bacteriophages and indicates that they have access to a common gene pool.

  5. Analysis of human collagen sequences.

    Science.gov (United States)

    Nassa, Manisha; Anand, Pracheta; Jain, Aditi; Chhabra, Aastha; Jaiswal, Astha; Malhotra, Umang; Rani, Vibha

    2012-01-01

    The extracellular matrix is fast emerging as an important component mediating cell-cell interactions, along with its established role as a scaffold for cell support. Collagen, being the principal component of the extracellular matrix, has been implicated in a number of pathological conditions. However, collagens are complex protein structures belonging to a large family consisting of 28 members in humans; hence, there is a lack of in-depth information about their structural features. Annotating and appreciating the functions of these proteins is possible with the help of the numerous biocomputational tools that are currently available. This study reports a comparative analysis and characterization of the alpha-1 chain of human collagen sequences. Physico-chemical, secondary structural, functional and phylogenetic classification was carried out. On this basis, collagens 12, 14 and 20, which belong to the FACIT collagen family, have been identified as potential players in diseased conditions, owing to certain atypical properties such as a very high aliphatic index, a low percentage of glycine and proline residues, and their proximity in evolutionary history. These collagen molecules might be important candidates to be investigated further for their role in skeletal disorders.

  6. Accelerated Evolution of Conserved Noncoding Sequences in the Human Genome

    Energy Technology Data Exchange (ETDEWEB)

    Prabhakar, Shyam; Noonan, James P.; Paabo, Svante; Rubin, Edward M.

    2006-07-06

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect "cryptic" functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  7. Complete chloroplast and ribosomal sequences for 30 accessions elucidate evolution of Oryza AA genome species

    Science.gov (United States)

    Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Yu, Yeisoo; Yang, Kiwoung; Choi, Beom-Soon; Koh, Hee-Jong; Waminal, Nomar Espinosa; Choi, Hong-Il; Kim, Nam-Hoon; Jang, Woojong; Park, Hyun-Seung; Lee, Jonghoon; Lee, Hyun Oh; Joh, Ho Jun; Lee, Hyeon Ju; Park, Jee Young; Perumal, Sampath; Jayakodi, Murukarthick; Lee, Yun Sun; Kim, Backki; Copetti, Dario; Kim, Soonok; Kim, Sunggil; Lim, Ki-Byung; Kim, Young-Dong; Lee, Jungho; Cho, Kwang-Su; Park, Beom-Seok; Wing, Rod A.; Yang, Tae-Jin

    2015-01-01

    Cytoplasmic chloroplast (cp) genomes and nuclear ribosomal DNA (nR) are the primary sequences used to understand plant diversity and evolution. We introduce a high-throughput method to simultaneously obtain complete cp and nR sequences using Illumina platform whole-genome sequence. We applied the method to 30 rice specimens belonging to nine Oryza species. Concurrent phylogenomic analysis using cp and nR of several specimens of the same Oryza AA genome species provides insight into the evolution and domestication of cultivated rice, clarifying three ambiguous but important issues in the evolution of wild Oryza species. First, cp-based trees clearly classify each lineage but can be biased by inter-subspecies cross-hybridization events during speciation. Second, O. glumaepatula, a South American wild rice, includes two cytoplasm types, one of which is derived from a recent interspecies hybridization with O. longistaminata. Third, the Australian O. rufipogon-type rice is a perennial form of O. meridionalis. PMID:26506948

  8. Physical localization of the 18S-5.8S-26S rDNA and sequence analysis of ITS regions in Thinopyrum ponticum (Poaceae: Triticeae): implications for concerted evolution.

    Science.gov (United States)

    Li, Dayong; Zhang, Xueyong

    2002-10-01

    Fluorescence in situ hybridization was used in Thinopyrum ponticum, a decaploid species, and its related diploid species, to investigate the distribution of the 18S-5.8S-26S rDNA. The distribution of rDNA was similar in all three diploid species (Th. bessarabicum, Th. elongatum and Pseudoroegneria stipifolia). Two pairs of loci were observed in each somatic cell at metaphase and interphase. One pair was located near the terminal end and the other in the interstitial regions of the short arms of one pair of chromosomes. However, all of the major loci in Th. ponticum were located on the terminal end of the short arms of chromosomes, and one chromosome had only one major locus. The maximum number of major loci detected on metaphase spreads was 20, which was the sum of that of its progenitors. The interstitial loci that exist in the possible diploid genome donor species were probably 'lost' during the evolutionary process of the decaploid species. A number of minor loci were also detected on whole regions of two pairs of homologous chromosomes. These results suggested that the position of rDNA loci in the Triticeae might be changeable rather than fixed. Positional changes of 18S-5.8S-26S rDNA loci between Th. ponticum and its candidate genome donors indicate that it is almost impossible to find a genome in the polyploid species that is completely identical to that of its diploid donors. The possible evolutionary significance of the distribution of the rDNA is also discussed. Internal transcribed spacer (ITS) regions of nuclear DNA in Th. ponticum were investigated by PCR amplification and sequencing. The sequence data from five positive clones selected at random, together with restriction site analysis, indicated that the ITS repeated units are nearly homogeneous in this autoallodecaploid species. Combined with in situ hybridization results, the data led to the conclusion that the ITS region has experienced interlocus as well as intralocus concerted evolution.

  9. Physical Localization of the 18S‐5·8S‐26S rDNA and Sequence Analysis of ITS Regions in Thinopyrum ponticum (Poaceae: Triticeae): Implications for Concerted Evolution

    Science.gov (United States)

    LI, DAYONG; ZHANG, XUEYONG

    2002-01-01

    Fluorescence in situ hybridization was used in Thinopyrum ponticum, a decaploid species, and its related diploid species, to investigate the distribution of the 18S‐5·8S‐26S rDNA. The distribution of rDNA was similar in all three diploid species (Th. bessarabicum, Th. elongatum and Pseudoroegneria stipifolia). Two pairs of loci were observed in each somatic cell at metaphase and interphase. One pair was located near the terminal end and the other in the interstitial regions of the short arms of one pair of chromosomes. However, all of the major loci in Th. ponticum were located on the terminal end of the short arms of chromosomes, and one chromosome had only one major locus. The maximum number of major loci detected on metaphase spreads was 20, which was the sum of that of its progenitors. The interstitial loci that exist in the possible diploid genome donor species were probably ‘lost’ during the evolutionary process of the decaploid species. A number of minor loci were also detected on whole regions of two pairs of homologous chromosomes. These results suggested that the position of rDNA loci in the Triticeae might be changeable rather than fixed. Positional changes of 18S‐5·8S‐26S rDNA loci between Th. ponticum and its candidate genome donors indicate that it is almost impossible to find a genome in the polyploid species that is completely identical to that of its diploid donors. The possible evolutionary significance of the distribution of the rDNA is also discussed. Internal transcribed spacer (ITS) regions of nuclear DNA in Th. ponticum were investigated by PCR amplification and sequencing. The sequence data from five positive clones selected at random, together with restriction site analysis, indicated that the ITS repeated units are nearly homogeneous in this autoallodecaploid species. Combined with in situ hybridization results, the data led to the conclusion that the ITS region has experienced interlocus as well as intralocus concerted evolution.

  10. On Statistical Modeling of Sequencing Noise in High Depth Data to Assess Tumor Evolution

    Science.gov (United States)

    Rabadan, Raul; Bhanot, Gyan; Marsilio, Sonia; Chiorazzi, Nicholas; Pasqualucci, Laura; Khiabanian, Hossein

    2017-12-01

    One cause of cancer mortality is tumor evolution to therapy-resistant disease. First-line therapy often targets the dominant clone, and drug resistance can emerge from preexisting clones that gain fitness through therapy-induced natural selection. Such mutations may be identified using targeted sequencing assays by analysis of noise in high-depth data. Here, we develop a comprehensive, unbiased model for sequencing error background. We find that noise in sufficiently deep DNA sequencing data can be approximated by aggregating negative binomial distributions. Mutations with frequencies above noise may have prognostic value. We evaluate our model with simulated exponentially expanded populations as well as data from cell line and patient sample dilution experiments, demonstrating its utility in prognosticating tumor progression. Our results may have the potential to identify significant mutations that can cause recurrence. These results are relevant in the pretreatment clinical setting for determining appropriate therapy and preparing for potential recurrence.
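
    The idea of calling a mutation only when its read count rises above a negative binomial noise background can be illustrated with a small sketch. This is not the authors' model, which aggregates several negative binomial distributions; it assumes a single negative binomial with a hypothetical mean and dispersion, and the function names are made up for illustration.

```python
import math

def nb_pmf(k, r, p):
    """P(K = k) for a negative binomial with size r > 0 and success probability p."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)

def nb_tail(k_obs, mean, dispersion):
    """P(K >= k_obs) when error-read counts follow NB(mean, dispersion).

    The mean/dispersion parametrisation is an assumption of this sketch:
    size r = dispersion and p = r / (r + mean), so that E[K] = mean.
    """
    r = dispersion
    p = r / (r + mean)
    below = sum(nb_pmf(k, r, p) for k in range(k_obs))
    return max(0.0, 1.0 - below)

# Hypothetical site: 10 error reads expected, dispersion 5, 40 variant reads seen.
p_value = nb_tail(40, mean=10.0, dispersion=5.0)
print(f"P(noise alone gives >= 40 variant reads) = {p_value:.3g}")
```

    A small tail probability suggests the observed variant reads are unlikely to be sequencing noise alone; in practice the mean and dispersion would have to be estimated from control data.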

  11. Whole genome sequence analysis of Mycobacterium suricattae.

    Science.gov (United States)

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; van der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah Musa; Siame, Kabengele Keith; Gey van Pittius, Nicolaas Claudius; van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-12-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Sequence tolerance of the phage lambda PRM promoter: implications for evolution of gene regulatory circuitry.

    Science.gov (United States)

    Michalowski, Christine B; Short, Megan D; Little, John W

    2004-12-01

    Much of the gene regulatory circuitry of phage lambda centers on a complex region called the O(R) region. This approximately 100-bp region is densely packed with regulatory sites, including two promoters and three repressor-binding sites. The dense packing of this region is likely to impose severe constraints on its ability to change during evolution, raising the question of how the specific arrangement of sites and their exact sequences could evolve to their present form. Here we ask whether the sequence of a cis-acting site can be widely varied while retaining its function; if it can, evolution could proceed by a larger number of paths. To help address this question, we developed a lambda cloning vector that allowed us to clone fragments spanning the O(R) region. By using this vector, we carried out intensive mutagenesis of the P(RM) promoter, which drives expression of CI repressor and is activated by CI itself. We made a pool of fragments in which 8 of the 12 positions in the -35 and -10 regions were randomized and cloned this pool into the vector, making a pool of P(RM) variant phage. About 10% of the P(RM) variants were able to lysogenize, suggesting that the lambda regulatory circuitry is compatible with a wide range of P(RM) sequences. Analysis of several of these phages indicated a range of behaviors in prophage induction. Several isolates had induction properties similar to those of the wild type, and their promoters resembled the wild type in their responses to CI. We term this property of different sequences allowing roughly equivalent function "sequence tolerance" and discuss its role in the evolution of gene regulatory circuitry.

  13. Sequencing and comparative analysis of the gorilla MHC genomic sequence

    Science.gov (United States)

    Wilming, Laurens G.; Hart, Elizabeth A.; Coggill, Penny C.; Horton, Roger; Gilbert, James G. R.; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L.

    2013-01-01

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC. PMID:23589541

  14. The Intervening Sequence of Coxiella burnetii: Characterization and Evolution.

    Science.gov (United States)

    Warrier, Indu; Walter, Mathias C; Frangoulidis, Dimitrios; Raghavan, Rahul; Hicks, Linda D; Minnick, Michael F

    2016-01-01

    The intervening sequence (IVS) of Coxiella burnetii, the agent of Q fever, is a 428-nt selfish genetic element located in helix 45 of the precursor 23S rRNA. The IVS element, in turn, contains an ORF that encodes a hypothetical ribosomal S23 protein (S23p). Although S23p can be synthesized in vitro in the presence of an engineered E. coli promoter and ribosome binding site, results suggest that the protein is not synthesized in vivo. In spite of a high degree of IVS conservation among different strains of C. burnetii, the region immediately upstream of the S23p start codon is prone to change, and the S23p-encoding ORF is evidently undergoing reductive evolution. We determined that IVS excision from 23S rRNA was mediated by RNase III and that IVS RNA was rapidly degraded thereafter. Levels of the resulting 23S rRNA fragments that flank the IVS, F1 (~1.2 kb) and F2 (~1.7 kb), were quantified over C. burnetii's logarithmic growth phase (1-5 d). Results showed that 23S F1 quantities were consistently higher than those of F2 and 16S rRNA. The disparity between levels of the two 23S rRNA fragments following excision of IVS is an interesting phenomenon of unknown significance. Based upon phylogenetic analyses, IVS was acquired through horizontal transfer after C. burnetii's divergence from an ancestral bacterium and has been subsequently maintained by vertical transfer. The widespread occurrence, maintenance and conservation of the IVS in C. burnetii imply that it plays an adaptive role or has a neutral effect on fitness.

  15. De novo transcriptome assembly of Zanthoxylum bungeanum using Illumina sequencing for evolutionary analysis and simple sequence repeat marker development

    OpenAIRE

    Feng, Shijing; Zhao, Lili; Liu, Zhenshan; Liu, Yulin; Yang, Tuxi; Wei, Anzhi

    2017-01-01

    Zanthoxylum, an ancient economic crop in Asia, has a satisfying aromatic taste and immense medicinal values. A lack of genomic information and genetic markers has limited the evolutionary analysis and genetic improvement of Zanthoxylum species and their close relatives. To better understand the evolution, domestication, and divergence of Zanthoxylum, we present a de novo transcriptome analysis of an elite cultivar of Z. bungeanum using Illumina sequencing; we then developed simple sequence re...

  16. Analysis and Annotation of Nucleic Acid Sequence

    Energy Technology Data Exchange (ETDEWEB)

    David J. States

    1998-08-01

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  17. Universal sequence replication, reversible polymerization and early functional biopolymers: a model for the initiation of prebiotic sequence evolution.

    Directory of Open Access Journals (Sweden)

    Sara Imari Walker

    Many models for the origin of life have focused on understanding how evolution can drive the refinement of a preexisting enzyme, such as the evolution of efficient replicase activity. Here we present a model for what was, arguably, an even earlier stage of chemical evolution, when polymer sequence diversity was generated and sustained before, and during, the onset of functional selection. The model includes regular environmental cycles (e.g. hydration-dehydration cycles) that drive polymers between times of replication and functional activity, which coincide with times of different monomer and polymer diffusivity. Template-directed replication of informational polymers, which takes place during the dehydration stage of each cycle, is considered to be sequence-independent. New sequences are generated by spontaneous polymer formation, and all sequences compete for a finite monomer resource that is recycled via reversible polymerization. Kinetic Monte Carlo simulations demonstrate that this proposed prebiotic scenario provides a robust mechanism for the exploration of sequence space. Introduction of a polymer sequence with monomer synthetase activity illustrates that functional sequences can become established in a preexisting pool of otherwise non-functional sequences. Functional selection does not dominate system dynamics and sequence diversity remains high, permitting the emergence and spread of more than one functional sequence. It is also observed that polymers spontaneously form clusters in simulations where polymers diffuse more slowly than monomers, a feature that is reminiscent of a previous proposal that the earliest stages of life could have been defined by the collective evolution of a system-wide cooperation of polymer aggregates. Overall, the results presented demonstrate the merits of considering plausible prebiotic polymer chemistries and environments that would have allowed for the rapid turnover of monomer resources and for
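
    As a rough illustration of what a kinetic Monte Carlo simulation of reversible polymerization with a finite, recycled monomer pool looks like, the sketch below uses the standard Gillespie algorithm. It is a heavily simplified assumption, not the authors' model: it tracks only polymer lengths (no sequences, replication, diffusion or environmental cycles), and the rate constants `k_nuc`, `k_on` and `k_off` are hypothetical.

```python
import math
import random

def gillespie_polymerization(n_monomers, k_nuc, k_on, k_off, t_end, seed=0):
    """Gillespie simulation of nucleation, elongation and depolymerization."""
    rng = random.Random(seed)
    monomers = n_monomers
    polymers = []  # lengths (>= 2) of the polymers currently present
    t = 0.0
    while True:
        # Propensities of the three reaction channels.
        rate_nuc = k_nuc * monomers * (monomers - 1) / 2.0   # M + M -> dimer
        rate_grow = k_on * monomers * len(polymers)          # P_n + M -> P_(n+1)
        rate_shrink = k_off * len(polymers)                  # P_n -> P_(n-1) + M
        total = rate_nuc + rate_grow + rate_shrink
        if total == 0.0:
            break
        # Exponentially distributed waiting time until the next event.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        u = rng.random() * total
        if u < rate_nuc:
            monomers -= 2
            polymers.append(2)
        elif u < rate_nuc + rate_grow:
            i = rng.randrange(len(polymers))
            polymers[i] += 1
            monomers -= 1
        else:
            i = rng.randrange(len(polymers))
            polymers[i] -= 1
            monomers += 1
            if polymers[i] == 1:
                # A dimer fell apart: its last unit is a free monomer again.
                polymers[i] = polymers[-1]
                polymers.pop()
                monomers += 1
    return monomers, polymers

free, chains = gillespie_polymerization(200, 1e-4, 1e-2, 0.5, t_end=50.0)
print(free, len(chains), max(chains, default=0))
```

    Because every depolymerization step returns a monomer to the shared pool, total monomer mass is conserved, which is the sense in which the finite resource is recycled in this toy version.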

  18. Seventeen new complete mtDNA sequences reveal extensive mitochondrial genome evolution within the Demospongiae.

    Directory of Open Access Journals (Sweden)

    Xiujuan Wang

    Two major transitions in animal evolution--the origins of multicellularity and bilaterality--correlate with major changes in mitochondrial DNA (mtDNA) organization. Demosponges, the largest class in the phylum Porifera, underwent only the first of these transitions and their mitochondrial genomes display a peculiar combination of ancestral and animal-specific features. To get an insight into the evolution of mitochondrial genomes within the Demospongiae, we determined 17 new mtDNA sequences from this group and analyzed them with five previously published sequences. Our analysis revealed that all demosponge mtDNAs are 16- to 25-kbp circular molecules, containing 13-15 protein genes, 2 rRNA genes, and 2-27 tRNA genes. All but four pairs of sampled genomes had unique gene orders, with the number of shared gene boundaries ranging from 1 to 41. Although most demosponge species displayed low rates of mitochondrial sequence evolution, a significant acceleration in evolutionary rates occurred in the G1 group (orders Dendroceratida, Dictyoceratida, and Verticillitida). Large variation in mtDNA organization was also observed within the G0 group (order Homosclerophorida), including gene rearrangements, loss of tRNA genes, and the presence of two introns in Plakortis angulospiculatus. While introns are rare in modern-day demosponge mtDNA, we inferred that at least one intron was present in cox1 of the common ancestor of all demosponges. Our study uncovered an extensive mitochondrial genomic diversity within the Demospongiae. Although all sampled mitochondrial genomes retained some ancestral features, including a minimally modified genetic code, conserved structures of tRNA genes, and presence of multiple non-coding regions, they vary considerably in their size, gene content, gene order, and the rates of sequence evolution. Some of the changes in demosponge mtDNA, such as the loss of tRNA genes and the appearance of hairpin-containing repetitive elements

  19. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2002-01-01

    This thesis describes a collection of bioinformatic analyses of complete genome sequence data. We studied the evolution of gene content and found that vertical inheritance dominates over horizontal gene transfer, to the extent that gene content can be used to construct genome phylogenies.

  20. Getter bed for tritium handling: temperature evolution during loading sequences

    Energy Technology Data Exchange (ETDEWEB)

    Ghezzi, F. [Istituto di Fisica del Plasma, Associazione Euratome/ENEA/CNR, Milano (Italy)

    1998-07-01

    The time evolution of the alloy temperature in getter beds during hydrogen loading was studied. To describe the temperature evolution, a model with two time constants was developed; the model was found to fit the experimental data well. This paper presents a method for investigating the thermal capacity of the bed involved in heat transfer during loading. A practical application is given as an example. (author)

  1. The evolution and utility of ribosomal ITS sequences in Bambusinae ...

    Indian Academy of Sciences (India)

    The molecular systematics of Bambusinae and related species were recently assessed by different teams using independently generated ITS sequences, and the results disagreed in some remarkable features. Here we compared the ITS sequences of the members of Bambusa s. l., the genera Dendrocalamus, Dinochloa, ...

  2. Optimizing cancer genome sequencing and analysis

    Science.gov (United States)

    Griffith, Malachi; Miller, Christopher A.; Griffith, Obi L.; Krysiak, Kilannin; Skidmore, Zachary L.; Ramu, Avinash; Walker, Jason R.; Dang, Ha X.; Trani, Lee; Larson, David E.; Demeter, Ryan T.; Wendl, Michael C.; McMichael, Joshua F.; Austin, Rachel E.; Magrini, Vincent; McGrath, Sean D.; Ly, Amy; Kulkarni, Shashikant; Cordes, Matthew G.; Fronick, Catrina C.; Fulton, Robert S.; Maher, Christopher A.; Ding, Li; Klco, Jeffery M.; Mardis, Elaine R.; Ley, Timothy J.; Wilson, Richard K.

    2015-01-01

    Tumors are typically sequenced to depths of 75–100× (exome) or 30–50× (whole genome). We demonstrate that current sequencing paradigms are inadequate for tumors that are impure, aneuploid or clonally heterogeneous. To reassess optimal sequencing strategies, we performed ultra-deep (up to ~312×) whole genome sequencing (WGS) and exome capture (up to ~433×) of a primary acute myeloid leukemia, its subsequent relapse, and a matched normal skin sample. We tested multiple alignment and variant calling algorithms and validated ~200,000 putative SNVs by sequencing them to depths of ~1,000×. Additional targeted sequencing provided over 10,000× coverage and ddPCR assays provided up to ~250,000× sampling of selected sites. We evaluated the effects of different library generation approaches, depth of sequencing, and analysis strategies on the ability to effectively characterize a complex tumor. This dataset, representing the most comprehensively sequenced tumor described to date, will serve as an invaluable community resource (dbGaP accession id phs000159). PMID:26645048
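
    Why depth matters for impure or clonally heterogeneous tumors can be seen from a simple binomial sampling argument. The sketch below is not taken from the study; it assumes error-free reads drawn independently, a hypothetical variant allele fraction of 2%, and a hypothetical calling rule that requires at least three variant-supporting reads.

```python
import math

def prob_detect(depth, vaf, min_alt_reads):
    """P(at least min_alt_reads variant reads) if reads ~ Binomial(depth, vaf)."""
    p_fewer = sum(
        math.comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer

# Compare a typical whole-genome depth with deeper sequencing for a subclonal variant.
for depth in (30, 100, 300, 1000):
    print(depth, round(prob_detect(depth, vaf=0.02, min_alt_reads=3), 3))
```

    Under these assumptions a low-frequency variant is rarely sampled often enough at 30× to be called, while several hundred-fold coverage makes detection likely; real pipelines must additionally account for sequencing error and mapping artifacts.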

  3. Identifying selection in the within-host evolution of influenza using viral sequence data.

    Directory of Open Access Journals (Sweden)

    Christopher J R Illingworth

    2014-07-01

    The within-host evolution of influenza is a vital component of its epidemiology. A question of particular interest is the role that selection plays in shaping the viral population over the course of a single infection. We here describe a method to measure selection acting upon the influenza virus within an individual host, based upon time-resolved genome sequence data from an infection. Analysing sequence data from a transmission study conducted in pigs, describing part of the haemagglutinin gene (HA1) of an influenza virus, we find signatures of non-neutrality in six of a total of sixteen infections. We find evidence for both positive and negative selection acting upon specific alleles, while in three cases, the data suggest the presence of time-dependent selection. In one infection we observe what is potentially a specific immune response against the virus; a non-synonymous mutation in an epitope region of the virus is found to be under initially positive, then strongly negative selection. Crucially, given the lack of homologous recombination in influenza, our method accounts for linkage disequilibrium between nucleotides at different positions in the haemagglutinin gene, allowing for the analysis of populations in which multiple mutations are present at any given time. Our approach offers a new insight into the dynamics of influenza infection, providing a detailed characterisation of the forces that underlie viral evolution.

  4. Directed Evolution of DNA Polymerases for Next Generation Sequencing

    Science.gov (United States)

    Leconte, Aaron M.; Patel, Maha P.; Sass, Lauryn E.; McInerney, Peter; Jarosz, Mirna; Kung, Li; Bowers, Jayson L.; Buzby, Philip R.; Efcavitch, J. William; Romesberg, Floyd E.

    2011-01-01

    We present the application of an activity-based phage display method to identify DNA polymerases tailored for next generation sequencing applications. Using this approach, we identify a mutant of Taq DNA polymerase that incorporates the fluorophore-labeled dA, dT, dC, and dG substrates ~50 to 400-fold more efficiently into scarred primers in solution and that also demonstrates significantly improved performance under actual sequencing conditions. PMID:20629059

  5. Sequencing and Analysis of Neanderthal Genomic DNA

    OpenAIRE

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo, Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2006-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library a...

  6. Reconstructing ancestral genomic sequences by co-evolution: formal definitions, computational issues, and biological examples.

    Science.gov (United States)

    Tuller, Tamir; Birin, Hadas; Kupiec, Martin; Ruppin, Eytan

    2010-09-01

    The inference of ancestral genomes is a fundamental problem in molecular evolution. Due to the statistical nature of this problem, the most likely or the most parsimonious ancestral genomes usually include considerable error rates. In general, these errors cannot be abolished by utilizing more exhaustive computational approaches, by using longer genomic sequences, or by analyzing more taxa. In recent studies, we showed that co-evolution is an important force that can be used for significantly improving the inference of ancestral genome content. In this work we formally define a computational problem for the inference of ancestral genome content by co-evolution. We show that this problem is NP-hard and hard to approximate and present both a Fixed Parameter Tractable (FPT) algorithm, and heuristic approximation algorithms for solving it. The running time of these algorithms on simulated inputs with hundreds of protein families and hundreds of co-evolutionary relations was fast (up to four minutes) and it achieved an approximation ratio of biological analysis revealed various pieces of evidence that support the biological plausibility of the new solutions. In addition, we showed that our approach reconstructs missing values at the leaves of the Fungi evolutionary tree better than ML or MP.

  7. Mathematical Analysis of Evolution, Information, and Complexity

    CERN Document Server

    Arendt, Wolfgang

    2009-01-01

    Mathematical Analysis of Evolution, Information, and Complexity deals with the analysis of evolution, information and complexity. The time evolution of systems or processes is a central question in science; this text covers a broad range of problems including diffusion processes, neuronal networks, quantum theory and cosmology. Bringing together a wide collection of research in mathematics, information theory, physics and other scientific and technical areas, this new title offers elementary and thus easily accessible introductions to the various fields of research addressed in the book.

  8. Genome sequencing of the extinct Eurasian wild aurochs illuminates the phylogeography and evolution of cattle

    Science.gov (United States)

    Interrogation of modern and ancient bovine genome sequences provides a valuable model to study the evolution of cattle. Here, we analyse the first complete wild aurochs (Bos primigenius) genome sequence using DNA extracted from a ~ 6,750 year-old humerus bone retrieved from a cave site in Derbyshire...

  9. Ultra-deep sequencing reveals the subclonal structure and genomic evolution of oral squamous cell carcinoma

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    complex subclonal architectures comprising distinct subclones only found in geographically distinct regions of the tumors. The metastatic potential of the tumor is acquired early in the tumor evolution, as indicated by the lymph node sharing the majority of the mutations with the tumor biopsies, while ... rarely acquiring novel mutations that are specific for the metastasis. Conclusion: Ultra-deep sequencing of multiple biopsies from OSCC and metastasis enables detection of subclonal structure and genomic evolution. The metastatic potential of OSCC is acquired early in the tumor evolution, and our results ... structure remains unexplored due to lack of sampling multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with targeted ultra-deep sequencing on biopsies from 5 stage IV...

  10. [Genomic structure of the autotetraploid oat species Avena macrostachya inferred from comparative analysis of the ITS1 and ITS2 sequences: on the oat karyotype evolution during the early stages of the Avena species divergence].

    Science.gov (United States)

    Rodionov, A V; Tiupa, N B; Kim, E S; Machs, E M; Loskutov, I G

    2005-05-01

    To examine the genomic structure of Avena macrostachya, internal transcribed spacers, ITS1 and ITS2, as well as nuclear 5.8S rRNA genes from three oat species with AsAs karyotype (A. wiestii, A. hirtula, and A. atlantica), and those from A. longiglumis (AlAl), A. canariensis (AcAc), A. ventricosa (CvCv), A. pilosa, and A. clauda (CpCp) were sequenced. All species of the genus Avena examined represented a monophyletic group (bootstrap index = 98), within which two branches, i.e., species with A- and C-genomes, were distinguished (bootstrap indices = 100). The subject of our study, A. macrostachya, albeit belonging to the phylogenetic branch of C-genome oat species (karyotype with submetacentric and subacrocentric chromosomes), has preserved an isobrachyal karyotype (i.e., one containing metacentric chromosomes), probably typical of the common Avena ancestor. It was suggested to classify the A. macrostachya genome as a specific form of C-genome, Cm-genome. Among the species from other genera studied, Arrhenatherum elatius was found to be the closest to Avena in ITS1 and ITS2 structure. Phylogenetic relationships between Avena and Helictotrichon remain intriguingly uncertain. The HPR389153 sequence from the H. pratense genome was closest to the ITS1 sequences specific to the Avena A-genomes (p-distance = 0.0237), while the differences of this sequence from the ITS1 of A. macrostachya reached 0.1221. On the other hand, HAD389117 from H. adsurgens was close to the ITS1 specific to Avena C-genomes (p-distance = 0.0189), while its differences from the A-genome specific ITS1 sequences reached 0.1221. It seems likely that the appearance of highly polyploid (2n = 12-21x) species of H. pratense and H. adsurgens could be associated with interspecific hybridization involving Mediterranean oat species carrying A- and C-genomes. A hypothesis on the pathways of Avena chromosome evolution during the early stages of the oat species divergence is proposed.
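
    The p-distances quoted in this abstract are the proportion of aligned sites at which two sequences differ. As a minimal sketch of that calculation (not the authors' pipeline), the function below compares two pre-aligned sequences. The name `p_distance`, the toy sequences, and the choice to skip sites with gaps or ambiguous bases (pairwise deletion) are illustrative assumptions.

```python
def p_distance(seq_a: str, seq_b: str, skip_chars: str = "-?N") -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites where either sequence has a gap or an ambiguous
    base (any character in skip_chars) are ignored (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0   # sites usable in both sequences
    differing = 0  # usable sites where the bases differ
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical toy alignment: 1 mismatch over 10 comparable sites.
print(p_distance("ACGTACGTAC", "ACGTTCGTAC"))
```

    How gaps are treated changes the denominator, so published values can only be reproduced if the same convention is used.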

  11. Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences.

    Science.gov (United States)

    Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

    2016-07-12

    Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  12. Biased Gene Conversion and GC-Content Evolution in the Coding Sequences of Reptiles and Vertebrates

    Science.gov (United States)

    Figuet, Emeric; Ballenghien, Marion; Romiguier, Jonathan; Galtier, Nicolas

    2015-01-01

    Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamics of coding sequence GC-content are largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins. PMID:25527834
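
    GC3, the quantity analyzed in this abstract, is the fraction of G or C bases at third-codon positions of a coding sequence. The sketch below shows one straightforward way to compute it for a single in-frame sequence. The function name `gc3`, the toy sequence, and the decision to ignore ambiguous bases are assumptions made for illustration, not details taken from the study.

```python
def gc3(cds: str) -> float:
    """GC content at third-codon positions of an in-frame coding sequence.

    Assumptions: the sequence starts at codon position 1, its length is a
    multiple of three, and non-ACGT characters are ignored.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")

    # Every third base, starting from index 2, is a third-codon position.
    third_positions = [b for b in cds[2::3] if b in "ACGT"]
    if not third_positions:
        raise ValueError("no unambiguous third-codon positions")

    gc_count = sum(1 for b in third_positions if b in "GC")
    return gc_count / len(third_positions)


# Hypothetical toy CDS: codons ATG GCC AAA TAA, third positions G, C, A, A.
print(gc3("ATGGCCAAATAA"))
```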

  13. Exploring the correlations between sequence evolution rate and ...

    Indian Academy of Sciences (India)

    ... insertions/deletions in functional regions. These can rapidly arise and sweep to fixation faster than predicted from a lineage's sequence neutral substitution rate, enabling species to leapfrog between phenotypic `islands'. We suggest research directions that could illuminate mechanisms behind the functional diversity we ...

  14. Focused Evolution of HIV-1 Neutralizing Antibodies Revealed by Structures and Deep Sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Xueling; Zhou, Tongqing; Zhu, Jiang; Zhang, Baoshan; Georgiev, Ivelin; Wang, Charlene; Chen, Xuejun; Longo, Nancy S.; Louder, Mark; McKee, Krisha; O’Dell, Sijy; Perfetto, Stephen; Schmidt, Stephen D.; Shi, Wei; Wu, Lan; Yang, Yongping; Yang, Zhi-Yong; Yang, Zhongjia; Zhang, Zhenhai; Bonsignori, Mattia; Crump, John A.; Kapiga, Saidi H.; Sam, Noel E.; Haynes, Barton F.; Simek, Melissa; Burton, Dennis R.; Koff, Wayne C.; Doria-Rose, Nicole A.; Connors, Mark; Mullikin, James C.; Nabel, Gary J.; Roederer, Mario; Shapiro, Lawrence; Kwong, Peter D.; Mascola, John R. (Tumaini); (NIH); (Duke); (Kilimanjaro Repro.); (IAVI)

    2013-03-04

    Antibody VRC01 is a human immunoglobulin that neutralizes about 90% of HIV-1 isolates. To understand how such broadly neutralizing antibodies develop, we used x-ray crystallography and 454 pyrosequencing to characterize additional VRC01-like antibodies from HIV-1-infected individuals. Crystal structures revealed a convergent mode of binding for diverse antibodies to the same CD4-binding-site epitope. A functional genomics analysis of expressed heavy and light chains revealed common pathways of antibody-heavy chain maturation, confined to the IGHV1-2*02 lineage, involving dozens of somatic changes, and capable of pairing with different light chains. Broadly neutralizing HIV-1 immunity associated with VRC01-like antibodies thus involves the evolution of antibodies to a highly affinity-matured state required to recognize an invariant viral structure, with lineages defined from thousands of sequences providing a genetic roadmap of their development.

  15. Gene-Based Sequence Diversity Analysis of Field Pea (Pisum)

    Science.gov (United States)

    Jing, Runchun; Johnson, Richard; Seres, Andrea; Kiss, Gyorgy; Ambrose, Mike J.; Knox, Maggie R.; Ellis, T. H. Noel; Flavell, Andrew J.

    2007-01-01

    Sequence diversity of 39 dispersed gene loci was analyzed in 48 diverse individuals representative of the genus Pisum. The different genes show large variation in diversity parameters, suggesting widely differing levels of selection and a high overall diversity level for the species. The data set yields a genetic diversity tree whose deep branches, involving wild samples, are preserved in a tree derived from polymorphic retrotransposon insertions in an identical sample set. Thus, gene regions and intergenic “junk DNA” share a consistent picture for the genomic diversity of Pisum, despite low linkage disequilibrium in wild and landrace germplasm, which might be expected to allow independent evolution of these very different DNA classes. Additional lines of evidence indicate that recombination has shuffled gene haplotypes efficiently within Pisum, despite its high level of inbreeding and widespread geographic distribution. Trees derived from individual gene loci show marked differences from each other, and genetic distance values between sample pairs show high standard deviations. Sequence mosaic analysis of aligned sequences identifies nine loci showing evidence for intragenic recombination. Lastly, phylogenetic network analysis confirms the non-treelike structure of Pisum diversity and indicates the major germplasm classes involved. Overall, these data emphasize the artificiality of simple tree structures for representing genomic sequence variation within Pisum and emphasize the need for fine structure haplotype analysis to accurately define the genetic structure of the species. PMID:18073431

  16. The evolution processes of DNA sequences, languages and carols

    Science.gov (United States)

    Hauck, Jürgen; Henkel, Dorothea; Mika, Klaus

    2001-04-01

    The sequences of bases A, T, C and G of about 100 enolase, secA and cytochrome DNA were analyzed for attractive or repulsive interactions by the numbers $T_1$, $T_2$, $T_3$; $r$ of nearest, next-nearest and third neighbor bases of the same kind and the concentration $r$ = other bases/analyzed base. The area of possible $T_1$, $T_2$ values is limited by the linear borders $T_2 = 2T_1 - 2$, $T_2 = 0$ or $T_1 = 0$ for clustering, attractive or repulsive interactions and the border $T_2 = -2T_1 + 2(2 - r)$ for a variation from repulsive to attractive interactions at $r \le 2$. Clustering is preferred by most bases in sequences of enolases and secAs. Major deviations with repulsive interactions of some bases are observed for archaea bacteria in secA and for highly developed animals and the human species in enolase sequences. The borders of the structure map for enthalpy stabilized structures with maximum interactions are approached in few cases. Most letters of the natural languages and some music notes are at the borders of the structure map.

  17. Information theory applications for biological sequence analysis.

    Science.gov (United States)

    Vinga, Susana

    2014-05-01

    Information theory (IT) addresses the analysis of communication systems and has been widely applied in molecular biology. In particular, alignment-free sequence analysis and comparison greatly benefited from concepts derived from IT, such as entropy and mutual information. This review covers several aspects of IT applications, ranging from genome global analysis and comparison, including block-entropy estimation and resolution-free metrics based on iterative maps, to local analysis, comprising the classification of motifs, prediction of transcription factor binding sites and sequence characterization based on linguistic complexity and entropic profiles. IT has also been applied to high-level correlations that combine DNA, RNA or protein features with sequence-independent properties, such as gene mapping and phenotype analysis, and has also provided models based on communication systems theory to describe information transmission channels at the cell level and also during evolutionary processes. While not exhaustive, this review attempts to categorize existing methods and to indicate their relation with broader transversal topics such as genomic signatures, data compression and complexity, time series analysis and phylogenetic classification, providing a resource for future developments in this promising area.
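
    To make the entropy concept mentioned in this review concrete, the sketch below computes a naive plug-in estimate of Shannon entropy over overlapping blocks (k-mers) of a sequence. It is only an illustration under stated assumptions: the function name `block_entropy` and the toy inputs are hypothetical, and the estimators covered by the review may correct for finite-sample bias in ways this sketch does not.

```python
from collections import Counter
from math import log2


def block_entropy(seq: str, k: int = 1) -> float:
    """Plug-in Shannon entropy (in bits) of overlapping k-mers in seq.

    Assumption: observed k-mer frequencies are used directly as
    probabilities, with no bias correction.
    """
    seq = seq.upper()
    if k < 1 or len(seq) < k:
        raise ValueError("k must be >= 1 and no longer than the sequence")

    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())

    # H = -sum(p * log2(p)) over observed k-mers.
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Hypothetical toy inputs: a uniform base composition versus a homopolymer.
print(block_entropy("ACGT" * 4, k=1))
print(block_entropy("AAAAAAAA", k=1))
```

    For k = 1 the value is bounded by 2 bits for a four-letter DNA alphabet; low values flag low-complexity regions.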

  18. Selection pressure from neutralizing antibodies drives sequence evolution during acute infection with hepatitis C virus.

    Science.gov (United States)

    Dowd, Kimberly A; Netski, Dale M; Wang, Xiao-Hong; Cox, Andrea L; Ray, Stuart C

    2009-06-01

    Despite recent characterization of hepatitis C virus-specific neutralizing antibodies, it is not clear to what extent immune pressure from neutralizing antibodies drives viral sequence evolution in vivo. This lack of understanding is particularly evident in acute infection, the phase when elimination or persistence of viral replication is determined and during which the importance of the humoral immune response has been largely discounted. We analyzed envelope glycoprotein sequence evolution and neutralization of sequential autologous hepatitis C virus pseudoparticles in 8 individuals throughout acute infection. Amino acid substitutions occurred throughout the envelope genes, primarily within the hypervariable region 1 of E2. When individualized pseudoparticles expressing sequential envelope sequences were used to measure neutralization by autologous sera, antibodies neutralizing earlier sequence variants were detected at earlier time points than antibodies neutralizing later variants, indicating clearance and evolution of viral variants in response to pressure from neutralizing antibodies. To demonstrate the effects of amino acid substitution on neutralization, site-directed mutagenesis of a pseudoparticle envelope sequence revealed amino acid substitutions in hypervariable region 1 that were responsible for a dramatic decrease in neutralization sensitivity over time. In addition, high-titer neutralizing antibodies peaked at the time of viral clearance in all spontaneous resolvers, whereas chronically evolving subjects displayed low-titer or absent neutralizing antibodies throughout early acute infection. These findings indicate that, during acute hepatitis C virus infection in vivo, virus-specific neutralizing antibodies drive sequence evolution and, in some individuals, play a role in determining the outcome of infection.

  19. PhyloSim - Monte Carlo simulation of sequence evolution in the R statistical computing environment

    Directory of Open Access Journals (Sweden)

    Massingham Tim

    2011-04-01

    Background: The Monte Carlo simulation of sequence evolution is routinely used to assess the performance of phylogenetic inference methods and sequence alignment algorithms. Progress in the field of molecular evolution fuels the need for more realistic and hence more complex simulations, adapted to particular situations, yet current software makes unreasonable assumptions such as homogeneous substitution dynamics or a uniform distribution of indels across the simulated sequences. This calls for an extensible simulation framework written in a high-level functional language, offering new functionality and making it easy to incorporate further complexity. Results: PhyloSim is an extensible framework for the Monte Carlo simulation of sequence evolution, written in R, using the Gillespie algorithm to integrate the actions of many concurrent processes such as substitutions, insertions and deletions. Uniquely among sequence simulation tools, PhyloSim can simulate arbitrarily complex patterns of rate variation and multiple indel processes, and allows for the incorporation of selective constraints on indel events. User-defined complex patterns of mutation and selection can be easily integrated into simulations, allowing PhyloSim to be adapted to specific needs. Conclusions: Close integration with R and the wide range of features implemented offer unmatched flexibility, making it possible to simulate sequence evolution under a wide range of realistic settings. We believe that PhyloSim will be useful to future studies involving simulated alignments.
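
    The Gillespie algorithm named in this abstract draws the waiting time to the next event from an exponential distribution whose rate is the sum of all event rates, then picks which event occurs in proportion to its rate. The sketch below illustrates that idea in Python for substitutions only, along a single branch, with site-specific rates. It is not PhyloSim's R API: the function name `gillespie_substitutions`, the Jukes-Cantor-style uniform choice of replacement base, and the omission of insertions and deletions are all simplifying assumptions.

```python
import random


def gillespie_substitutions(seq, site_rates, branch_length, rng=None):
    """Simulate substitutions along one branch with the Gillespie algorithm.

    Assumptions: each site i mutates at rate site_rates[i] regardless of
    its current base, and a substitution picks one of the three other
    bases uniformly. Indels are not modeled in this sketch.
    Returns the evolved sequence and the number of substitution events.
    """
    if len(seq) != len(site_rates):
        raise ValueError("one rate per site is required")
    rng = rng or random.Random()
    bases = list(seq.upper())
    total_rate = sum(site_rates)
    if total_rate <= 0:
        return "".join(bases), 0

    elapsed, events = 0.0, 0
    while True:
        # Waiting time to the next event anywhere in the sequence.
        elapsed += rng.expovariate(total_rate)
        if elapsed > branch_length:
            break
        # Choose the site in proportion to its rate, then mutate it.
        site = rng.choices(range(len(bases)), weights=site_rates, k=1)[0]
        bases[site] = rng.choice([b for b in "ACGT" if b != bases[site]])
        events += 1
    return "".join(bases), events


# Hypothetical usage: the first two sites are invariant (rate 0).
evolved, n_events = gillespie_substitutions(
    "AAAAAAAA", [0, 0, 1, 1, 1, 1, 2, 2], 1.5, random.Random(42)
)
print(evolved, n_events)
```

    Because the total rate is constant here, the loop is a simple Poisson process; in a framework with state-dependent rates or indels the total rate would have to be recomputed after each event.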

  20. Digital image sequence processing, compression, and analysis

    CERN Document Server

    Reed, Todd R

    2004-01-01

    Introduction (Todd R. Reed); CONTENT-BASED IMAGE SEQUENCE REPRESENTATION (Pedro M. Q. Aguiar, Radu S. Jasinschi, José M. F. Moura, and Charnchai Pluempitiwiriyawej); THE COMPUTATION OF MOTION (Christoph Stiller, Sören Kammel, Jan Horn, and Thao Dang); MOTION ANALYSIS AND DISPLACEMENT ESTIMATION IN THE FREQUENCY DOMAIN (Luca Lucchese and Guido Maria Cortelazzo); QUALITY OF SERVICE ASSESSMENT IN NEW GENERATION WIRELESS VIDEO COMMUNICATIONS (Gaetano Giunta); ERROR CONCEALMENT IN DIGITAL VIDEO (Francesco G.B. De Natale); IMAGE SEQUENCE RESTORATION: A WIDER PERSPECTIVE (Anil Kokaram); VIDEO SUMMARIZATION (Cuneyt M. Taskiran and Edward ...

  1. Mulan: Multiple-Sequence Local Alignment and Visualization for Studying Function and Evolution

    Energy Technology Data Exchange (ETDEWEB)

    Ovcharenko, I; Loots, G; Giardine, B; Hou, M; Ma, J; Hardison, R; Stubbs, L; Miller, W

    2004-07-14

    Multiple sequence alignment analysis is a powerful approach for understanding phylogenetic relationships, annotating genes and detecting functional regulatory elements. With a growing number of partly or fully sequenced vertebrate genomes, effective tools for performing multiple comparisons are required to accurately and efficiently assist biological discoveries. Here we introduce Mulan (http://mulan.dcode.org/), a novel method and a network server for comparing multiple draft and finished-quality sequences to identify functional elements conserved over evolutionary time. Mulan brings together several novel algorithms: the tba multi-aligner program for rapid identification of local sequence conservation and the multiTF program for detecting evolutionarily conserved transcription factor binding sites in multiple alignments. In addition, Mulan supports two-way communication with the GALA database; alignments of multiple species dynamically generated in GALA can be viewed in Mulan, and conserved transcription factor binding sites identified with Mulan/multiTF can be integrated and overlaid with extensive genome annotation data using GALA. Local multiple alignments computed by Mulan ensure reliable representation of short- and large-scale genomic rearrangements in distant organisms. Mulan allows for interactive modification of critical conservation parameters to differentially predict conserved regions in comparisons of both closely and distantly related species. We illustrate the uses and applications of the Mulan tool through multi-species comparisons of the GATA3 gene locus and the identification of elements that are conserved differently in avians than in other genomes, allowing speculation on the evolution of birds. Source code for the aligners and the aligner-evaluation software can be freely downloaded from http://bio.cse.psu.edu/.

  2. OTU analysis using metagenomic shotgun sequencing data.

    Directory of Open Access Journals (Sweden)

    Xiaolin Hao

    Because of technological limitations, the primer and amplification biases in targeted sequencing of 16S rRNA genes have veiled the true microbial diversity underlying environmental samples. However, the protocol of metagenomic shotgun sequencing provides 16S rRNA gene fragment data with natural immunity against the biases raised during priming and thus the potential of uncovering the true structure of the microbial community by giving more accurate predictions of operational taxonomic units (OTUs). Nonetheless, the lack of statistically rigorous comparison between 16S rRNA gene fragments and other data types makes it difficult to interpret previously reported results using 16S rRNA gene fragments. Therefore, in the present work, we established a standard analysis pipeline that would help confirm whether the differences in the data are true or are just due to potential technical bias. This pipeline is built by using simulated data to find optimal mapping and OTU prediction methods. The comparison between simulated datasets revealed a relationship between 16S rRNA gene fragments and full-length 16S rRNA sequences: a 16S rRNA gene fragment longer than 150 bp provides the same accuracy as a full-length 16S rRNA sequence using our proposed pipeline. This could serve as a good starting point for experimental design and make the comparison between 16S rRNA gene fragment-based and targeted 16S rRNA sequencing-based surveys possible.

  3. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.

    Science.gov (United States)

    Alkan, Can; Ventura, Mario; Archidiacono, Nicoletta; Rocchi, Mariano; Sahinalp, S Cenk; Eichler, Evan E

    2007-09-01

    The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%-5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then experimentally validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.

  4. Exploring the past and the future of protein evolution with ancestral sequence reconstruction: the 'retro' approach to protein engineering.

    Science.gov (United States)

    Gumulya, Yosephine; Gillam, Elizabeth M J

    2017-01-01

    A central goal in molecular evolution is to understand the ways in which genes and proteins evolve in response to changing environments. In the absence of intact DNA from fossils, ancestral sequence reconstruction (ASR) can be used to infer the evolutionary precursors of extant proteins. To date, ancestral proteins belonging to eubacteria, archaea, yeast and vertebrates have been inferred that have been hypothesized to date from between several million to over 3 billion years ago. ASR has yielded insights into the early history of life on Earth and the evolution of proteins and macromolecular complexes. Recently, however, ASR has developed from a tool for testing hypotheses about protein evolution to a useful means for designing novel proteins. The strength of this approach lies in the ability to infer ancestral sequences encoding proteins that have desirable properties compared with contemporary forms, particularly thermostability and broad substrate range, making them good starting points for laboratory evolution. Developments in technologies for DNA sequencing and synthesis and computational phylogenetic analysis have led to an escalation in the number of ancient proteins resurrected in the last decade and greatly facilitated the use of ASR in the burgeoning field of synthetic biology. However, the primary challenge of ASR remains in accurately inferring ancestral states, despite the uncertainty arising from evolutionary models, incomplete sequences and limited phylogenetic trees. This review will focus, firstly, on the use of ASR to uncover links between sequence and phenotype and, secondly, on the practical application of ASR in protein engineering. © 2017 The Author(s); published by Portland Press Limited on behalf of the Biochemical Society.

  5. Sequence Matching Analysis for Curriculum Development

    Directory of Open Access Journals (Sweden)

    Liem Yenny Bendatu

    2015-06-01

    Many organizations apply information technologies to support their business processes. Using these technologies, actual events are recorded and checked for conformance with a predefined model. Conformance checking is an approach to measure the fitness and appropriateness between a process model and actual events. However, when there are multiple events with the same timestamp, the traditional approach is unfit to produce such measures. This study attempts to develop a sequence matching analysis. Taking conformance checking as its basis, the proposed approach utilizes the current control flow technique in the process mining domain. A case study in the field of educational processes has been conducted. This study also proposes a curriculum analysis framework to test the proposed approach. By considering the learning sequence of students, it yields some measurements for curriculum development. Finally, the result of the proposed approach has been verified by relevant instructors for further development.

  6. BioMatriX: Sequence analysis, structure visualization, phylogenetics ...

    African Journals Online (AJOL)

    Goshi

    2012-04-26

    Apr 26, 2012 ... multi-functional services to perform specific tasks like DNA/RNA/Protein sequence analysis with graphical representations, sequence editing, sequence alignment, restriction enzyme mapping, protein structure visualization, mutation and structure superimposition programs along with phylogenetics tree.

  7. Perspective on sequence evolution of microsatellite locus (CCG)n in Rv0050 gene from Mycobacterium tuberculosis

    Directory of Open Access Journals (Sweden)

    Jin Ruiliang

    2011-08-01

    Background: The mycobacterial genome is inclined to polymerase slippage and a high mutation rate in microsatellite regions due to high GC content and absence of a mismatch repair system. However, the exact molecular mechanisms underlying microsatellite variation have not been fully elucidated. Here, we investigated mutation events in the hyper-variable trinucleotide microsatellite locus MML0050 located in the Rv0050 gene of W-Beijing and non-W-Beijing Mycobacterium tuberculosis strains in order to gain insight into the genomic structure and activity of repeated regions. Results: Size analysis indicated the presence of five alleles that differed in length by three base pairs. Moreover, nucleotide gains occurred more frequently than losses in this trinucleotide microsatellite. Mutation frequency was not completely related to the total length, though the relative frequency in the longest allele was remarkably higher than that in the shortest. Sequence analysis was able to detect seven alleles and revealed that point mutations enhanced the level of locus variation. Introduction of an interruptive motif correlated with the total allele length and genetic lineage, rather than the length of the longest stretch of perfect repeats. Finally, the level of locus variation was drastically different between the two genetic lineages. Conclusion: The Rv0050 locus encodes the bifunctional penicillin-binding protein ponA1 and is essential to mycobacterial survival. Our investigations of this particularly dynamic genomic region provide insights into the overall mode of microsatellite evolution. Specifically, replication slippage was implicated in the mutational process of this microsatellite and a sequence-based genetic analysis was necessary to determine that point mutation events acted to maintain microsatellite size integrity while providing genomic diversity.
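
    The distinction drawn above between total allele length and the longest stretch of perfect repeats can be illustrated with a short sketch. This is not the authors' method; the function name and the example sequences are hypothetical, and the motif is assumed to be CCG read in a single frame.

```python
import re


def longest_perfect_repeat(seq: str, motif: str = "CCG") -> int:
    """Return the largest number of uninterrupted tandem copies of `motif`.

    Hypothetical helper for illustration only: an interrupting point
    mutation splits one long run into shorter perfect runs.
    """
    # Find every maximal run of back-to-back motif copies.
    runs = re.findall(f"(?:{re.escape(motif)})+", seq.upper())
    # Convert each run length in bases to a repeat count; 0 if no run exists.
    return max((len(run) // len(motif) for run in runs), default=0)


# Made-up alleles: the second carries a CTG interruption inside the repeat.
perfect_allele = "ATCCGCCGCCGTT"
interrupted_allele = "CCGCCGCTGCCG"
print(longest_perfect_repeat(perfect_allele))
print(longest_perfect_repeat(interrupted_allele))
```

    In this toy case the interrupted allele is no shorter overall, yet its longest perfect run is shorter, which is why size analysis alone can miss variation that sequence analysis detects.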

  8. Reconstructing the dynamics of HIV evolution within hosts from serial deep sequence data.

    Directory of Open Access Journals (Sweden)

    Art F Y Poon

    At the early stage of infection, human immunodeficiency virus (HIV-1) predominantly uses the CCR5 coreceptor for host cell entry. The subsequent emergence of HIV variants that use the CXCR4 coreceptor in roughly half of all infections is associated with an accelerated decline of CD4+ T-cells and rate of progression to AIDS. The presence of a 'fitness valley' separating CCR5- and CXCR4-using genotypes is postulated to be a biological determinant of whether the HIV coreceptor switch occurs. Using phylogenetic methods to reconstruct the evolutionary dynamics of HIV within hosts enables us to discriminate between competing models of this process. We have developed a phylogenetic pipeline for the molecular clock analysis, ancestral reconstruction, and visualization of deep sequence data. These data were generated by next-generation sequencing of HIV RNA extracted from longitudinal serum samples (median 7 time points) from 8 untreated subjects with chronic HIV infections (Amsterdam Cohort Studies on HIV-1 infection and AIDS). We used the known dates of sampling to directly estimate rates of evolution and to map ancestral mutations to a reconstructed timeline in units of days. HIV coreceptor usage was predicted from reconstructed ancestral sequences using the geno2pheno algorithm. We determined that the first mutations contributing to CXCR4 use emerged about 16 (per subject range 4 to 30) months before the earliest predicted CXCR4-using ancestor, which preceded the first positive cell-based assay of CXCR4 usage by 10 (range 5 to 25) months. CXCR4 usage arose in multiple lineages within 5 of 8 subjects, and ancestral lineages following alternate mutational pathways before going extinct were common. We observed highly patient-specific distributions and time-scales of mutation accumulation, implying that the role of a fitness valley is contingent on the genotype of the transmitted variant.

  9. Genome sequence and analysis of Lactobacillus helveticus

    Directory of Open Access Journals (Sweden)

    Paola Cremonesi

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of L. helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genomes of five L. helveticus strains were sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones.

  10. The genome sequence of taurine cattle: A window to ruminant biology and evolution

    Science.gov (United States)

    To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (ma...

  11. Protein sequence analysis using Hewlett-Packard biphasic sequencing cartridges in an applied biosystems 473A protein sequencer.

    Science.gov (United States)

    Tang, S; Mozdzanowski, J; Anumula, K R

    1999-01-01

    Protein sequence analysis using an adsorptive biphasic sequencing cartridge, a set of two coupled columns introduced by Hewlett-Packard for protein sequencing by Edman degradation, in an Applied Biosystems 473A protein sequencer has been demonstrated. Samples containing salts, detergents, excipients, etc. (e.g., formulated protein drugs) can be easily analyzed using the ABI sequencer. Simple modifications to the ABI sequencer to accommodate the cartridge extend its utility in the analysis of difficult samples. The ABI sequencer solvents and reagents were compatible with the HP cartridge for sequencing. Sequence information up to ten residues can be easily generated by this nonoptimized procedure, and it is sufficient for identifying proteins by database search and for preparing a DNA probe for cloning novel proteins.

  12. Evolution of Enzyme Superfamilies: Comprehensive Exploration of Sequence-Function Relationships.

    Science.gov (United States)

    Baier, F; Copp, J N; Tokuriki, N

    2016-11-22

    The sequence and functional diversity of enzyme superfamilies have expanded through billions of years of evolution from a common ancestor. Understanding how protein sequence and functional "space" have expanded, at both the evolutionary and molecular level, is central to biochemistry, molecular biology, and evolutionary biology. Integrative approaches that examine protein sequence, structure, and function have begun to provide comprehensive views of the functional diversity and evolutionary relationships within enzyme superfamilies. In this review, we outline the recent advances in our understanding of enzyme evolution and superfamily functional diversity. We describe the tools that have been used to comprehensively analyze sequence relationships and to characterize sequence and function relationships. We also highlight recent large-scale experimental approaches that systematically determine the activity profiles across enzyme superfamilies. We identify several intriguing insights from this recent body of work. First, promiscuous activities are prevalent among extant enzymes. Second, many divergent proteins retain "function connectivity" via enzyme promiscuity, which can be used to probe the evolutionary potential and history of enzyme superfamilies. Finally, we discuss open questions regarding the intricacies of enzyme divergence, as well as potential research directions that will deepen our understanding of enzyme superfamily evolution.

  13. Evolution of Endogenous Sequences of Banana Streak Virus: What Can We Learn from Banana (Musa sp.) Evolution?

    Science.gov (United States)

    Gayral, Philippe; Blondin, Laurence; Guidolin, Olivier; Carreel, Françoise; Hippolyte, Isabelle; Perrier, Xavier; Iskra-Caruana, Marie-Line

    2010-01-01

    Endogenous plant pararetroviruses (EPRVs) are viral sequences of the family Caulimoviridae integrated into the nuclear genome of numerous plant species. The ability of some endogenous sequences of Banana streak viruses (eBSVs) in the genome of banana (Musa sp.) to induce infections just like the virus itself was recently demonstrated (P. Gayral et al., J. Virol. 83:6697-6710, 2008). Although eBSVs probably arose from accidental events, infectious eBSVs constitute an extreme case of parasitism, as well as a newly described strategy for vertical virus transmission in plants. We investigated the early evolutionary stages of infectious eBSV for two distinct BSV species—GF (BSGFV) and Imové (BSImV)—through the study of their distribution, insertion polymorphism, and structure evolution among selected banana genotypes representative of the diversity of 60 wild Musa species and genotypes. To do so, the historical frame of host evolution was analyzed by inferring banana phylogeny from two chloroplast regions—matK and trnL-trnF—as well as from the nuclear genome, using 19 microsatellite loci. We demonstrated that both BSV species integrated recently in banana evolution, circa 640,000 years ago. The two infectious eBSVs were subjected to different selective pressures and showed distinct levels of rearrangement within their final structure. In addition, the molecular phylogenies of integrated and nonintegrated BSVs enabled us to establish the phylogenetic origins of eBSGFV and eBSImV. PMID:20427523

  14. Evolution of endogenous sequences of banana streak virus: what can we learn from banana (Musa sp.) evolution?

    Science.gov (United States)

    Gayral, Philippe; Blondin, Laurence; Guidolin, Olivier; Carreel, Françoise; Hippolyte, Isabelle; Perrier, Xavier; Iskra-Caruana, Marie-Line

    2010-07-01

    Endogenous plant pararetroviruses (EPRVs) are viral sequences of the family Caulimoviridae integrated into the nuclear genome of numerous plant species. The ability of some endogenous sequences of Banana streak viruses (eBSVs) in the genome of banana (Musa sp.) to induce infections just like the virus itself was recently demonstrated (P. Gayral et al., J. Virol. 83:6697-6710, 2008). Although eBSVs probably arose from accidental events, infectious eBSVs constitute an extreme case of parasitism, as well as a newly described strategy for vertical virus transmission in plants. We investigated the early evolutionary stages of infectious eBSV for two distinct BSV species, GF (BSGFV) and Imové (BSImV), through the study of their distribution, insertion polymorphism, and structure evolution among selected banana genotypes representative of the diversity of 60 wild Musa species and genotypes. To do so, the historical frame of host evolution was analyzed by inferring banana phylogeny from two chloroplast regions, matK and trnL-trnF, as well as from the nuclear genome, using 19 microsatellite loci. We demonstrated that both BSV species integrated recently in banana evolution, circa 640,000 years ago. The two infectious eBSVs were subjected to different selective pressures and showed distinct levels of rearrangement within their final structure. In addition, the molecular phylogenies of integrated and nonintegrated BSVs enabled us to establish the phylogenetic origins of eBSGFV and eBSImV.

  15. Pig genome sequence - analysis and publication strategy

    NARCIS (Netherlands)

    Archibald, A.L.; Bolund, L.; Churcher, C.; Fredholm, M.; Groenen, M.A.M.; Harlizius, B.

    2010-01-01

    Background - The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. Results - Assemblies of the

  16. Molecular cloning, expression analysis and sequence prediction of ...

    African Journals Online (AJOL)

    Homologous comparison of the amino acid sequences from C/EBPβ cloned in this study and those from different species indicated C/EBPβ gene of Qinchuan cattle shared 97, 95 and 91% similarity with Homo sapiens, Sus scrofa and Oryctolagus cuniculus respectively, indicating a good sequence evolutional conservation ...

  17. Comparative analysis of sequences from PT 2013

    DEFF Research Database (Denmark)

    Mikkelsen, Susie Sommer

    . All but one sequence mapped to the MCP gene while the last sequence mapped to the Neurofilament gene. Approx. half of the sequences contained no errors while the rest differed with 88-99 percent similarity with most having 99% similarity. One sequence, when BLASTed, showed most similarity to European...... Sheatfish and not EHNV. Generally, mistakes occurred at the ends of the sequences. This can be due to several factors. One is that the sequence has not been trimmed of the sequence primer sites. Another is the lack of quality control of the chromatogram. Finally, sequencing in just one direction can result...

  18. Social network analysis community detection and evolution

    CERN Document Server

    Missaoui, Rokia

    2015-01-01

    This book is devoted to recent progress in social network analysis with a high focus on community detection and evolution. The eleven chapters cover the identification of cohesive groups, core components and key players either in static or dynamic networks of different kinds and levels of heterogeneity. Other important topics in social network analysis such as influential detection and maximization, information propagation, user behavior analysis, as well as network modeling and visualization are also presented. Many studies are validated through real social networks such as Twitter. This edit

  19. FAST: FAST Analysis of Sequences Toolbox.

    Science.gov (United States)

    Lawrence, Travis J; Kauffman, Kyle T; Amrine, Katherine C H; Carper, Dana L; Lee, Raymond S; Becich, Peter J; Canales, Claudia J; Ardell, David H

    2015-01-01

    FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU's Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R, and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought.

  20. FAST: FAST Analysis of Sequences Toolbox

    Directory of Open Access Journals (Sweden)

    Travis J. Lawrence

    2015-05-01

    FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU's Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought.
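
    The grep-style selection of sequence records described in the two FAST records above can be pictured with a small sketch. It is written in Python rather than Perl and does not use or reproduce any FAST command, option, or API; the function names parse_fasta and grep_records and the sample records are hypothetical.

```python
import re
from typing import Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from Multi-FastA formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines of one record may be wrapped; join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def grep_records(text: str, pattern: str) -> List[Tuple[str, str]]:
    """Keep records whose header matches the regular expression `pattern`."""
    regex = re.compile(pattern)
    return [(h, s) for h, s in parse_fasta(text) if regex.search(h)]


# Made-up example input.
sample = ">seq1 cattle\nACGT\nAC\n>seq2 pig\nGGCC\n"
print(grep_records(sample, r"pig"))
```

    Treating each record as a unit, rather than each line, is the idea that lets text-utility conventions carry over to wrapped multi-line sequences.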

  1. Bayesian Correlation Analysis for Sequence Count Data.

    Directory of Open Access Journals (Sweden)

    Daniel Sánchez-Taltavull

    Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities' measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low, especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities' signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset.

  2. A basic analysis toolkit for biological sequences

    Directory of Open Access Journals (Sweden)

    Siragusa Enrico

    2007-09-01

    This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks, namely local alignments via approximate string matching, and global alignments via longest common subsequence and alignments with affine and concave gap cost functions. Moreover, it also supports filtering operations to select strings from a set and establish their statistical significance, via z-score computation. None of the algorithms is new, but although they are generally regarded as fundamental for sequence analysis, they have not been implemented in a single and consistent software package, as we do here. Therefore, our main contribution is to fill this gap between algorithmic theory and practice by providing an extensible and easy to use software library that includes algorithms for the mentioned string matching and alignment problems. The library consists of C/C++ library functions as well as Perl library functions. It can be interfaced with Bioperl and can also be used as a stand-alone system with a GUI. The software is available at http://www.math.unipa.it/~raffaele/BATS/ under the GNU GPL.
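
    Two of the tasks named above, longest common subsequence and z-score computation, can be sketched with textbook formulations. This is not BATS code and does not mirror its C/C++ or Perl interfaces; the function names are hypothetical, and the use of the population standard deviation over a set of background scores is an assumption made for illustration.

```python
from statistics import mean, pstdev
from typing import Sequence


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Standard dynamic programme, keeping only the previous row so that
    memory use is proportional to len(b).
    """
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence that ends before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop one character from either string.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def z_score(observed: float, background: Sequence[float]) -> float:
    """Standardize an observed score against background scores.

    Assumption: background holds scores from comparable random strings and
    has non-zero spread; the population standard deviation is used.
    """
    return (observed - mean(background)) / pstdev(background)


print(lcs_length("AGGTAB", "GXTXAYB"))
print(z_score(10, [4, 6]))
```

    A large positive z-score would suggest that an observed similarity is unlikely to arise from the background alone, which is the sense in which such a filter establishes statistical significance.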

  3. Modeling the expected lifetime and evolution of a deme's principal genetic sequence.

    Science.gov (United States)

    Clark, Brian

    2014-03-01

    The principal genetic sequence (PGS) is the most common genetic sequence in a deme. The PGS changes over time because new genetic sequences are created by inversions, compete with the current PGS, and a small fraction become PGSs. A set of coupled difference equations provides a description of the evolution of the PGS distribution function in an ensemble of demes. Solving the set of equations produces the survival probability of a new genetic sequence and the expected lifetime of an existing PGS as a function of inversion size and rate, recombination rate, and deme size. Additionally, the PGS distribution function is used to explain the transition pathway from old to new PGSs. We compare these results to a cellular automaton based representation of a deme and to the Drosophila species D. melanogaster and D. yakuba.

  4. Virus evolution during chronic hepatitis B virus infection as revealed by ultradeep sequencing data.

    Science.gov (United States)

    Jones, Leandro R; Sede, Mariano; Manrique, Julieta M; Quarleri, Jorge

    2016-02-01

    Despite chronic hepatitis B virus (HBV) infection (CHB) being a leading cause of liver cirrhosis and cancer, HBV evolution during CHB is not fully understood. Recent studies have indicated that virus diversity progressively increases along the course of CHB and that some virus mutations correlate with severe liver conditions such as chronic hepatitis, cirrhosis and hepatocellular carcinoma. Using ultradeep sequencing (UDS) data from an intrafamilial case, we detected such mutations at low frequencies among three immunotolerant patients and at high frequencies in an inactive carrier. Furthermore, our analyses indicated that the HBV population from the seroconverter patient underwent many genetic changes in response to virus clearance. Together, these data indicate a potential use of UDS for developing non-invasive biomarkers for monitoring disease changes over time or in response to specific therapies. In addition, our analyses revealed that virus clearance seemed not to require the virus effective population size to decline. A detailed genetic analysis of the viral lineages arising during and after the clearance suggested that mutations at or close to critical elements of the core promoter (enhancer II, epsilon encapsidation signal, TA2, TA3 and direct repeat 1-hormone response element) might be responsible for a sustained replication. This hypothesis requires the decline in virus load to be explained by constant clearance of virus-producing hepatocytes, consistent with the sustained progress towards serious liver conditions experienced by many CHB patients.

  5. De novo transcriptome assembly of Zanthoxylum bungeanum using Illumina sequencing for evolutionary analysis and simple sequence repeat marker development.

    Science.gov (United States)

    Feng, Shijing; Zhao, Lili; Liu, Zhenshan; Liu, Yulin; Yang, Tuxi; Wei, Anzhi

    2017-12-01

    Zanthoxylum, an ancient economic crop in Asia, has a satisfying aromatic taste and immense medicinal values. A lack of genomic information and genetic markers has limited the evolutionary analysis and genetic improvement of Zanthoxylum species and their close relatives. To better understand the evolution, domestication, and divergence of Zanthoxylum, we present a de novo transcriptome analysis of an elite cultivar of Z. bungeanum using Illumina sequencing; we then developed simple sequence repeat markers for identification of Zanthoxylum. In total, we predicted 45,057 unigenes and 22,212 protein coding sequences, approximately 90% of which showed significant similarities to known proteins in databases. Phylogenetic analysis indicated that Zanthoxylum is relatively recent and estimated to have diverged from Citrus ca. 36.5-37.7 million years ago. We also detected a whole-genome duplication event in Zanthoxylum that occurred 14 million years ago. We found no protein coding sequences that were significantly under positive selection by Ka/Ks. Simple sequence repeat analysis divided 31 Zanthoxylum cultivars and landraces into three major groups. This Zanthoxylum reference transcriptome provides crucial information for the evolutionary study of the Zanthoxylum genus and the Rutaceae family, and facilitates the establishment of more effective Zanthoxylum breeding programs.

  6. Multilocus sequence analysis of the family Halomonadaceae.

    Science.gov (United States)

    de la Haba, Rafael R; Márquez, M Carmen; Papke, R Thane; Ventosa, Antonio

    2012-03-01

    Multilocus sequence analysis (MLSA) protocols have been developed for species circumscription for many taxa. However, at present, no studies based on MLSA have been performed within any moderately halophilic bacterial group. To test the usefulness of MLSA with these kinds of micro-organisms, the family Halomonadaceae, which includes mainly halophilic bacteria, was chosen as a model. This family comprises ten genera with validly published names and 85 species of environmental, biotechnological and clinical interest. In some cases, the phylogenetic relationships between members of this family, based on 16S rRNA gene sequence comparisons, are not clear and a deep phylogenetic analysis using several housekeeping genes seemed appropriate. Here, MLSA was applied using the 16S rRNA, 23S rRNA, atpA, gyrB, rpoD and secA genes for species of the family Halomonadaceae. Phylogenetic trees based on the individual and concatenated gene sequences revealed that the family Halomonadaceae formed a monophyletic group of micro-organisms within the order Oceanospirillales. With the exception of the genera Halomonas and Modicisalibacter, all other genera within this family were phylogenetically coherent. Five of the six studied genes (16S rRNA, 23S rRNA, gyrB, rpoD and secA) showed a consistent evolutionary history. However, the results obtained with the atpA gene were different; thus, this gene may not be considered useful as an individual gene phylogenetic marker within this family. The phylogenetic methods produced variable results, with those generated from the maximum-likelihood and neighbour-joining algorithms being more similar than those obtained by maximum-parsimony methods. Horizontal gene transfer (HGT) plays an important evolutionary role in the family Halomonadaceae; however, the impact of recombination events in the phylogenetic analysis was minimized by concatenating the six loci, which agreed with the current taxonomic scheme for this family. Finally, the findings of

  7. The genome sequence of taurine cattle: a window to ruminant biology and evolution.

    Science.gov (United States)

    Elsik, Christine G; Tellam, Ross L; Worley, Kim C; Gibbs, Richard A; Muzny, Donna M; Weinstock, George M; Adelson, David L; Eichler, Evan E; Elnitski, Laura; Guigó, Roderic; Hamernik, Debora L; Kappes, Steve M; Lewin, Harris A; Lynn, David J; Nicholas, Frank W; Reymond, Alexandre; Rijnkels, Monique; Skow, Loren C; Zdobnov, Evgeny M; Schook, Lawrence; Womack, James; Alioto, Tyler; Antonarakis, Stylianos E; Astashyn, Alex; Chapple, Charles E; Chen, Hsiu-Chuan; Chrast, Jacqueline; Câmara, Francisco; Ermolaeva, Olga; Henrichsen, Charlotte N; Hlavina, Wratko; Kapustin, Yuri; Kiryutin, Boris; Kitts, Paul; Kokocinski, Felix; Landrum, Melissa; Maglott, Donna; Pruitt, Kim; Sapojnikov, Victor; Searle, Stephen M; Solovyev, Victor; Souvorov, Alexandre; Ucla, Catherine; Wyss, Carine; Anzola, Juan M; Gerlach, Daniel; Elhaik, Eran; Graur, Dan; Reese, Justin T; Edgar, Robert C; McEwan, John C; Payne, Gemma M; Raison, Joy M; Junier, Thomas; Kriventseva, Evgenia V; Eyras, Eduardo; Plass, Mireya; Donthu, Ravikiran; Larkin, Denis M; Reecy, James; Yang, Mary Q; Chen, Lin; Cheng, Ze; Chitko-McKown, Carol G; Liu, George E; Matukumalli, Lakshmi K; Song, Jiuzhou; Zhu, Bin; Bradley, Daniel G; Brinkman, Fiona S L; Lau, Lilian P L; Whiteside, Matthew D; Walker, Angela; Wheeler, Thomas T; Casey, Theresa; German, J Bruce; Lemay, Danielle G; Maqbool, Nauman J; Molenaar, Adrian J; Seo, Seongwon; Stothard, Paul; Baldwin, Cynthia L; Baxter, Rebecca; Brinkmeyer-Langford, Candice L; Brown, Wendy C; Childers, Christopher P; Connelley, Timothy; Ellis, Shirley A; Fritz, Krista; Glass, Elizabeth J; Herzig, Carolyn T A; Iivanainen, Antti; Lahmers, Kevin K; Bennett, Anna K; Dickens, C Michael; Gilbert, James G R; Hagen, Darren E; Salih, Hanni; Aerts, Jan; Caetano, Alexandre R; Dalrymple, Brian; Garcia, Jose Fernando; Gill, Clare A; Hiendleder, Stefan G; Memili, Erdogan; Spurlock, Diane; Williams, John L; Alexander, Lee; Brownstein, Michael J; Guan, Leluo; Holt, Robert A; Jones, Steven J M; Marra, Marco 
A; Moore, Richard; Moore, Stephen S; Roberts, Andy; Taniguchi, Masaaki; Waterman, Richard C; Chacko, Joseph; Chandrabose, Mimi M; Cree, Andy; Dao, Marvin Diep; Dinh, Huyen H; Gabisi, Ramatu Ayiesha; Hines, Sandra; Hume, Jennifer; Jhangiani, Shalini N; Joshi, Vandita; Kovar, Christie L; Lewis, Lora R; Liu, Yih-Shin; Lopez, John; Morgan, Margaret B; Nguyen, Ngoc Bich; Okwuonu, Geoffrey O; Ruiz, San Juana; Santibanez, Jireh; Wright, Rita A; Buhay, Christian; Ding, Yan; Dugan-Rocha, Shannon; Herdandez, Judith; Holder, Michael; Sabo, Aniko; Egan, Amy; Goodell, Jason; Wilczek-Boney, Katarzyna; Fowler, Gerald R; Hitchens, Matthew Edward; Lozado, Ryan J; Moen, Charles; Steffen, David; Warren, James T; Zhang, Jingkun; Chiu, Readman; Schein, Jacqueline E; Durbin, K James; Havlak, Paul; Jiang, Huaiyang; Liu, Yue; Qin, Xiang; Ren, Yanru; Shen, Yufeng; Song, Henry; Bell, Stephanie Nicole; Davis, Clay; Johnson, Angela Jolivet; Lee, Sandra; Nazareth, Lynne V; Patel, Bella Mayurkumar; Pu, Ling-Ling; Vattathil, Selina; Williams, Rex Lee; Curry, Stacey; Hamilton, Cerissa; Sodergren, Erica; Wheeler, David A; Barris, Wes; Bennett, Gary L; Eggen, André; Green, Ronnie D; Harhay, Gregory P; Hobbs, Matthew; Jann, Oliver; Keele, John W; Kent, Matthew P; Lien, Sigbjørn; McKay, Stephanie D; McWilliam, Sean; Ratnakumar, Abhirami; Schnabel, Robert D; Smith, Timothy; Snelling, Warren M; Sonstegard, Tad S; Stone, Roger T; Sugimoto, Yoshikazu; Takasuga, Akiko; Taylor, Jeremy F; Van Tassell, Curtis P; Macneil, Michael D; Abatepaulo, Antonio R R; Abbey, Colette A; Ahola, Virpi; Almeida, Iassudara G; Amadio, Ariel F; Anatriello, Elen; Bahadue, Suria M; Biase, Fernando H; Boldt, Clayton R; Carroll, Jeffery A; Carvalho, Wanessa A; Cervelatti, Eliane P; Chacko, Elsa; Chapin, Jennifer E; Cheng, Ye; Choi, Jungwoo; Colley, Adam J; de Campos, Tatiana A; De Donato, Marcos; Santos, Isabel K F de Miranda; de Oliveira, Carlo J F; Deobald, Heather; Devinoy, Eve; Donohue, Kaitlin E; Dovc, Peter; Eberlein, 
Annett; Fitzsimmons, Carolyn J; Franzin, Alessandra M; Garcia, Gustavo R; Genini, Sem; Gladney, Cody J; Grant, Jason R; Greaser, Marion L; Green, Jonathan A; Hadsell, Darryl L; Hakimov, Hatam A; Halgren, Rob; Harrow, Jennifer L; Hart, Elizabeth A; Hastings, Nicola; Hernandez, Marta; Hu, Zhi-Liang; Ingham, Aaron; Iso-Touru, Terhi; Jamis, Catherine; Jensen, Kirsty; Kapetis, Dimos; Kerr, Tovah; Khalil, Sari S; Khatib, Hasan; Kolbehdari, Davood; Kumar, Charu G; Kumar, Dinesh; Leach, Richard; Lee, Justin C-M; Li, Changxi; Logan, Krystin M; Malinverni, Roberto; Marques, Elisa; Martin, William F; Martins, Natalia F; Maruyama, Sandra R; Mazza, Raffaele; McLean, Kim L; Medrano, Juan F; Moreno, Barbara T; Moré, Daniela D; Muntean, Carl T; Nandakumar, Hari P; Nogueira, Marcelo F G; Olsaker, Ingrid; Pant, Sameer D; Panzitta, Francesca; Pastor, Rosemeire C P; Poli, Mario A; Poslusny, Nathan; Rachagani, Satyanarayana; Ranganathan, Shoba; Razpet, Andrej; Riggs, Penny K; Rincon, Gonzalo; Rodriguez-Osorio, Nelida; Rodriguez-Zas, Sandra L; Romero, Natasha E; Rosenwald, Anne; Sando, Lillian; Schmutz, Sheila M; Shen, Libing; Sherman, Laura; Southey, Bruce R; Lutzow, Ylva Strandberg; Sweedler, Jonathan V; Tammen, Imke; Telugu, Bhanu Prakash V L; Urbanski, Jennifer M; Utsunomiya, Yuri T; Verschoor, Chris P; Waardenberg, Ashley J; Wang, Zhiquan; Ward, Robert; Weikard, Rosemarie; Welsh, Thomas H; White, Stephen N; Wilming, Laurens G; Wunderlich, Kris R; Yang, Jianqi; Zhao, Feng-Qi

    2009-04-24

    To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.

  8. Insecticide resistance evolution with mixtures and sequences: a model-based explanation.

    Science.gov (United States)

    South, Andy; Hastings, Ian M

    2018-02-15

    Insecticide resistance threatens effective vector control, especially for mosquitoes and malaria. To manage resistance, recommended insecticide use strategies include mixtures, sequences and rotations. New insecticides are being developed and there is an opportunity to develop use strategies that limit the evolution of further resistance in the short term. A 2013 review of modelling and empirical studies of resistance points to the advantages of mixtures. However, there is limited recent, accessible modelling work addressing the evolution of resistance under different operational strategies. There is an opportunity to improve the level of mechanistic understanding within the operational community of how insecticide resistance can be expected to evolve in response to different strategies. This paper provides a concise, accessible description of a flexible model of the evolution of insecticide resistance. The model is used to develop a mechanistic picture of the evolution of insecticide resistance and how it is likely to respond to potential insecticide use strategies. The aim is to reach an audience unlikely to read a more detailed modelling paper. The model itself, as described here, represents two independent genes coding for resistance to two insecticides. This allows the representation of the use of insecticides in isolation, sequence and mixtures. The model is used to demonstrate the evolution of resistance under different scenarios and how this fits with intuitive reasoning about selection pressure. Using an insecticide in a mixture, relative to alone, always prompts slower evolution of resistance to that insecticide. However, when resistance to both insecticides is considered, resistance thresholds may be reached later for a sequence relative to a mixture. Increasing the ability of insecticides to kill susceptible mosquitoes (effectiveness) has the most influence on favouring a mixture over a sequence because one highly effective insecticide provides more

  9. Statistical analysis of next generation sequencing data

    CERN Document Server

    Nettleton, Dan

    2014-01-01

    Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...

  10. Time fluctuation analysis of forest fire sequences

    Science.gov (United States)

    Vega Orozco, Carmen D.; Kanevski, Mikhaïl; Tonini, Marj; Golay, Jean; Pereira, Mário J. G.

    2013-04-01

    Forest fires are complex events involving both space and time fluctuations. Understanding their dynamics and pattern distribution is of great importance in order to improve resource allocation and support fire management actions at local and global levels. This study aims at characterizing the temporal fluctuations of forest fire sequences observed in Portugal, which is the country that holds the largest wildfire land dataset in Europe. This research applies several exploratory data analysis measures to 302,000 forest fires that occurred from 1980 to 2007. The applied clustering measures are: Morisita clustering index, fractal and multifractal dimensions (box-counting), Ripley's K-function, Allan Factor, and variography. These algorithms enable a global time structural analysis describing the degree of clustering of a point pattern and defining whether the observed events occur randomly, in clusters or in a regular pattern. The considered methods are of general importance and can be used for other spatio-temporal events (e.g. crime, epidemiology, biodiversity, geomarketing, etc.). An important contribution of this research deals with the analysis and estimation of local measures of clustering that help in understanding their temporal structure. Each measure is described and executed for the raw data (forest fires geo-database) and results are compared to reference patterns generated under the null hypothesis of randomness (Poisson processes) embedded in the same time period of the raw data. This comparison enables estimating the degree of the deviation of the real data from a Poisson process. Generalizations to functional measures of these clustering methods, taking into account the phenomena, were also applied and adapted to detect time dependences in a measured variable (e.g. burned area). The time clustering of the raw data is compared several times with the Poisson processes at different thresholds of the measured function. Then, the clustering measure value
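    As an illustration of one of the measures named in this abstract, the sketch below estimates the Allan Factor of an event-time series and compares it with a simulated Poisson reference. It assumes the common definition AF(T) = <(N_{k+1} - N_k)^2> / (2 <N_k>), where N_k is the number of events in the k-th window of length T; the function names, window length and event rate are hypothetical and are not taken from the study. Values near 1 are expected for a Poisson process, values above 1 suggest clustering, and values near 0 a regular pattern.

```python
import random


def allan_factor(event_times, window, t_start, t_end):
    """Allan Factor for counts in consecutive windows of length `window`.

    Assumed definition: mean squared difference of adjacent window counts,
    divided by twice the mean window count.
    """
    n_windows = int((t_end - t_start) // window)
    if n_windows < 2:
        raise ValueError("need at least two windows")
    counts = [0] * n_windows
    for t in event_times:
        k = int((t - t_start) // window)
        if 0 <= k < n_windows:
            counts[k] += 1
    mean_count = sum(counts) / n_windows
    if mean_count == 0:
        raise ValueError("no events inside the observation period")
    sq_diffs = [(counts[k + 1] - counts[k]) ** 2 for k in range(n_windows - 1)]
    return (sum(sq_diffs) / len(sq_diffs)) / (2 * mean_count)


def poisson_times(rate, t_end, seed=0):
    """Event times of a homogeneous Poisson process on [0, t_end)."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate)  # exponential waiting time between events
        if t >= t_end:
            return times
        times.append(t)


if __name__ == "__main__":
    reference = poisson_times(rate=5.0, t_end=2000.0)
    print("Poisson reference AF:", allan_factor(reference, 1.0, 0.0, 2000.0))
```

    Repeating the calculation over a range of window lengths would give the scale-dependent curve that is usually inspected for such data.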

  11. SVAMP: Sequence variation analysis, maps and phylogeny

    KAUST Repository

    Naeem, Raeece

    2014-04-03

    Summary: SVAMP is a stand-alone desktop application to visualize genomic variants (in variant call format) in the context of geographical metadata. Users of SVAMP are able to generate phylogenetic trees and perform principal coordinate analysis in real time from variant call format (VCF) and associated metadata files. Allele frequency map, geographical map of isolates, Tajima's D metric, single nucleotide polymorphism density, GC and variation density are also available for visualization in real time. We demonstrate the utility of SVAMP in tracking a methicillin-resistant Staphylococcus aureus outbreak from published next-generation sequencing data across 15 countries. We also demonstrate the scalability and accuracy of our software on 245 Plasmodium falciparum malaria isolates from three continents. Availability and implementation: The Qt/C++ software code, binaries, user manual and example datasets are available at http://cbrc.kaust.edu.sa/svamp. © The Author 2014.

  12. Pig genome sequence - analysis and publication strategy

    DEFF Research Database (Denmark)

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol

    2010-01-01

    BACKGROUND: The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. RESULTS: Assemblies...... of the BAC clone derived genome sequence have been annotated using the Pre-Ensembl and Ensembl automated pipelines and made accessible through the Pre-Ensembl/Ensembl browsers. The current annotated genome assembly (Sscrofa9) was released with Ensembl 56 in September 2009. A revised assembly (Sscrofa10......) is under construction and will incorporate whole genome shotgun sequence (WGS) data providing > 30x genome coverage. The WGS sequence, most of which comprise short Illumina/Solexa reads, were generated from DNA from the same single Duroc sow as the source of the BAC library from which clones were...

  13. End-sequence profiling : Sequence-based analysis of aberrant genomes

    NARCIS (Netherlands)

    Volik, S; Zhao, SY; Chin, K; Brebner, JH; Herndon, DR; Tao, QZ; Kowbel, D; Huang, GQ; Lapuk, A; Kuo, WL; Magrane, G; de Jong, P; Gray, JW; Collins, C

    2003-01-01

    Genome rearrangements are important in evolution, cancer, and other diseases. Precise mapping of the rearrangements is essential for identification of the involved genes, and many techniques have been developed for this purpose. We show here that end-sequence profiling (ESP) is particularly well

  14. Movement Pattern Analysis Based on Sequence Signatures

    Directory of Open Access Journals (Sweden)

    Seyed Hossein Chavoshi

    2015-09-01

    Increased affordability and deployment of advanced tracking technologies have led researchers from various domains to analyze the resulting spatio-temporal movement data sets for the purpose of knowledge discovery. Two different approaches can be considered in the analysis of moving objects: quantitative analysis and qualitative analysis. This research focuses on the latter and uses the qualitative trajectory calculus (QTC), a type of calculus that represents qualitative data on moving point objects (MPOs), and establishes a framework to analyze the relative movement of multiple MPOs. A visualization technique called sequence signature (SESI) is used, which maps QTC patterns in a 2D indexed rasterized space in order to evaluate the similarity of relative movement patterns of multiple MPOs. The applicability of the proposed methodology is illustrated by means of two practical examples of interacting MPOs: cars on a highway and body parts of a samba dancer. The results show that the proposed method can be effectively used to analyze interactions of multiple MPOs in different domains.

  15. Analysis of Neuronal Sequences Using Pairwise Biases

    Science.gov (United States)

    2015-08-27

    Traversal of a series of place fields results in the generation of a neuronal sequence called a place-field sequence, such as pictured in Figure 1.1. It... Sequences of neuronal activation have long been implicated in a variety of brain functions. In particular, these... sequences have been tied to memory formation and spatial navigation in the hippocampus, a region of mammalian brains. Traditionally, neuronal

  16. Clonal evolution in relapsed acute myeloid leukemia revealed by whole genome sequencing

    Science.gov (United States)

    Ding, Li; Ley, Timothy J.; Larson, David E.; Miller, Christopher A.; Koboldt, Daniel C.; Welch, John S.; Ritchey, Julie K.; Young, Margaret A.; Lamprecht, Tamara; McLellan, Michael D.; McMichael, Joshua F.; Wallis, John W.; Lu, Charles; Shen, Dong; Harris, Christopher C.; Dooling, David J.; Fulton, Robert S.; Fulton, Lucinda L.; Chen, Ken; Schmidt, Heather; Kalicki-Veizer, Joelle; Magrini, Vincent J.; Cook, Lisa; McGrath, Sean D.; Vickery, Tammi L.; Wendl, Michael C.; Heath, Sharon; Watson, Mark A.; Link, Daniel C.; Tomasson, Michael H.; Shannon, William D.; Payton, Jacqueline E.; Kulkarni, Shashikant; Westervelt, Peter; Walter, Matthew J.; Graubert, Timothy A.; Mardis, Elaine R.; Wilson, Richard K.; DiPersio, John F.

    2011-01-01

    Summary: Most patients with acute myeloid leukemia (AML) die from progressive disease after relapse, which is associated with clonal evolution at the cytogenetic level [1,2]. To determine the mutational spectrum associated with relapse, we sequenced the primary tumor and relapse genomes from 8 AML patients, and validated hundreds of somatic mutations using deep sequencing; this allowed us to precisely define clonality and clonal evolution patterns at relapse. Besides discovering novel, recurrently mutated genes (e.g. WAC, SMC3, DIS3, DDX41, and DAXX) in AML, we found two major clonal evolution patterns during AML relapse: 1) the founding clone in the primary tumor gained mutations and evolved into the relapse clone, or 2) a subclone of the founding clone survived initial therapy, gained additional mutations, and expanded at relapse. In all cases, chemotherapy failed to eradicate the founding clone. The comparison of relapse-specific vs. primary tumor mutations in all 8 cases revealed an increase in transversions, probably due to DNA damage caused by cytotoxic chemotherapy. These data demonstrate that AML relapse is associated with the addition of new mutations and clonal evolution, which is shaped in part by the chemotherapy that the patients receive to establish and maintain remissions. PMID:22237025

  17. Diversity, Distribution, and Evolution of Tomato Viruses in China Uncovered by Small RNA Sequencing.

    Science.gov (United States)

    Xu, Chenxi; Sun, Xuepeng; Taylor, Angela; Jiao, Chen; Xu, Yimin; Cai, Xiaofeng; Wang, Xiaoli; Ge, Chenhui; Pan, Guanghui; Wang, Quanxi; Fei, Zhangjun; Wang, Quanhua

    2017-06-01

    Tomato is a major vegetable crop that has tremendous popularity. However, viral disease is still a major factor limiting tomato production. Here, we report the tomato virome identified through sequencing small RNAs of 170 field-grown samples collected in China. A total of 22 viruses were identified, including both well-documented and newly detected viruses. The tomato viral community is dominated by a few species, and they exhibit polymorphisms and recombination in the genomes with cold spots and hot spots. Most samples were coinfected by multiple viruses, and the majority of identified viruses are positive-sense single-stranded RNA viruses. Evolutionary analysis of one of the most dominant tomato viruses, Tomato yellow leaf curl virus (TYLCV), predicts its origin and the time back to its most recent common ancestor. The broadly sampled data have enabled us to identify several unreported viruses in tomato, including a completely new virus, which has a genome of ∼13.4 kb and groups with aphid-transmitted viruses in the genus Cytorhabdovirus. Although both DNA and RNA viruses can trigger the biogenesis of virus-derived small interfering RNAs (vsiRNAs), we show that features such as length distribution, paired distance, and base selection bias of vsiRNA sequences reflect different plant Dicer-like proteins and Argonautes involved in vsiRNA biogenesis. Collectively, this study offers insights into host-virus interaction in tomato and provides valuable information to facilitate the management of viral diseases. IMPORTANCE: Tomato is an important source of micronutrients in the human diet and is extensively consumed around the world. Virus is among the major constraints on tomato production. Categorizing virus species that are capable of infecting tomato and understanding their diversity and evolution are challenging due to difficulties in detecting such fast-evolving biological entities. Here, we report the landscape of the tomato virome in China, the leading country in

  18. Mo-MuLV nucleotide sequence exhibits three levels of oligomeric repetitions, suggesting a stepwise molecular evolution.

    Science.gov (United States)

    Laprevotte, I

    1992-11-01

    An exhaustive computer-assisted analysis of the Moloney murine leukemia virus nucleotide sequence shows numerous deviations in the oligomeric distribution, suggesting three overlapping levels of a stepwise duplicative evolution. (1) The sequence fits the universal rule of TG/CT excess which has been proposed as the construction principle of all sequences, and maintains some degree of symmetry between the two complementary strands. (2) Oligomeric repeating units share a core consensus regularly scattered throughout the sequence. This consensus is not merely predictable from the doublet frequencies and codon usage, but could correspond to an intermediary stage in a so-called periodic-to-chaotic transition. (3) Probable stepwise local duplications could be accounted for by slippage-like mechanisms. Comparison with the human spumaretrovirus (HSRV) shows similar segments in the overrepresented oligomers of the two sequences. The intermediary stage of transition oligomeric repeating units is not so clearly suggested in HSRV, perhaps because of numerous stepwise local duplications. In any case, a common evolutionary origin for the two viruses is not ruled out.

  19. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome.

    Directory of Open Access Journals (Sweden)

    Byrappa Venkatesh

    2007-04-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4x coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element-like and long interspersed element-like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark genome contains four putative Hox clusters, indicating that, unlike teleost fish genomes, it has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

  20. Image sequence analysis workstation for multipoint motion analysis

    Science.gov (United States)

    Mostafavi, Hassan

    1990-08-01

    This paper describes an application-specific engineering workstation designed and developed to analyze motion of objects from video sequences. The system combines the software and hardware environment of a modern graphic-oriented workstation with digital image acquisition, processing and display techniques. In addition to automation and increase in throughput of data reduction tasks, the objective of the system is to provide less invasive methods of measurement by offering the ability to track objects that are more complex than reflective markers. Grey level image processing and spatial/temporal adaptation of the processing parameters are used for location and tracking of more complex features of objects under uncontrolled lighting and background conditions. The applications of such an automated and noninvasive measurement tool include analysis of the trajectory and attitude of rigid bodies such as human limbs, robots, aircraft in flight, etc. The system's key features are: 1) acquisition and storage of image sequences by digitizing and storing real-time video; 2) computer-controlled movie loop playback, freeze frame display, and digital image enhancement; 3) multiple leading edge tracking in addition to object centroids at up to 60 fields per second from either live input video or a stored image sequence; 4) model-based estimation and tracking of the six degrees of freedom of a rigid body; 5) field-of-view and spatial calibration; 6) image sequence and measurement data base management; and 7) offline analysis software for trajectory plotting and statistical analysis.

  1. Evolution of biological sequences implies an extreme value distribution of type I for both global and local pairwise alignment scores

    Directory of Open Access Journals (Sweden)

    Maréchal Eric

    2008-08-01

    Background: Confidence in pairwise alignments of biological sequences, obtained by various methods such as Blast or Smith-Waterman, is critical for automatic analyses of genomic data. Two statistical models have been proposed. In the asymptotic limit of long sequences, the Karlin-Altschul model is based on the computation of a P-value, assuming that the number of high scoring matching regions above a threshold is Poisson distributed. Alternatively, the Lipman-Pearson model is based on the computation of a Z-value from a random score distribution obtained by a Monte-Carlo simulation. Z-values allow the deduction of an upper bound of the P-value (1/Z-value^2) following the TULIP theorem. Simulations of the Z-value distribution are known to fit a Gumbel law. This remarkable property was not demonstrated and had no obvious biological support. Results: We built a model of evolution of sequences based on aging, as meant in Reliability Theory, using the fact that the amount of information shared between an initial sequence and the sequences in its lineage (i.e., mutual information in Information Theory) is a decreasing function of time. This quantity is simply measured by a sequence alignment score. In systems aging, the failure rate is related to the system's longevity. The system can be a machine with structured components, or a living entity or population. "Reliability" refers to the ability to operate properly according to a standard. Here, the "reliability" of a sequence refers to the ability to conserve a sufficient functional level at the folded and maturated protein level (positive selection pressure). Homologous sequences were considered as systems (1) having a high redundancy of information reflected by the magnitude of their alignment scores, and (2) whose components are the amino acids that can independently be damaged by random DNA mutations. From these assumptions, we deduced that information shared at each amino acid position evolved with a

  2. Evolution of biological sequences implies an extreme value distribution of type I for both global and local pairwise alignment scores.

    Science.gov (United States)

    Bastien, Olivier; Maréchal, Eric

    2008-08-07

    Confidence in pairwise alignments of biological sequences, obtained by various methods such as Blast or Smith-Waterman, is critical for automatic analyses of genomic data. Two statistical models have been proposed. In the asymptotic limit of long sequences, the Karlin-Altschul model is based on the computation of a P-value, assuming that the number of high scoring matching regions above a threshold is Poisson distributed. Alternatively, the Lipman-Pearson model is based on the computation of a Z-value from a random score distribution obtained by a Monte-Carlo simulation. Z-values allow the deduction of an upper bound of the P-value (1/Z-value^2) following the TULIP theorem. Simulations of the Z-value distribution are known to fit a Gumbel law. This remarkable property was not demonstrated and had no obvious biological support. We built a model of evolution of sequences based on aging, as meant in Reliability Theory, using the fact that the amount of information shared between an initial sequence and the sequences in its lineage (i.e., mutual information in Information Theory) is a decreasing function of time. This quantity is simply measured by a sequence alignment score. In systems aging, the failure rate is related to the system's longevity. The system can be a machine with structured components, or a living entity or population. "Reliability" refers to the ability to operate properly according to a standard. Here, the "reliability" of a sequence refers to the ability to conserve a sufficient functional level at the folded and maturated protein level (positive selection pressure). Homologous sequences were considered as systems (1) having a high redundancy of information reflected by the magnitude of their alignment scores, and (2) whose components are the amino acids that can independently be damaged by random DNA mutations. From these assumptions, we deduced that information shared at each amino acid position evolved with a constant rate, corresponding to the
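    The Z-value procedure summarized in this abstract can be sketched roughly as follows: score the real pair, score the first sequence against shuffled copies of the second, and standardize the observed score against that random distribution. This is a minimal sketch under stated assumptions, not the authors' implementation: the toy Smith-Waterman scoring (match +2, mismatch -1, linear gap -2), the number of shuffles, and all function names are hypothetical. The 1/Z^2 bound is the one quoted in the abstract.

```python
import random
import statistics


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (toy scoring)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def z_value(a, b, n_shuffles=200, seed=0):
    """Monte-Carlo Z-value: (observed - mean) / stdev over shuffles of b."""
    rng = random.Random(seed)
    observed = smith_waterman_score(a, b)
    letters = list(b)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # keeps composition, destroys order
        null_scores.append(smith_waterman_score(a, "".join(letters)))
    mu = statistics.mean(null_scores)
    sigma = statistics.stdev(null_scores)
    if sigma == 0:
        raise ValueError("shuffled scores have zero variance")
    return (observed - mu) / sigma


def p_value_upper_bound(z):
    """Upper bound 1/Z^2 quoted in the abstract, capped at 1."""
    return min(1.0, 1.0 / z ** 2) if z > 0 else 1.0


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # arbitrary example sequence
    z = z_value(seq, seq)
    print("Z =", z, "P <=", p_value_upper_bound(z))
```

    Shuffling preserves the amino acid composition of the second sequence, so the Z-value reflects residue order rather than composition alone.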

  3. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    2010-07-19

    Jul 19, 2010 ... Our sequences and sequences from Japan are grouped into the same cluster in the phylogenetic tree. Sequence comparison and phylogenetic analysis showed that our isolates have high homology with Japanese isolates. Key words: Hepatitis C virus, core, phylogenetic analysis, Pakistan.

  4. Retroviral oligonucleotide distributions correlate with biased nucleotide compositions of retrovirus sequences, suggesting a duplicative stepwise molecular evolution.

    Science.gov (United States)

    Laprevotte, I; Brouillet, S; Terzian, C; Hénaut, A

    1997-02-01

    A computer-assisted analysis was made of 24 complete nucleotide sequences selected from the vertebrate retroviruses to represent the ten viral groups. The conclusions of this analysis extend and strengthen the previously made hypothesis on the Moloney murine leukemia virus: The evolution of the nucleotide sequence appears to have occurred mainly through at least three overlapping levels of duplication: (1) The distributions of overrepresented (3-6)-mers are consistent with the universal rule of a trend toward TG/CT excess and with the persistence of a certain degree of symmetry between the two strands of DNA. This suggests one or several original tandemly repeated sequences and some inverted duplications. (2) The existence of two general core consensuses at the level of these (3-6)-mers supports the hypothesis of a common evolutionary origin of vertebrate retroviruses. Consensuses more specific to certain sequences are compatible with phylogenetic trees established independently. The consensuses could correspond to intermediary evolutionary stages. (3) Most of the (3-6)-mers with a significantly higher than average frequency appear to be internally repeated (with monomeric or oligomeric internal iterations) and seem to be at least partly the cause of the bias observed by other researchers at the level of retroviral nucleotide composition. They suggest a third evolutionary stage by slippage-like stepwise local duplications.
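    To make the notion of overrepresented (3-6)-mers concrete, the sketch below counts overlapping k-mers and flags those observed more often than expected from single-base composition alone. It is an illustrative assumption, not the statistical procedure of the study: the independence null model, the `min_ratio` threshold, and the function names are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Counts of all overlapping k-mers in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def overrepresented_kmers(seq, k, min_ratio=2.0):
    """Map k-mer -> observed/expected ratio for ratios >= min_ratio.

    The expected count assumes independent bases drawn with the
    sequence's own base frequencies (a deliberately simple null model).
    """
    base_freq = {base: n / len(seq) for base, n in Counter(seq).items()}
    n_positions = len(seq) - k + 1
    flagged = {}
    for kmer, observed in kmer_counts(seq, k).items():
        expected = float(n_positions)
        for base in kmer:
            expected *= base_freq[base]
        ratio = observed / expected
        if ratio >= min_ratio:
            flagged[kmer] = ratio
    return flagged


if __name__ == "__main__":
    toy = "TG" * 10 + "AACC" * 5  # toy sequence with a tandem TG repeat
    for k in range(3, 7):
        print(k, sorted(overrepresented_kmers(toy, k)))
```

    A null model based on doublet frequencies or codon usage, as mentioned in the related Mo-MuLV record, would be a stricter baseline than the single-base model used here.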

  5. Efficient computational methods for sequence analysis of small RNAs

    OpenAIRE

    Cozen, Gozde

    2007-01-01

    With the discovery of small regulatory RNAs, there has been a tremendous increase in the number of RNA sequencing projects. Meanwhile, novel high-throughput sequencing technologies, which can sequence as many as 500,000 small RNA sequences in one run, have emerged. The challenge of processing this rapidly growing data can be addressed by optimizing current analysis approaches for small RNA sequences. We present fast register-level methods for small RNA pairwise alignment and small RNA to genom...

  6. Population genetics and molecular evolution of DNA sequences in transposable elements. I. A simulation framework.

    Science.gov (United States)

    Kijima, T E; Innan, Hideki

    2013-11-01

    A population genetic simulation framework is developed to understand the behavior and molecular evolution of DNA sequences of transposable elements. Our model incorporates random transposition and excision of transposable element (TE) copies, two modes of selection against TEs, and degeneration of transpositional activity by point mutations. We first investigated the relationships between the behavior of the copy number of TEs and these parameters. Our results show that when selection is weak, the genome can maintain a relatively large number of TEs, but most of them are less active. In contrast, with strong selection, the genome can maintain only a limited number of TEs but the proportion of active copies is large. In such a case, there could be substantial fluctuations of the copy number over generations. We also explored how DNA sequences of TEs evolve through the simulations. In general, active copies form clusters around the original sequence, while less active copies have long branches specific to themselves, exhibiting a star-shaped phylogeny. It is demonstrated that the phylogeny of TE sequences could be informative to understand the dynamics of TE evolution.

  7. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  8. Sequences from first settlers reveal rapid evolution in Icelandic mtDNA pool.

    Science.gov (United States)

    Helgason, Agnar; Lalueza-Fox, Carles; Ghosh, Shyamali; Sigurðardóttir, Sigrún; Sampietro, Maria Lourdes; Gigli, Elena; Baker, Adam; Bertranpetit, Jaume; Arnadóttir, Lilja; Þorsteinsdóttir, Unnur; Stefánsson, Kári

    2009-01-01

    A major task in human genetics is to understand the nature of the evolutionary processes that have shaped the gene pools of contemporary populations. Ancient DNA studies have great potential to shed light on the evolution of populations because they provide the opportunity to sample from the same population at different points in time. Here, we show that a sample of mitochondrial DNA (mtDNA) control region sequences from 68 early medieval Icelandic skeletal remains is more closely related to sequences from contemporary inhabitants of Scotland, Ireland, and Scandinavia than to those from the modern Icelandic population. Due to a faster rate of genetic drift in the Icelandic mtDNA pool during the last 1,100 years, the sequences carried by the first settlers were better preserved in their ancestral gene pools than among their descendants in Iceland. These results demonstrate the inferential power gained in ancient DNA studies through the application of population genetics analyses to relatively large samples.

  9. The evolved slowly pulsating B star 18 Peg: A testbed for upper main sequence stellar evolution

    Science.gov (United States)

    Irrgang, A.; De Cat, P.; Tkachenko, A.; Deshpande, A.; Moehler, S.; Mugrauer, M.; Janousch, D.

    2017-09-01

    The bright B3 III giant star 18 Peg turns out to be a slowly pulsating B star in a long-period binary with a main-sequence star or a neutron star as companion. Given that it is one of the most evolved members of this class of massive pulsating stars, an accurate determination of the location of 18 Peg in the Hertzsprung-Russell (H-R) diagram would provide a lower limit on the width of the upper main sequence and hence would reveal information about the efficiency of convective overshooting. We explain why long-term space-based observations are needed and how BRITE could play a crucial role in gathering the ingredients required to test models of upper main sequence evolution.

  10. Genetic evolution of pancreatic cancer: lessons learnt from the pancreatic cancer genome sequencing project

    Science.gov (United States)

    Iacobuzio-Donahue, Christine A

    2012-01-01

    Pancreatic cancer is a disease caused by the accumulation of genetic alterations in specific genes. Elucidation of the human genome sequence, in conjunction with technical advances in the ability to perform whole exome sequencing, has provided new insight into the mutational spectra characteristic of this lethal tumour type. Most recently, exomic sequencing has been used to clarify the clonal evolution of pancreatic cancer as well as provide time estimates of pancreatic carcinogenesis, indicating that a long window of opportunity may exist for early detection of this disease while in the curative stage. Moving forward, these mutational analyses indicate potential targets for personalised diagnostic and therapeutic intervention as well as the optimal timing for intervention based on the natural history of pancreatic carcinogenesis and progression. PMID:21749982

  11. Sequence and expression variation in SUPPRESSOR of OVEREXPRESSION of CONSTANS 1 (SOC1): homeolog evolution in Indian Brassicas.

    Science.gov (United States)

    Sri, Tanu; Mayee, Pratiksha; Singh, Anandita

    2015-09-01

    Whole genome sequence analyses make it possible to unravel evolutionary consequences of the meso-triplication event in Brassicaceae (∼14-20 million years ago (MYA)), such as differential gene fractionation and diversification in homeologous sub-genomes. This study presents a simple gene-centric approach involving microsynteny and natural genetic variation analysis for understanding SUPPRESSOR of OVEREXPRESSION of CONSTANS 1 (SOC1) homeolog evolution in Brassica. Analysis of microsynteny in Brassica rapa homeologous regions containing SOC1 revealed differential gene fractionation correlating to the reported fractionation status of the sub-genomes of origin, viz. least fractionated (LF), moderately fractionated 1 (MF1) and most fractionated (MF2), respectively. Screening 18 cultivars of 6 Brassica species led to the identification of 8 genomic and 27 transcript variants of SOC1, including splice-forms. Co-occurrence of both interrupted and intronless SOC1 genes was detected in a few Brassica species. In silico analysis characterised Brassica SOC1 as a MADS, intervening, K-box, C-terminal (MIKC(C)) transcription factor, with highly conserved MADS and I domains relative to the K-box and C-terminal domains. Phylogenetic analyses and multiple sequence alignments depicting shared patterns of silent/non-silent mutations assigned Brassica SOC1 homologs into groups based on shared diploid base genome. In addition, a sub-genome structure in uncharacterised Brassica genomes was inferred. Expression analysis of putative MF2 and LF (Brassica diploid base genome A (AA)) sub-genome-specific SOC1 homeologs of Brassica juncea revealed near-identical expression patterns. However, the MF2-specific homeolog exhibited significantly higher expression, implying regulatory diversification. In conclusion, evidence for polyploidy-induced sequence and regulatory evolution in Brassica SOC1 is presented, wherein differential homeolog expression is implied in functional diversification.

  12. Los Alamos sequence analysis package for nucleic acids and proteins.

    OpenAIRE

    Kanehisa, M I

    1982-01-01

    An interactive system for computer analysis of nucleic acid and protein sequences has been developed for the Los Alamos DNA Sequence Database. It provides a convenient way to search or verify various sequence features, e.g., restriction enzyme sites, protein coding frames, and properties of coded proteins. Further, the comprehensive analysis package on a large-scale database can be used for comparative studies on sequence and structural homologies in order to find unnoted information stored i...

  13. EventThread: Visual Summarization and Stage Analysis of Event Sequence Data.

    Science.gov (United States)

    Guo, Shunan; Xu, Ke; Zhao, Rongwen; Gotz, David; Zha, Hongyuan; Cao, Nan

    2018-01-01

    Event sequence data such as electronic health records, a person's academic records, or car service records, are ordered series of events which have occurred over a period of time. Analyzing collections of event sequences can reveal common or semantically important sequential patterns. For example, event sequence analysis might reveal frequently used care plans for treating a disease, typical publishing patterns of professors, and the patterns of service that result in a well-maintained car. It is challenging, however, to visually explore large numbers of event sequences, or sequences with large numbers of event types. Existing methods focus on extracting explicitly matching patterns of events using statistical analysis to create stages of event progression over time. However, these methods fail to capture latent clusters of similar but not identical evolutions of event sequences. In this paper, we introduce a novel visualization system named EventThread which clusters event sequences into threads based on tensor analysis and visualizes the latent stage categories and evolution patterns by interactively grouping the threads by similarity into time-specific clusters. We demonstrate the effectiveness of EventThread through usage scenarios in three different application domains and via interviews with an expert user.

  14. Phylogenetic analysis of Burkholderia species by multilocus sequence analysis.

    Science.gov (United States)

    Estrada-de los Santos, Paulina; Vinuesa, Pablo; Martínez-Aguilar, Lourdes; Hirsch, Ann M; Caballero-Mellado, Jesús

    2013-07-01

    Burkholderia comprises more than 60 species of environmental, clinical, and agro-biotechnological relevance. Previous phylogenetic analyses of 16S rRNA, recA, gyrB, rpoB, and acdS gene sequences as well as genome sequence comparisons of different Burkholderia species have revealed two major species clusters. In this study, we undertook a multilocus sequence analysis of 77 type and reference strains of Burkholderia using atpD, gltB, lepA, and recA genes in combination with the 16S rRNA gene sequence and employed maximum likelihood and neighbor-joining criteria to test this further. The phylogenetic analysis revealed, with high supporting values, distinct lineages within the genus Burkholderia. The two large groups were named A and B, whereas the B. rhizoxinica/B. endofungorum and B. andropogonis groups consisted of two and one species, respectively. Group A encompasses several plant-associated and saprophytic bacterial species. Group B comprises the B. cepacia complex (opportunistic human pathogens), the B. pseudomallei subgroup, which includes both human and animal pathogens, and an assemblage of plant pathogenic species. The distinct lineages present in Burkholderia suggest that each group might represent a different genus. However, it will be necessary to analyze the full set of Burkholderia species and explore whether enough phenotypic features exist among the different clusters to propose that these groups should be considered separate genera.

  15. SVAMP: sequence variation analysis, maps and phylogeny.

    Science.gov (United States)

    Naeem, Raeece; Hidayah, Lailatul; Preston, Mark D; Clark, Taane G; Pain, Arnab

    2014-08-01

    SVAMP is a stand-alone desktop application to visualize genomic variants (in variant call format) in the context of geographical metadata. Users of SVAMP are able to generate phylogenetic trees and perform principal coordinate analysis in real time from variant call format (VCF) and associated metadata files. Allele frequency map, geographical map of isolates, Tajima's D metric, single nucleotide polymorphism density, GC and variation density are also available for visualization in real time. We demonstrate the utility of SVAMP in tracking a methicillin-resistant Staphylococcus aureus outbreak from published next-generation sequencing data across 15 countries. We also demonstrate the scalability and accuracy of our software on 245 Plasmodium falciparum malaria isolates from three continents. Availability: The Qt/C++ software code, binaries, user manual and example datasets are available at http://cbrc.kaust.edu.sa/svamp. Contact: arnab.pain@kaust.edu.sa or arnab.pain@cantab.net. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
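    Of the statistics listed above, Tajima's D is the one with a compact closed form, so a worked sketch may help readers unfamiliar with it. The code below follows the standard textbook definition (Tajima, 1989) applied to a small set of aligned sequences; it says nothing about how SVAMP computes the metric internally, and the function name is hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for aligned, equal-length sequences.

    Returns None when there are no segregating sites.
    """
    n = len(sequences)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of alignment columns with more than one allele.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        return None

    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    pi = sum(sum(x != y for x, y in zip(a, b))
             for a, b in combinations(sequences, 2)) / n_pairs

    # Constants from the standard variance derivation.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


if __name__ == "__main__":
    print(tajimas_d(["AAGT", "AAGT", "AAGT", "TAGT"]))  # one singleton site
    print(tajimas_d(["AAGT", "AAGT", "TAGT", "TAGT"]))  # one intermediate-frequency site
```

    A site carried by a single sequence pulls D below zero, whereas a site at intermediate frequency pushes it above zero, which is why the statistic is commonly read as a signal of excess rare or excess common variants.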

  16. Sequencing small RNA: introduction and data analysis fundamentals.

    Science.gov (United States)

    Mehta, Jai Prakash

    2014-01-01

    Small RNAs are important transcriptional regulators within cells. With the advent of powerful Next Generation Sequencing platforms, sequencing small RNAs seems to be an obvious choice to understand their expression and its downstream effect. Additionally, sequencing provides an opportunity to identify novel and polymorphic miRNA. However, the biggest challenge is the appropriate data analysis pipeline, which is still in a phase of active development by various academic groups. This chapter describes basic and advanced steps for small RNA sequencing analysis including quality control, small RNA alignment and quantification, differential expression analysis, novel small RNA identification, target prediction, and downstream analysis. We also provide a list of various resources for small RNA analysis.

  17. Novel algorithms for protein sequence analysis

    NARCIS (Netherlands)

    Ye, Kai

    2008-01-01

    Each protein is characterized by its unique sequential order of amino acids, the so-called protein sequence. Biology's paradigm is that this order of amino acids determines the protein's architecture and function. In this thesis, we introduce novel algorithms to analyze protein sequences. Chapter 1

  18. Project Report: Automatic Sequence Processor Software Analysis

    Science.gov (United States)

    Benjamin, Brandon

    2011-01-01

    The Mission Planning and Sequencing (MPS) element of Multi-Mission Ground System and Services (MGSS) provides space missions with multi-purpose software to plan spacecraft activities, sequence spacecraft commands, and then integrate these products and execute them on spacecraft. Jet Propulsion Laboratory (JPL) is currently flying many missions. The processes for building, integrating, and testing the multi-mission uplink software need to be improved to meet the needs of the missions and the operations teams that command the spacecraft. The Multi-Mission Sequencing Team is responsible for collecting and processing the observations, experiments and engineering activities that are to be performed on a selected spacecraft. The collection of these activities is called a sequence, and ultimately a sequence becomes a sequence of spacecraft commands. The operations teams check the sequence to make sure that no constraints are violated. The workflow process involves sending a program start command, which activates the Automatic Sequence Processor (ASP). The ASP is currently a file-based system composed of scripts written in Perl, C shell and awk. Once this start process is complete, the system checks for errors and aborts if there are any; otherwise the system converts the commands to binary, and then sends the resultant information to be radiated to the spacecraft.

  19. Phylogeny and character evolution of the coprinoid mushroom genus Parasola as inferred from LSU and ITS nr DNA sequence data

    NARCIS (Netherlands)

    Nagy, L.G.; Kocsubé, S.; Papp, T.; Vágvölgyi, C.

    2009-01-01

    Phylogenetic relationships, species concepts and morphological evolution of the coprinoid mushroom genus Parasola were studied. A combined dataset of nuclear ribosomal ITS and LSU sequences was used to infer phylogenetic relationships of Parasola species and several outgroup taxa. Clades recovered

  20. A Maximum Likelihood Method for Detecting Directional Evolution in Protein Sequences and Its Application to Influenza A Virus

    National Research Council Canada - National Science Library

    Kosakovsky Pond, Sergei L; Poon, Art F.Y; Leigh Brown, Andrew J; Frost, Simon D.W

    2008-01-01

    We develop a model-based phylogenetic maximum likelihood test for evidence of preferential substitution toward a given residue at individual positions of a protein alignment: directional evolution of protein sequences (DEPS...

  1. Chloroplast phylogenomic data from the green algal order Sphaeropleales (Chlorophyceae, Chlorophyta) reveal complex patterns of sequence evolution.

    Science.gov (United States)

    Fučíková, Karolina; Lewis, Paul O; Lewis, Louise A

    2016-05-01

    Chloroplast sequence data are widely used to infer phylogenies of plants and algae. With the increasing availability of complete chloroplast genome sequences, the opportunity arises to resolve ancient divergences that were heretofore problematic. On the flip side, properly analyzing large multi-gene data sets can be a major challenge, as these data may be riddled with systematic biases and conflicting signals. Our study contributes new data from nine complete and four fragmentary chloroplast genome sequences across the green algal order Sphaeropleales. Our phylogenetic analyses of a 56-gene data set show that analyzing these data on a nucleotide level yields a well-supported phylogeny - yet one that is quite different from a corresponding amino acid analysis. We offer some possible explanations for this conflict through a range of analyses of modified data sets. In addition, we characterize the newly sequenced genomes in terms of their structure and content, thereby further contributing to the knowledge of chloroplast genome evolution. Copyright © 2016 Elsevier Inc. All rights reserved.

  2. Sequencing papaya X and Yh chromosomes reveals molecular basis of incipient sex chromosome evolution.

    Science.gov (United States)

    Wang, Jianping; Na, Jong-Kuk; Yu, Qingyi; Gschwend, Andrea R; Han, Jennifer; Zeng, Fanchang; Aryal, Rishi; VanBuren, Robert; Murray, Jan E; Zhang, Wenli; Navajas-Pérez, Rafael; Feltus, F Alex; Lemke, Cornelia; Tong, Eric J; Chen, Cuixia; Wai, Ching Man; Singh, Ratnesh; Wang, Ming-Li; Min, Xiang Jia; Alam, Maqsudul; Charlesworth, Deborah; Moore, Paul H; Jiang, Jiming; Paterson, Andrew H; Ming, Ray

    2012-08-21

    Sex determination in papaya is controlled by a recently evolved XY chromosome pair, with two slightly different Y chromosomes controlling the development of males (Y) and hermaphrodites (Y(h)). To study the events of early sex chromosome evolution, we sequenced the hermaphrodite-specific region of the Y(h) chromosome (HSY) and its X counterpart, yielding an 8.1-megabase (Mb) HSY pseudomolecule, and a 3.5-Mb sequence for the corresponding X region. The HSY is larger than the X region, mostly due to retrotransposon insertions. The papaya HSY differs from the X region by two large-scale inversions, the first of which likely caused the recombination suppression between the X and Y(h) chromosomes, followed by numerous additional chromosomal rearrangements. Altogether, including the X and/or HSY regions, 124 transcription units were annotated, including 50 functional pairs present in both the X and HSY. Ten HSY genes had functional homologs elsewhere in the papaya autosomal regions, suggesting movement of genes onto the HSY, whereas the X region had none. Sequence divergence between 70 transcripts shared by the X and HSY revealed two evolutionary strata in the X chromosome, corresponding to the two inversions on the HSY, the older of which evolved about 7.0 million years ago. Gene content differences between the HSY and X are greatest in the older stratum, whereas the gene content and order of the collinear regions are identical. Our findings support theoretical models of early sex chromosome evolution.

  3. Sequencing the genome of Marssonina brunnea reveals fungus-poplar co-evolution

    Directory of Open Access Journals (Sweden)

    Zhu Sheng

    2012-08-01

    Background: The fungus Marssonina brunnea is a causal pathogen of Marssonina leaf spot that devastates poplar plantations by defoliating susceptible trees before normal fall leaf drop. Results: We sequence the genome of M. brunnea with a size of 52 Mb assembled into 89 scaffolds, representing the first sequenced Dermateaceae genome. By inoculating this fungus onto a poplar hybrid clone, we investigate how M. brunnea interacts and co-evolves with its host to colonize poplar leaves. While a handful of virulence genes in M. brunnea, mostly from the LysM family, are found to be up-regulated during infection, the poplar down-regulates its resistance genes, such as nucleotide binding site domains and leucine rich repeats, in response to infection. From 10,027 predicted proteins of M. brunnea in a comparison with those from poplar, we identify four poplar transferases that stimulate the host to resist M. brunnea. These transferase-encoding genes may have driven the co-evolution of M. brunnea and Populus during the process of infection and anti-infection. Conclusions: Our results from the draft sequence of the M. brunnea genome provide evidence for genome-genome interactions that play an important role in poplar-pathogen co-evolution. This knowledge could help to design effective strategies for controlling Marssonina leaf spot in poplar.

  4. The DNA sequence, annotation and analysis of human chromosome 3

    DEFF Research Database (Denmark)

    Muzny, Donna M; Scherer, Steven E; Kaul, Rajinder

    2006-01-01

    After the completion of a draft human genome sequence, the International Human Genome Sequencing Consortium has proceeded to finish and annotate each of the 24 chromosomes comprising the human genome. Here we describe the sequencing and analysis of human chromosome 3, one of the largest human chr...

  5. Sequence analysis of cereal sucrose synthase genes and isolation ...

    African Journals Online (AJOL)

    SERVER

    2007-10-18

    ... comparative analysis of grass genomes and as a source of beneficial genes for agriculture. Recent studies have shown that ... sequencing of sucrose synthase gene fragment from sorghum using primers designed at their ... Sequencing was carried out by Sanger dideoxy DNA sequencing method. Results.

  6. Sequencing and analysis of a genomic fragment provide an insight into the Dunaliella viridis genomic sequence.

    Science.gov (United States)

    Sun, Xiao-Ming; Tang, Yuan-Ping; Meng, Xiang-Zong; Zhang, Wen-Wen; Li, Shan; Deng, Zhi-Rui; Xu, Zheng-Kai; Song, Ren-Tao

    2006-11-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green algae. Its exceptional resistance to salt and various other stresses has made it an ideal model for stress tolerance studies. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large numbers of simple sequence repeats (microsatellites) were found, with a strong bias towards the (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (total 25,749 bp) was in agreement with these findings. These sequence features made it difficult to sequence Dunaliella genomic sequences. Further investigation should be made to reveal the biological significance of these unique sequence features.
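    GC content and microsatellite counts of the kind reported above are simple to compute from a raw sequence. The following is a minimal sketch, not the procedure used in the study: the function names, the motif length range of 1 to 3 nucleotides, and the minimum of five repeat units are all assumptions picked for illustration.

```python
import re


def gc_content(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / total if total else 0.0


def find_ssrs(seq, min_unit=1, max_unit=3, min_repeats=5):
    """Return (start, motif, repeat_count) for each simple sequence repeat.

    The lazy quantifier makes the regex try the shortest motif first, so
    AAAAAA is reported as (A)6 rather than (AA)3.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


if __name__ == "__main__":
    fragment = "GGTACACACACACTTGCGCAAAAAAGC"  # invented example sequence
    print("GC%% = %.1f" % gc_content(fragment))
    for start, motif, repeats in find_ssrs(fragment):
        print("(%s)%d at position %d" % (motif, repeats, start))
```

    Tallying the motifs returned by `find_ssrs` over a whole fragment would give the share of each repeat type, which is how a bias such as the (AC)n predominance mentioned in the abstract could be checked on other sequences.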

  7. Expression of cassini, a murine gamma-satellite sequence conserved in evolution, is regulated in normal and malignant hematopoietic cells.

    Science.gov (United States)

    Arutyunyan, Anna; Stoddart, Sonia; Yi, Sun-ju; Fei, Fei; Lim, Min; Groffen, Paula; Feldhahn, Niklas; Groffen, John; Heisterkamp, Nora

    2012-08-23

    Acute lymphoblastic leukemia (ALL) cells treated with drugs can become drug-tolerant if co-cultured with protective stromal mouse embryonic fibroblasts (MEFs). We performed transcriptional profiling on these stromal fibroblasts to investigate if they were affected by the presence of drug-treated ALL cells. These mitotically inactivated MEFs showed few changes in gene expression, but a family of sequences of which transcription is significantly increased was identified. A sequence related to this family, which we named cassini, was selected for further characterization. We found that cassini was highly upregulated in drug-treated ALL cells. Analysis of RNAs from different normal mouse tissues showed that cassini expression is highest in spleen and thymus, and can be further enhanced in these organs by exposure of mice to bacterial endotoxin. Heat shock, but not other types of stress, significantly induced the transcription of this locus in ALL cells. Transient overexpression of cassini in human 293 embryonic kidney cells did not increase the cytotoxic or cytostatic effects of chemotherapeutic drugs but provided some protection. Database searches revealed that sequences highly homologous to cassini are present in rodents, apicomplexans, flatworms and primates, indicating that they are conserved in evolution. Moreover, CASSINI RNA was induced in human ALL cells treated with vincristine. Surprisingly, cassini belongs to the previously reported murine family of γ-satellite/major satellite DNA sequences, which were not known to be present in other species. Our results show that the transcription of at least one member of these sequences is regulated, suggesting that this has a function in normal and transformed immune cells. Expression of these sequences may protect cells when they are exposed to specific stress stimuli.

  8. Expression of cassini, a murine gamma-satellite sequence conserved in evolution, is regulated in normal and malignant hematopoietic cells

    Directory of Open Access Journals (Sweden)

    Arutyunyan Anna

    2012-08-01

    Background: Acute lymphoblastic leukemia (ALL) cells treated with drugs can become drug-tolerant if co-cultured with protective stromal mouse embryonic fibroblasts (MEFs). Results: We performed transcriptional profiling on these stromal fibroblasts to investigate if they were affected by the presence of drug-treated ALL cells. These mitotically inactivated MEFs showed few changes in gene expression, but a family of sequences of which transcription is significantly increased was identified. A sequence related to this family, which we named cassini, was selected for further characterization. We found that cassini was highly upregulated in drug-treated ALL cells. Analysis of RNAs from different normal mouse tissues showed that cassini expression is highest in spleen and thymus, and can be further enhanced in these organs by exposure of mice to bacterial endotoxin. Heat shock, but not other types of stress, significantly induced the transcription of this locus in ALL cells. Transient overexpression of cassini in human 293 embryonic kidney cells did not increase the cytotoxic or cytostatic effects of chemotherapeutic drugs but provided some protection. Database searches revealed that sequences highly homologous to cassini are present in rodents, apicomplexans, flatworms and primates, indicating that they are conserved in evolution. Moreover, CASSINI RNA was induced in human ALL cells treated with vincristine. Surprisingly, cassini belongs to the previously reported murine family of γ-satellite/major satellite DNA sequences, which were not known to be present in other species. Conclusions: Our results show that the transcription of at least one member of these sequences is regulated, suggesting that this has a function in normal and transformed immune cells. Expression of these sequences may protect cells when they are exposed to specific stress stimuli.

  9. Dissecting the roles of local packing density and longer-range effects in protein sequence evolution

    CERN Document Server

    Shahmoradi, Amir

    2015-01-01

    What are the structural determinants of protein sequence evolution? A number of site-specific structural characteristics have been proposed, most of which are broadly related to either the density of contacts or the solvent accessibility of individual residues. Most importantly, there has been disagreement in the literature over the relative importance of solvent accessibility and local packing density for explaining site-specific sequence variability in proteins. We show here that this discussion has been confounded by the definition of local packing density. The most commonly used measures of local packing, such as the contact number and the weighted contact number, represent by definition the combined effects of local packing density and longer-range effects. As an alternative, we here propose a truly local measure of packing density around a single residue, based on the Voronoi cell volume. We show that the Voronoi cell volume, when calculated relative to the geometric center of amino-acid side chains, be...
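    The point that contact-based measures mix local and longer-range effects is easier to see with the two measures written out. The sketch below computes a cutoff-based contact number and an inverse-square-distance weighted contact number from one representative coordinate per residue. The function names, the 13 Å default cutoff, and the toy coordinates are assumptions for illustration; the Voronoi cell volume proposed in the abstract is not implemented here.

```python
from math import dist


def contact_number(coords, cutoff=13.0):
    """For each residue, count the other residues within `cutoff` (same units as coords)."""
    return [
        sum(1 for j, q in enumerate(coords) if j != i and dist(p, q) <= cutoff)
        for i, p in enumerate(coords)
    ]


def weighted_contact_number(coords):
    """For each residue, sum 1 / r^2 over all other residues.

    Every residue contributes, however distant, so the value reflects
    longer-range structure as well as immediate neighbours.
    """
    return [
        sum(1.0 / dist(p, q) ** 2 for j, q in enumerate(coords) if j != i)
        for i, p in enumerate(coords)
    ]


if __name__ == "__main__":
    # Invented coordinates: three residues in a row plus one far away.
    toy = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (40.0, 0.0, 0.0)]
    print(contact_number(toy))
    print(weighted_contact_number(toy))
```

    In the toy example the distant fourth residue never enters the cutoff-based count but still adds a small term to every weighted contact number, which illustrates why such measures are not purely local.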

  10. Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes.

    Science.gov (United States)

    Sun, Yan-Bo; Xiong, Zi-Jun; Xiang, Xue-Yan; Liu, Shi-Ping; Zhou, Wei-Wei; Tu, Xiao-Long; Zhong, Li; Wang, Lu; Wu, Dong-Dong; Zhang, Bao-Lin; Zhu, Chun-Ling; Yang, Min-Min; Chen, Hong-Man; Li, Fang; Zhou, Long; Feng, Shao-Hong; Huang, Chao; Zhang, Guo-Jie; Irwin, David; Hillis, David M; Murphy, Robert W; Yang, Huan-Ming; Che, Jing; Wang, Jun; Zhang, Ya-Ping

    2015-03-17

    The development of efficient sequencing techniques has resulted in large numbers of genomes being available for evolutionary studies. However, only one genome is available for all amphibians, that of Xenopus tropicalis, which is distantly related to the majority of frogs. More than 96% of frogs belong to the Neobatrachia, and no genome exists for this group. This dearth of amphibian genomes greatly restricts genomic studies of amphibians and, more generally, our understanding of tetrapod genome evolution. To fill this gap, we provide the de novo genome of a Tibetan Plateau frog, Nanorana parkeri, and compare it to that of X. tropicalis and other vertebrates. This genome contains more than 20,000 protein-coding genes, a number similar to that of Xenopus. Although the genome size of Nanorana is considerably larger than that of Xenopus (2.3 vs. 1.5 Gb), most of the difference is due to the respective number of transposable elements in the two genomes. The two frogs exhibit considerable conserved whole-genome synteny despite having diverged approximately 266 Ma, indicating a slow rate of DNA structural evolution in anurans. Multigenome synteny blocks further show that amphibians have fewer interchromosomal rearrangements than mammals but have a comparable rate of intrachromosomal rearrangements. Our analysis also identifies 11 Mb of anuran-specific highly conserved elements that will be useful for comparative genomic analyses of frogs. The Nanorana genome offers an improved understanding of evolution of tetrapod genomes and also provides a genomic reference for other evolutionary studies.

  11. Automated tools for comparative sequence analysis of genic regions using the GenePalette application.

    Science.gov (United States)

    Smith, Andrew F; Posakony, James W; Rebeiz, Mark

    2017-09-01

    Comparative sequence analysis methods, such as phylogenetic footprinting, represent one of the most effective ways to decode regulatory sequence functions based upon DNA sequence information alone. The laborious task of assembling orthologous sequences to perform these comparisons is a hurdle to these analyses, which is further aggravated by the relative paucity of tools for visualization of sequence comparisons in large genic regions. Here, we describe a second-generation implementation of the GenePalette DNA sequence analysis software to facilitate comparative studies of gene function and regulation. We have developed an automated module called OrthologGrabber (OG) that performs BLAT searches against the UC Santa Cruz genome database to identify and retrieve segments homologous to a region of interest. Upon acquisition, sequences are compared to identify high-confidence anchor-points, which are graphically displayed. The visualization of anchor-points alongside other DNA features, such as transcription factor binding sites, allows users to precisely examine whether a binding site of interest is conserved, even if the surrounding region exhibits poor sequence identity. This approach also aids in identifying orthologous segments of regulatory DNA, facilitating studies of regulatory sequence evolution. As with previous versions of the software, GenePalette 2.1 takes the form of a platform-independent, single-windowed interface that is simple to use. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. The Subclonal Structure and Genomic Evolution of Oral Squamous Cell Carcinoma Revealed by Ultra-deep Sequencing

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    complex subclonal architectures comprising distinct subclones only found in geographically distinct regions of the tumors. The metastatic potential of the tumor is acquired early in the tumor evolution, as indicated by the lymph node sharing the majority of the mutations with the tumor biopsies, while...... rarely acquiring novel mutations that are specific for the metastasis. Conclusion: Ultra-deep sequencing of multiple biopsies from OSCC and metastasis enables detection of subclonal structure and genomic evolution. The metastatic potential of OSCC is acquired early in the tumor evolution, and our results...... structure remains unexplored due to lack of sampling multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with targeted ultra-deep targeted sequencing on biopsies from 5stage IV...

  13. Nearly Complete 28S rRNA Gene Sequences Confirm New Hypotheses of Sponge Evolution

    Science.gov (United States)

    Thacker, Robert W.; Hill, April L.; Hill, Malcolm S.; Redmond, Niamh E.; Collins, Allen G.; Morrow, Christine C.; Spicer, Lori; Carmack, Cheryl A.; Zappe, Megan E.; Pohlmann, Deborah; Hall, Chelsea; Diaz, Maria C.; Bangalore, Purushotham V.

    2013-01-01

    The highly collaborative research sponsored by the NSF-funded Assembling the Porifera Tree of Life (PorToL) project is providing insights into some of the most difficult questions in metazoan systematics. Our understanding of phylogenetic relationships within the phylum Porifera has changed considerably with increased taxon sampling and data from additional molecular markers. PorToL researchers have falsified earlier phylogenetic hypotheses, discovered novel phylogenetic alliances, found phylogenetic homes for enigmatic taxa, and provided a more precise understanding of the evolution of skeletal features, secondary metabolites, body organization, and symbioses. Some of these exciting new discoveries are shared in the papers that form this issue of Integrative and Comparative Biology. Our analyses of over 300 nearly complete 28S ribosomal subunit gene sequences provide specific case studies that illustrate how our dataset confirms new hypotheses of sponge evolution. We recovered monophyletic clades for all 4 classes of sponges, as well as the 4 major clades of Demospongiae (Keratosa, Myxospongiae, Haploscleromorpha, and Heteroscleromorpha), but our phylogeny differs in several aspects from traditional classifications. In most major clades of sponges, families within orders appear to be paraphyletic. Although additional sampling of genes and taxa are needed to establish whether this pattern results from a lack of phylogenetic resolution or from a paraphyletic classification system, many of our results are congruent with those obtained from 18S ribosomal subunit gene sequences and complete mitochondrial genomes. These data provide further support for a revision of the traditional classification of sponges. PMID:23748742

  14. Phylogeny, character evolution, and biogeography of Cuscuta (dodders; Convolvulaceae) inferred from coding plastid and nuclear sequences.

    Science.gov (United States)

    García, Miguel A; Costea, Mihai; Kuzmina, Maria; Stefanović, Saša

    2014-04-01

    The parasitic genus Cuscuta, containing some 200 species circumscribed traditionally in three subgenera, is nearly cosmopolitan, occurring in a wide range of habitats and hosts. Previous molecular studies, on subgenera Grammica and Cuscuta, delimited major clades within these groups. However, the sequences used were unalignable among subgenera, preventing the phylogenetic comparison across the genus. We conducted a broad phylogenetic study using rbcL and nrLSU sequences covering the morphological, physiological, and geographical diversity of Cuscuta. We used parsimony methods to reconstruct ancestral states for taxonomically important characters. Biogeographical inferences were obtained using statistical and Bayesian approaches. Four well-supported major clades are resolved. Two of them correspond to subgenera Monogynella and Grammica. Subgenus Cuscuta is paraphyletic, with section Pachystigma sister to subgenus Grammica. Previously described cases of strongly supported discordance between plastid and nuclear phylogenies, interpreted as reticulation events, are confirmed here and three new cases are detected. Dehiscent fruits and globose stigmas are inferred as ancestral character states, whereas the ancestral style number is ambiguous. Biogeographical reconstructions suggest an Old World origin for the genus and subsequent spread to the Americas as a consequence of one long-distance dispersal. Hybridization may play an important yet underestimated role in the evolution of Cuscuta. Our results disagree with scenarios of evolution (polarity) previously proposed for several taxonomically important morphological characters, and with their usage and significance. While several cases of long-distance dispersal are inferred, vicariance or dispersal to adjacent areas emerges as the dominant biogeographical pattern.

  15. Deep sequencing identifies genetic heterogeneity and recurrent convergent evolution in chronic lymphocytic leukemia.

    Science.gov (United States)

    Ojha, Juhi; Ayres, Jackline; Secreto, Charla; Tschumper, Renee; Rabe, Kari; Van Dyke, Daniel; Slager, Susan; Shanafelt, Tait; Fonseca, Rafael; Kay, Neil E; Braggio, Esteban

    2015-01-15

    Recent high-throughput sequencing and microarray studies have characterized the genetic landscape and clonal complexity of chronic lymphocytic leukemia (CLL). Here, we performed a longitudinal study in a homogeneously treated cohort of 12 patients, with sequential samples obtained at comparable stages of disease. We identified clonal competition between 2 or more genetic subclones in 70% of the patients with relapse, and stable clonal dynamics in the remaining 30%. By deep sequencing, we identified a high reservoir of genetic heterogeneity in the form of several driver genes mutated in small subclones underlying the disease course. Furthermore, in 2 patients, we identified convergent evolution, characterized by the combination of genetic lesions affecting the same genes or copy number abnormality in different subclones. The phenomenon affects multiple CLL putative driver abnormalities, including mutations in NOTCH1, SF3B1, DDX3X, and del(11q23). This is the first report documenting convergent evolution as a recurrent event in the CLL genome. Furthermore, this finding suggests the selective advantage of specific combinations of genetic lesions for CLL pathogenesis in a subset of patients. © 2015 by The American Society of Hematology.

  16. Impacts of WIMP dark matter upon stellar evolution: main-sequence stars

    CERN Document Server

    Scott, Pat; Edsjo, Joakim

    2008-01-01

    The presence of large amounts of WIMP dark matter in stellar cores has been shown to have significant effects upon models of stellar evolution. We present a series of detailed grids of WIMP-influenced stellar models for main sequence stars, computed using the DarkStars code. We describe the changes in stellar structure and main sequence evolution which occur for masses ranging from 0.3 to 2.0 solar masses and metallicities from Z = 0.0003-0.02, as a function of the rate of energy injection by WIMPs. We then go on to show what rates of energy injection can be obtained using realistic orbital parameters for stars near supermassive black holes, including detailed considerations of dark matter halo velocity and density profiles. Capture and annihilation rates are strongly boosted when stars follow elliptical rather than circular orbits, causing WIMP annihilation to provide up to 100 times the energy of hydrogen fusion in stars at the Galactic centre.

  17. Sperm competition shapes gene expression and sequence evolution in the ocellated wrasse.

    Science.gov (United States)

    Dean, Rebecca; Wright, Alison E; Marsh-Rollo, Susan E; Nugent, Bridget M; Alonzo, Suzanne H; Mank, Judith E

    2017-01-01

    Gene expression differences between males and females often underlie sexually dimorphic phenotypes, and the expression levels of genes that are differentially expressed between the sexes are thought to respond to sexual selection. Most studies on the transcriptomic response to sexual selection treat sexual selection as a single force, but postmating sexual selection in particular is expected to specifically target gonadal tissue. The three male morphs of the ocellated wrasse (Symphodus ocellatus) make it possible to test the role of postmating sexual selection in shaping the gonadal transcriptome. Nesting males hold territories and have the highest reproductive success, yet we detected feminization of their gonadal gene expression compared to satellite males. Satellite males are less brightly coloured and experience more intense sperm competition than nesting males. In line with postmating sexual selection affecting gonadal gene expression, we detected a more masculinized expression profile in satellites. Sneakers are the lowest quality males and showed both de-masculinization and de-feminization of gene expression. We also detected higher rates of gene sequence evolution of male-biased genes compared to unbiased genes, which could at least in part be explained by positive selection. Together, these results reveal the potential for postmating sexual selection to drive higher rates of gene sequence evolution and shape the gonadal transcriptome profile. © 2016 John Wiley & Sons Ltd.

  18. [Tabular excel editor for analysis of aligned nucleotide sequences].

    Science.gov (United States)

    Demkin, V V

    2010-01-01

    The Excel platform was used to convert multiple nucleotide sequence alignments obtained from the BLAST network service into a form suitable for visual analysis and editing. Two macros for MS Excel 2007 were written. An array of aligned sequences converted into an Excel table and processed with these macros is more convenient for analysis than the initial HTML data.

  19. Molecular cloning, sequence analysis and tissue expression of ...

    African Journals Online (AJOL)

    Proofreader

    2017-10-01

    Oct 1, 2017 ... Molecular cloning, sequence analysis and tissue expression of bovine imprinted. ASCL2 gene. O. Bamidele1 ... The objectives of this study were to perform in silico analysis of the genomic messenger RNA (mRNA), and protein sequences ...... Non-linear dynamics of nonsynonymous (dN) and synonymous ...

  20. Stratigraphic sequence analysis of the Antler foreland

    Energy Technology Data Exchange (ETDEWEB)

    Silberling, N.J.; Nichols, K.M.; Macke, D.L. (Geological Survey, Denver, CO (United States))

    1993-04-01

    Mid-Upper Devonian to Upper Mississippian strata in western Utah were deposited in the distal Antler foreland. They record lateral and vertical changes in depositional environments that define five successive stratigraphic sequences, each representing a third-order transgressive-regressive cycle. In ascending order, these sequences are informally named the Langenheim (LA) of late Frasnian to mid-Famennian age, the Gutschick (GU) of late Famennian to early Kinderhookian age, the Morris (MO) of late Kinderhookian age, the Sadlick (SA) of Osagean to early Meramecian age, and the Maughan (MA) of mid-Meramecian to Chesterian age. MO is widespread and recognized within carbonate rocks of the Fitchville Formation and Joana Limestone. SA formed in concert with and to the east and south of the Wendover foreland high; the Delle phosphatic event marks maximum marine flooding during SA deposition. The transgressive systems tract of MA includes rhythmic-bedded limestone in the upper part of the Deseret Limestone in west-central Utah and, farther west, the hypoxic limestone and black shale of the Skunk Spring Limestone Bed and part of the overlying Chainman Shale. Traced westward into Nevada, MA first oversteps SA and then MO. Lithostratigraphic correlation of these sequences still farther west into the Eureka thrust belt (ETB) could mean that the youngest strata truncated by the Roberts Mountains thrust belong to the MA and that this thrust is simply part of the post-Mississippian ETB. However, some strata in central Nevada that lithically resemble those of the MA are paleontologically dated as Early Mississippian, the age of sequences overstepped by MA not far to the east. Thus, at least some imbricates of the ETB may contain a sequence stratigraphy which reflects local tectonic control.

  1. Comparative molecular phylogeny and evolution of sex chromosome DNA sequences in the family Canidae (Mammalia: Carnivora).

    Science.gov (United States)

    Tsubouchi, Ayako; Fukui, Daisuke; Ueda, Miya; Tada, Kazumi; Toyoshima, Shouji; Takami, Kazutoshi; Tsujimoto, Tsunenori; Uraguchi, Kohji; Raichev, Evgeniy; Kaneko, Yayoi; Tsunoda, Hiroshi; Masuda, Ryuichi

    2012-03-01

    To investigate the molecular phylogeny and evolution of the family Canidae, nucleotide sequences of the zinc-finger-protein gene on the Y chromosome (ZFY, 924-1146 bp) and its homologous gene on the X chromosome (ZFX, 834-839 bp) for twelve canid species were determined. The phylogenetic relationships among species reconstructed by the paternal ZFY sequences closely agreed with those by mtDNA and autosomal DNA trees in previous reports, and strongly supported the phylogenetic affinity between the wolf-like canids clade and the South American canids clade. However, the branching order of some species differed between phylogenies of ZFY and ZFX genes: Cuon alpinus and Canis mesomelas were included in the wolf-like canid clades in the ZFY tree, whereas both species were clustered in a group of Chrysocyon brachyurus and Speothos venaticus in the ZFX tree. The topology difference between ZFY and ZFX trees may have resulted from the twofold higher substitution rate of the former than the latter, which was clarified in the present study. In addition, two types of transposable element sequence (SINE-I and SINE-II) were found to occur in the ZFY final intron of the twelve canid species examined. Because the SINE-I sequences were shared by all the species, they may have been inserted into the ZFY of the common ancestor before species radiation in Canidae. By contrast, SINE-II, found only in Canis aureus, could have been inserted into ZFY independently after the speciation. The molecular diversity of SINE sequences of Canidae reflects the evolutionary history of the species radiation.

  2. Sequence analysis of Schmallenberg virus genomes detected in Hungary.

    Science.gov (United States)

    Fehér, Enikő; Marton, Szilvia; Tóth, Ádám György; Ursu, Krisztina; Wernike, Kerstin; Beer, Martin; Dán, Ádám; Bányai, Krisztián

    2017-12-01

    Since its emergence near the German-Dutch border in 2011, Schmallenberg virus (SBV) has been identified in many European countries. In this study, we determined the complete coding sequence of seven Hungarian SBV genomes to expand our knowledge about the genetic diversity of circulating field strains. The samples originated from the first case, an aborted cattle fetus without malformation collected in 2012, and from the blood samples of six adult cattle in 2014. The Hungarian SBV sequences shared ≥99.3% nucleotide (nt) and ≥97.8% amino acid (aa) identity with each other, and ≥98.9% nt and ≥96.7% aa identity with reference strains. Although phylogenetic analyses showed low resolution in general, the M sequences of cattle and sheep origin SBV strains seemed to cluster on different branches. Both common and unique mutation sites were observed in different groups of sequences that might help in understanding the evolution of emerging SBV strains.

  3. Analysis of Expressed Sequence Tags (EST) in Date Palm.

    Science.gov (United States)

    Al-Faifi, Sulieman A; Migdadi, Hussein M; Algamdi, Salem S; Khan, Mohammad Altaf; Al-Obeed, Rashid S; Ammar, Megahed H; Jakse, Jerenj

    2017-01-01

    Expressed sequence tags (EST) were generated from a normalized cDNA library of the date palm Sukkari cv. to understand the high-quality and better field performance of this well-known commercial cultivar. A total of 6943 high-quality ESTs were generated, out of them 6671 are submitted to the GenBank dbEST (LIBEST_028537). The generated ESTs were assembled into 6362 unigenes, consisting of 494 (14.4%) contigs and 5868 (84.53%) singletons. The functional annotation shows that the majority of the ESTs are associated with binding (44%), catalytic (40%), transporter (5%), and structural molecular (5%) activities. The blastx results show that 73% of unigenes are significantly similar to known plant genes and 27% are novel. The latter could be of particular interest in date palm genetic studies. Further analysis shows that some ESTs are categorized as stress/defense- and fruit development-related genes. These newly generated ESTs could significantly enhance date palm EST databases in the public domain and are available to scientists and researchers across the globe. This knowledge will facilitate the discovery of candidate genes that govern important developmental and agronomical traits in date palm. It will provide important resources for developing genetic tools, comparative genomics, and genome evolution among date palm cultivars.

  4. Sequence analysis of 17 NRXN1 deletions

    DEFF Research Database (Denmark)

    Hoeffding, Louise Kristine Enggaard; Hansen, Thomas; Ingason, Andrés

    2014-01-01

    into the molecular mechanisms governing such genomic rearrangements may increase our understanding of disease pathology and evolutionary processes. Here we analyse 17 carriers of non-recurrent deletions in the NRXN1 gene, which have been associated with neurodevelopmental disorders, e.g. schizophrenia, autism......Genome instability plays fundamental roles in human evolution and phenotypic variation within our population. This instability leads to genomic rearrangements that are involved in a wide variety of human disorders, including congenital and neurodevelopmental disorders, and cancers. Insight...

  5. Genome DNA Sequence Variation, Evolution, and Function in Bacteria and Archaea.

    Science.gov (United States)

    Nishida, Hiromi

    2013-01-01

    Comparative genomics has revealed that variations in bacterial and archaeal genome DNA sequences cannot be explained by only neutral mutations. Virus resistance and plasmid distribution systems have resulted in changes in bacterial and archaeal genome sequences during evolution. The restriction-modification system, a virus resistance system, leads to avoidance of palindromic DNA sequences in genomes. Clustered, regularly interspaced, short palindromic repeats (CRISPRs) found in genomes represent yet another virus resistance system. Comparative genomics has shown that bacteria and archaea have failed to gain any DNA with GC content higher than the GC content of their chromosomes. Thus, horizontally transferred DNA regions have lower GC content than the host chromosomal DNA does. Some nucleoid-associated proteins bind DNA regions with low GC content and inhibit the expression of genes contained in those regions. This form of gene repression is another type of virus resistance system. On the other hand, bacteria and archaea have used plasmids to gain additional genes. Virus resistance systems influence plasmid distribution. Interestingly, the restriction-modification system and nucleoid-associated protein genes have been distributed via plasmids. Thus, GC content and genomic signatures do not reflect bacterial and archaeal evolutionary relationships.

  6. Sequences from first settlers reveal rapid evolution in Icelandic mtDNA pool.

    Directory of Open Access Journals (Sweden)

    Agnar Helgason

    2009-01-01

    A major task in human genetics is to understand the nature of the evolutionary processes that have shaped the gene pools of contemporary populations. Ancient DNA studies have great potential to shed light on the evolution of populations because they provide the opportunity to sample from the same population at different points in time. Here, we show that a sample of mitochondrial DNA (mtDNA) control region sequences from 68 early medieval Icelandic skeletal remains is more closely related to sequences from contemporary inhabitants of Scotland, Ireland, and Scandinavia than to those from the modern Icelandic population. Due to a faster rate of genetic drift in the Icelandic mtDNA pool during the last 1,100 years, the sequences carried by the first settlers were better preserved in their ancestral gene pools than among their descendants in Iceland. These results demonstrate the inferential power gained in ancient DNA studies through the application of population genetics analyses to relatively large samples.

  7. Tracing Star Formation and Molecular Cloud Evolution with Pre-main Sequence Stars in the SMC

    Science.gov (United States)

    Johnson, L. Clifton; SMIDGE Team

    2018-01-01

    The Southwest Bar region in the Small Magellanic Cloud (SMC) contains star-forming molecular clouds sampling a wide range of evolutionary states: from quiescent pre-star-forming regions to evolved HII region hosts. We use deep, panchromatic, high spatial resolution Hubble Space Telescope imaging obtained as part of the SMIDGE survey (PI: K. Sandstrom) to identify young, pre-main sequence stars that trace recent and ongoing star formation within these clouds. We characterize a color-selected sample (and Hα line-emitting subsample) of pre-main sequence stars via SED fitting and analyze their association with the local ISM, inferred from observations of CO and dust emission. These low-mass stars serve as robust star formation tracers not tied to massive stars (e.g., Hα-based star formation rate estimates) in SMC star-forming regions, where low dust-to-gas ratios allow optical detections even in gas-rich embedded regions. We demonstrate pre-main sequence stars' ability to trace molecular cloud evolution within the Southwest Bar and across the SMC, and discuss future synergies between optical Hubble Space Telescope observations and near/mid-IR James Webb Space Telescope observations.

  8. Evolution, homology conservation, and identification of unique sequence signatures in GH19 family chitinases.

    Science.gov (United States)

    Udaya Prakash, N A; Jayanthi, M; Sabarinathan, R; Kangueane, P; Mathew, Lazar; Sekar, K

    2010-05-01

    The discovery of GH (Glycoside Hydrolase) 19 chitinases in Streptomyces sp. raises the possibility of the presence of these proteins in other bacterial species, since they were initially thought to be confined to higher plants. The present study mainly concentrates on the phylogenetic distribution and homology conservation in GH19 family chitinases. Extensive database searches are performed to identify the presence of GH19 family chitinases in the three major superkingdoms of life. Multiple sequence alignment of all the identified GH19 chitinase family members resulted in the identification of globally conserved residues. We further identified conserved sequence motifs across the major subgroups within the family. Evolutionary distances between the various bacterial and plant chitinases are estimated to better understand the pattern of evolution. Our study also supports the horizontal gene transfer theory, which states that GH19 chitinase genes were transferred from higher plants to bacteria. Further, the present study sheds light on the phylogenetic distribution and identifies unique sequence signatures that define the GH19 chitinase family of proteins. The identified motifs could be used as markers to delineate uncharacterized GH19 family chitinases. The estimated evolutionary distances between chitinases identified in plants and bacteria show that the chitinases of flowering plants are more closely related to those of actinobacteria than to those identified in purple bacteria. We propose a model to elucidate the natural history of GH19 family chitinases.

  9. Novel technologies applied to the nucleotide sequencing and comparative sequence analysis of the genomes of infectious agents in veterinary medicine.

    Science.gov (United States)

    Granberg, F; Bálint, Á; Belák, S

    2016-04-01

    Next-generation sequencing (NGS), also referred to as deep, high-throughput or massively parallel sequencing, is a powerful new tool that can be used for the complex diagnosis and intensive monitoring of infectious disease in veterinary medicine. NGS technologies are also being increasingly used to study the aetiology, genomics, evolution and epidemiology of infectious disease, as well as host-pathogen interactions and other aspects of infection biology. This review briefly summarises recent progress and achievements in this field by first introducing a range of novel techniques and then presenting examples of NGS applications in veterinary infection biology. Various work steps and processes for sampling and sample preparation, sequence analysis and comparative genomics, and improving the accuracy of genomic prediction are discussed, as are bioinformatics requirements. Examples of sequencing-based applications and comparative genomics in veterinary medicine are then provided. This review is based on novel references selected from the literature and on experiences of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine, Uppsala, Sweden.

  10. Google matrix analysis of DNA sequences.

    Science.gov (United States)

    Kandiah, Vivek; Shepelyansky, Dima L

    2013-01-01

    For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.
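
    The construction described in this abstract can be illustrated with a minimal sketch. The abstract does not give the word length, the damping factor, or how "nearby" words are defined, so the choices below (non-overlapping consecutive words, alpha = 0.85, uniform columns for words that never occur) are assumptions made for illustration only, and the function names are hypothetical.

```python
import itertools
import numpy as np


def google_matrix(seq, word_len=2, alpha=0.85):
    """Column-stochastic Google matrix of transitions between consecutive words.

    Assumptions (not taken from the abstract): the sequence is cut into
    non-overlapping words of `word_len` letters, and alpha is the usual
    PageRank damping factor.
    """
    words = ["".join(w) for w in itertools.product("ACGT", repeat=word_len)]
    index = {w: k for k, w in enumerate(words)}
    n = len(words)

    # counts[i, j] = number of times word j is followed by word i
    counts = np.zeros((n, n))
    chunks = [seq[k:k + word_len]
              for k in range(0, len(seq) - word_len + 1, word_len)]
    for src, dst in zip(chunks, chunks[1:]):
        if src in index and dst in index:  # skip words with ambiguous letters
            counts[index[dst], index[src]] += 1.0

    # Normalise each column; columns with no outgoing transitions become uniform
    col_sums = counts.sum(axis=0)
    S = np.full((n, n), 1.0 / n)
    nonzero = col_sums > 0
    S[:, nonzero] = counts[:, nonzero] / col_sums[nonzero]

    return alpha * S + (1.0 - alpha) / n, words


def pagerank(G, tol=1e-12, max_iter=1000):
    """Stationary vector of G obtained by power iteration."""
    p = np.full(G.shape[0], 1.0 / G.shape[0])
    for _ in range(max_iter):
        nxt = G @ p
        nxt /= nxt.sum()
        if np.abs(nxt - p).sum() < tol:
            return nxt
        p = nxt
    return p
```

    Calling `google_matrix(sequence)` and then `pagerank(G)` ranks the words of a sequence; comparing the rankings of two species is one way to approach the proximity correlator mentioned above, although the exact definition is not given in this record.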

  11. Google matrix analysis of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Vivek Kandiah

    For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.

  12. Detailed Analysis of a Multiplet Earthquake Sequence

    Science.gov (United States)

    Iglesias, A.; Singh, S. K.; Garduño, V. H.

    2014-12-01

    The Mexican National Seismological Service reported a sequence of four small earthquakes (2.5 < M < 3.0) occurring in Morelia, a city of 1,000,000, which is the capital city of Michoacán State. A careful review of the records from a three-component broad-band station, located ~10 km from the earthquakes, showed a sequence of 7 earthquakes in a period of about 36 hours. The waveforms are remarkably similar to one another and may be considered a "multiplet". In this work, we use the records from the broad-band station and a methodology based on coda wave interferometry to obtain the relative distance between pairs of events. The 21 inter-event distances obtained are treated as an over-determined system for the relative positions of the events. A non-linear damped scheme is used to solve the over-determined system and to obtain the spatial distribution of the 7 earthquakes. Results show that (1) distances between events are < 200 m, and (2) the sequence has an approximately linear distribution.
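
    The inversion step in this abstract (7 events, hence 7·6/2 = 21 inter-event distances, solved by a non-linear damped scheme) can be sketched as follows. The abstract does not specify the scheme, so a damped Gauss-Newton iteration with adaptive damping is assumed here; the function name, the starting guess, and the damping schedule are all hypothetical. Distances alone fix the geometry only up to a rigid translation, rotation, and reflection, and the damping term keeps the normal equations solvable despite that freedom.

```python
from itertools import combinations

import numpy as np


def locate_multiplet(pairs, dists, x0, damping=1e-3, n_iter=100):
    """Fit event coordinates to inter-event distances (damped Gauss-Newton).

    pairs : list of (i, j) event index pairs
    dists : measured distance for each pair, same order as `pairs`
    x0    : (n_events, dim) starting coordinates (an assumed initial guess)
    Returns the fitted coordinates and the final distance residuals.
    """
    x = np.array(x0, dtype=float)
    n, dim = x.shape
    dists = np.asarray(dists, dtype=float)
    ia = np.array([i for i, _ in pairs])
    ib = np.array([j for _, j in pairs])

    def residuals(pos):
        # predicted minus observed inter-event distances
        return np.linalg.norm(pos[ia] - pos[ib], axis=1) - dists

    lam = damping
    r = residuals(x)
    for _ in range(n_iter):
        if r @ r < 1e-20:
            break
        # Jacobian of each predicted distance with respect to all coordinates
        J = np.zeros((len(pairs), n * dim))
        for k, (i, j) in enumerate(pairs):
            d = x[i] - x[j]
            u = d / max(np.linalg.norm(d), 1e-12)
            J[k, i * dim:(i + 1) * dim] = u
            J[k, j * dim:(j + 1) * dim] = -u
        # Damped normal equations: (J^T J + lam I) step = -J^T r
        step = np.linalg.solve(J.T @ J + lam * np.eye(n * dim), -J.T @ r)
        trial = x + step.reshape(n, dim)
        r_trial = residuals(trial)
        if r_trial @ r_trial < r @ r:
            x, r, lam = trial, r_trial, max(lam * 0.5, 1e-9)  # accept, relax damping
        else:
            lam *= 10.0  # reject, increase damping
    return x, r
```

    With `pairs = list(combinations(range(7), 2))` the system has 21 equations for 21 coordinates in three dimensions, of which 6 combinations correspond to the rigid-motion freedom noted above, which is why the system counts as over-determined.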

  13. Parameters of proteome evolution from histograms of amino-acid sequence identities of paralogous proteins

    Directory of Open Access Journals (Sweden)

    Yan Koon-Kiu

    2007-11-01

    Background: The evolution of the full repertoire of proteins encoded in a given genome is mostly driven by gene duplications, deletions, and sequence modifications of existing proteins. Indirect information about relative rates and other intrinsic parameters of these three basic processes is contained in the proteome-wide distribution of sequence identities of pairs of paralogous proteins. Results: We introduce a simple mathematical framework based on a stochastic birth-and-death model that allows one to extract some of this information and apply it to the set of all pairs of paralogous proteins in H. pylori, E. coli, S. cerevisiae, C. elegans, D. melanogaster, and H. sapiens. It was found that the histogram of sequence identities p generated by an all-to-all alignment of all protein sequences encoded in a genome is well fitted with a power-law form $\sim p^{-\gamma}$ with the value of the exponent γ around 4 for the majority of organisms used in this study. This implies that the intra-protein variability of substitution rates is best described by the Gamma-distribution with the exponent α ≈ 0.33. Different features of the shape of such histograms allow us to quantify the ratio between the genome-wide average deletion/duplication rates and the amino-acid substitution rate. Conclusion: We separately measure the short-term ("raw") duplication and deletion rates $r^{*}_{\mathrm{dup}}$, $r^{*}_{\mathrm{del}}$ ...
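
    The power-law fit of the identity histogram mentioned in this abstract can be sketched with a simple log-log regression. This is an illustrative estimator only: the abstract does not say how the exponent was fitted, and the bin count, the identity range, and the function name below are assumptions.

```python
import numpy as np


def fit_power_law_exponent(identities, bins=20, p_min=0.3, p_max=1.0):
    """Estimate gamma in N(p) ~ p**(-gamma) from a sample of sequence identities.

    The identities are histogrammed on [p_min, p_max] (an assumed range) and a
    straight line is fitted to log(count) versus log(bin centre).
    """
    edges = np.linspace(p_min, p_max, bins + 1)
    counts, _ = np.histogram(identities, bins=edges)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = counts > 0  # empty bins have no logarithm
    slope, _intercept = np.polyfit(np.log(centers[keep]), np.log(counts[keep]), 1)
    return -slope
```

    Applied to the pairwise identities of all paralogous pairs in a proteome, the returned value plays the role of the exponent γ quoted above; a least-squares fit on binned counts is sensitive to the binning, so a maximum-likelihood estimator may be preferable in practice.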

  14. Phylogenomic Analysis and Dynamic Evolution of Chloroplast Genomes in Salicaceae

    Directory of Open Access Journals (Sweden)

    Yuan Huang

    2017-06-01

    Chloroplast genomes of plants are highly conserved in both gene order and gene content. Analysis of the whole chloroplast genome is known to provide many more informative DNA sites and thus generates high resolution for plant phylogenies. Here, we report the complete chloroplast genomes of three Salix species in the family Salicaceae. The phylogeny of Salicaceae inferred from complete chloroplast genomes is generally consistent with previous studies but resolved with higher statistical support. Incongruences of phylogeny, however, are observed in the genus Populus, which most likely result from homoplasy. By comparing three Salix chloroplast genomes with the published chloroplast genomes of other Salicaceae species, we demonstrate that the synteny and length of chloroplast genomes in Salicaceae are highly conserved but have experienced dynamic evolution among species. We identify seven positively selected chloroplast genes in Salicaceae, which might be related to the adaptive evolution of Salicaceae species. Comparative chloroplast genome analysis within the family also indicates that some chloroplast genes have been lost or have become pseudogenes, suggesting that these chloroplast genes were horizontally transferred to the nuclear genome. Based on the complete nuclear genome sequences from two Salicaceae species, we find, remarkably, that the entire chloroplast genome was indeed transferred and integrated into the nuclear genome at least once in the individual used for the reference genome of P. trichocarpa. This observation, along with the presence of large nuclear plastid DNAs (NUPTs) and NUPTs containing multiple chloroplast genes in their original chloroplast-genome order, favors the DNA-mediated hypothesis of organelle-to-nucleus DNA transfer. Overall, the phylogenomic analysis using complete chloroplast genomes clearly elucidates the phylogeny of Salicaceae. The identification of positively selected chloroplast genes and dynamic chloroplast-to-nucleus gene transfers in ...

  15. Evolution of beta-amylase: patterns of variation and conservation in subfamily sequences in relation to parsimony mechanisms.

    Science.gov (United States)

    Pujadas, G; Ramírez, F M; Valero, R; Palau, J

    1996-08-01

    Soybean and sweet potato beta-amylases are structured as alpha/beta barrels and the same kind of folding may account for all known beta-amylases. We provide a comprehensive analysis of both protein and DNA (coding region) sequences of beta-amylases. The aim of the study is to contribute to the knowledge of the evolutionary molecular relationships among all known beta-amylases. Our approach combines the identification of the putative eightfold structural core formed by beta-strands with a complete multi-alignment analysis of all known sequences. Comparing putative beta-amylase (alpha/beta)8 cores from plants and microorganisms, two differentiated versions of residues at the packing sites, and a unique set of eight identical residues at the C-terminal catalytic site are observed, indicating early evolutionary divergence and absence of localized three-dimensional evolution, respectively. A new analytical approach has been developed in order to work out conserved motifs for beta-amylases, mostly related to the enzyme activity. This approach appears useful as a new routine to find sets of motifs (each set being known as a fingerprint) in protein families. We demonstrate that the evolutionary mechanism for beta-amylases is a combination of parsimonious divergence at three distinguishable rates in relation to the functional signatures, the barrel scaffold, and alpha-helix-containing loops.

  16. Large Scale Sequencing of Dothideomycetes Provides Insights into Genome Evolution and Adaptation

    Energy Technology Data Exchange (ETDEWEB)

    Haridas, Sajeet; Crous, Pedro; Binder, Manfred; Spatafora, Joseph; Grigoriev, Igor

    2015-03-16

    Dothideomycetes is the largest and most diverse class of ascomycete fungi, with 23 orders, 110 families, 1300 genera and over 19,000 known species. We present a comparative analysis of 70 Dothideomycete genomes, including over 50 that we sequenced and that are as yet unpublished. This extensive sampling has almost quadrupled the previous study of 18 species and uncovered a 10-fold range of genome sizes. We were able to clarify the phylogenetic positions of several species whose origins were unclear in previous morphological and sequence comparison studies. We analyzed selected gene families, including proteases, transporters and small secreted proteins, and show that major differences in gene content are influenced by speciation.

  17. Genomic analysis of expressed sequence tags in American black bear Ursus americanus.

    Science.gov (United States)

    Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun

    2010-03-26

    Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.

  18. Genomic analysis of expressed sequence tags in American black bear Ursus americanus

    Directory of Open Access Journals (Sweden)

    Tøien Øivind

    2010-03-01

    Background: Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results: Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion: We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.

  19. Genome Evolution and Meiotic Maps by Massively Parallel DNA Sequencing: Spotted Gar, an Outgroup for the Teleost Genome Duplication

    Science.gov (United States)

    Amores, Angel; Catchen, Julian; Ferrara, Allyse; Fontenot, Quenton; Postlethwait, John H.

    2011-01-01

    Genomic resources for hundreds of species of evolutionary, agricultural, economic, and medical importance are unavailable due to the expense of well-assembled genome sequences and difficulties with multigenerational studies. Teleost fish provide many models for human disease but possess anciently duplicated genomes that sometimes obfuscate connectivity. Genomic information representing a fish lineage that diverged before the teleost genome duplication (TGD) would provide an outgroup for exploring the mechanisms of evolution after whole-genome duplication. We exploited massively parallel DNA sequencing to develop meiotic maps with thrift and speed by genotyping F1 offspring of a single female and a single male spotted gar (Lepisosteus oculatus) collected directly from nature utilizing only polymorphisms existing in these two wild individuals. Using Stacks, software that automates the calling of genotypes from polymorphisms assayed by Illumina sequencing, we constructed a map containing 8406 markers. RNA-seq on two map-cross larvae provided a reference transcriptome that identified nearly 1000 mapped protein-coding markers and allowed genome-wide analysis of conserved synteny. Results showed that the gar lineage diverged from teleosts before the TGD and its genome is organized more similarly to that of humans than teleosts. Thus, spotted gar provides a critical link between medical models in teleost fish, to which gar is biologically similar, and humans, to which gar is genomically similar. Application of our F1 dense mapping strategy to species with no prior genome information promises to facilitate comparative genomics and provide a scaffold for ordering the numerous contigs arising from next generation genome sequencing. PMID:21828280

  20. Post main sequence evolution of icy minor planets: water retention and white dwarf pollution

    Science.gov (United States)

    Malamud, Uri; Perets, Hagai

    2017-06-01

    We investigate the evolution of icy minor planets from the moment of their birth and through all the evolutionary stages of their host stars, including the main sequence, red giant branch and asymptotic giant branch phases. We then assess the degree of water retention in planetary systems around white dwarfs, as a function of various parameters. We consider progenitor stars of different masses and metallicities. We also consider minor planets of various sizes, initial orbital distances, compositions and formation times. Our results indicate that water can survive to the white dwarf stage in a variety of circumstances, especially around G, F, A and even some B type stars. We discuss the significance of water retention with respect to white dwarf pollution and also for planet habitability.

  1. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    In Pakistan, more than 10 million people are living with hepatitis C virus (HCV) with high morbidity and mortality. The aims of the present study are to report HCV core gene sequences from Pakistani population and perform their sequence comparison/phylogenetic analysis. The core gene of HCV has been cloned from six ...

  2. Cloning and sequence analysis of the Antheraea pernyi ...

    Indian Academy of Sciences (India)

    Unknown

    A genomic library was generated using HindIII and the positive clones were sequenced and analysed. The gp64 gene, encoding the baculovirus envelope protein GP64, was found in an insert. The nucleotide sequence analysis indicated that the AnpeNPV gp64 gene consists of a 1530 nucleotide open reading frame ...

  3. Cloning and sequence analysis of benzo-a-pyrene-inducible ...

    African Journals Online (AJOL)

    Cloning and sequence analysis of benzo-a-pyrene-inducible cytochrome P450 1A in Nile tilapia (Oreochromis niloticus) ... The full-length cDNA was 2530 bp long and contained an open reading frame of 1566 bp encoding a protein of 521 amino acids and a stop codon. The sequence exhibited 5' and 3' noncoding

  4. Sequence analysis corresponding to the PPE and PE proteins in ...

    Indian Academy of Sciences (India)

    Amino acid sequence analysis corresponding to the PPE proteins in H37Rv and CDC1551 strains of the Mycobacterium tuberculosis genomes resulted in the identification of a previously uncharacterized 225 amino acid residue common region in 22 proteins. The pairwise sequence identities were as low as 18%.

  5. Biological sequence analysis: probabilistic models of proteins and nucleic acids

    National Research Council Canada - National Science Library

    Durbin, Richard

    1998-01-01

    ... analysis methods are now based on principles of probabilistic modelling. Examples of such methods include the use of probabilistically derived score matrices to determine the significance of sequence alignments, the use of hidden Markov models as the basis for profile searches to identify distant members of sequence families, and the inference...

  6. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    2010-07-19

    In Pakistan, more than 10 million people are living with hepatitis C virus (HCV) with high morbidity and mortality. The aims of the present study are to report HCV core gene sequences from Pakistani population and perform their sequence comparison/phylogenetic analysis. The core gene of HCV has.

  7. Inter Simple Sequence Repeat (ISSR) analysis of wild and cultivated ...

    African Journals Online (AJOL)

    Inter Simple Sequence Repeat (ISSR) analysis of wild and cultivated rice species from Ethiopia. ... African Journal of Biotechnology ... The genetic diversity of three wild rice populations of Ethiopia, along with three cultivated rice populations, was studied using inter simple sequence repeats (ISSRs) as a molecular marker.

  8. Sequence analysis corresponding to the PPE and PE proteins in ...

    Indian Academy of Sciences (India)

    Unknown

    Amino acid sequence analysis corresponding to the PPE proteins in H37Rv and CDC1551 strains of the Mycobacterium tuberculosis genomes resulted in the identification of a previously uncharacterized 225 amino acid residue common region in 22 proteins. The pairwise sequence identities were as low as 18%.

  9. A deep sequencing analysis of transcriptomes and the development ...

    Indian Academy of Sciences (India)

    [Liu C., Fan B., Cao Z., Su Q., Wang Y., Zhang Z., Wu J. and Tian J. 2016 A deep sequencing analysis of transcriptomes and the development of EST-SSR markers in ... Further, through a deep mRNA sequencing (RNA-seq) we can get a chance to discover ...... Wego: a web tool for plotting go annotations. Nucleic Acids Res.

  10. The complete genome sequence of Lactobacillus bulgaricus reveals extensive and ongoing reductive evolution.

    Science.gov (United States)

    van de Guchte, M; Penaud, S; Grimaldi, C; Barbe, V; Bryson, K; Nicolas, P; Robert, C; Oztas, S; Mangenot, S; Couloux, A; Loux, V; Dervyn, R; Bossy, R; Bolotin, A; Batto, J-M; Walunas, T; Gibrat, J-F; Bessières, P; Weissenbach, J; Ehrlich, S D; Maguin, E

    2006-06-13

    Lactobacillus delbrueckii ssp. bulgaricus (L. bulgaricus) is a representative of the group of lactic acid-producing bacteria, mainly known for its worldwide application in yogurt production. The genome sequence of this bacterium has been determined and shows the signs of ongoing specialization, with a substantial number of pseudogenes and incomplete metabolic pathways and relatively few regulatory functions. Several unique features of the L. bulgaricus genome support the hypothesis that the genome is in a phase of rapid evolution. (i) Exceptionally high numbers of rRNA and tRNA genes with regard to genome size may indicate that the L. bulgaricus genome has known a recent phase of important size reduction, in agreement with the observed high frequency of gene inactivation and elimination; (ii) a much higher GC content at codon position 3 than expected on the basis of the overall GC content suggests that the composition of the genome is evolving toward a higher GC content; and (iii) the presence of a 47.5-kbp inverted repeat in the replication termination region, an extremely rare feature in bacterial genomes, may be interpreted as a transient stage in genome evolution. The results indicate the adaptation of L. bulgaricus from a plant-associated habitat to the stable protein and lactose-rich milk environment through the loss of superfluous functions and protocooperation with Streptococcus thermophilus.
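    Point (ii) of the abstract above compares GC content at codon position 3 (often called GC3) with overall GC content. As a minimal sketch of that comparison, assuming an in-frame coding sequence given as a plain string (the function names and the toy sequence are illustrative, not taken from the study):

```python
def gc_fraction(seq):
    """Fraction of G or C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def gc3_fraction(cds):
    """GC fraction at third codon positions of an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third_positions = cds[2:usable:3]  # every third base, starting at index 2
    return gc_fraction(third_positions)


# Toy sequence (hypothetical): codons ATG GCC AAG TTC
toy_cds = "ATGGCCAAGTTC"
overall_gc = gc_fraction(toy_cds)
third_gc = gc3_fraction(toy_cds)
```

    A GC3 value well above the overall GC fraction, computed across all coding sequences of a genome, is the kind of signal the abstract interprets as a shift toward higher GC content.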

  11. Sequence similarity network reveals the imprints of major diversification events in the evolution of microbial life

    Directory of Open Access Journals (Sweden)

    Shu Cheng

    2014-11-01

    Ancient transitions, such as between life that evolved in a reducing versus an oxidizing atmosphere precipitated by the Great Oxygenation Event (GOE) ca. 2.4 billion years ago, fundamentally altered the space in which prokaryotes could derive metabolic energy. Despite fundamental changes in Earth’s redox state, there are very few comprehensive, proteome-wide analyses about the effects of these changes on gene content and evolution. Here, using a pan-proteome sequence similarity network applied to broadly sampled lifestyles of 84 prokaryotes that were categorized into four different redox groups (i.e., methanogens, obligate anaerobes, facultative anaerobes, and obligate aerobes), we reconstructed the genetic inventory of major respiratory communities. We show that a set of putative core homologs that is highly conserved in prokaryotic proteomes is characterized by the loss of canonical network connections and low conductance that correlates with differences in respiratory phenotypes. We suggest these different network patterns observed for different respiratory communities could be explained by two major evolutionary diversification events in the history of microbial life. The first event (M) is a divergence between methanogenesis and other anaerobic lifestyles in prokaryotes (archaebacteria and eubacteria). The second diversification event (OX) is from anaerobic to aerobic lifestyles that left a proteome-wide footprint among prokaryotes. Additional analyses revealed that oxidoreductase evolution played a central role in these two diversification events. Distinct cofactor binding domains were frequently recombined, allowing these enzymes to utilize increasingly oxidized substrates with high specificity.

  12. Strategy for the sequence analysis of heparin.

    Science.gov (United States)

    Liu, J; Desai, U R; Han, X J; Toida, T; Linhardt, R J

    1995-12-01

    The versatile biological activities of proteoglycans are mainly mediated by their glycosaminoglycan (GAG) components. Unlike proteins and nucleic acids, no satisfactory method for sequencing GAGs has been developed. This paper describes a strategy to sequence the GAG chains of heparin. Heparin, prepared from animal tissue, and processed by proteinases and endoglucuronidases, is 90% GAG heparin and 10% peptidoglycan heparin (containing small remnants of core protein). Raw porcine mucosal heparin was labelled on the amino termini of these core protein remnants with a hydrophobic, fluorescent tag [N-4-(6-dimethylamino-2-benzofuranyl) phenyl (NDBP)-isothiocyanate]. Enrichment of the NDBP-heparin using phenyl-Sepharose chromatography, followed by treatment with a mixture of heparin lyase I and III, resulted in a single NDBP-linkage region tetrasaccharide, which was characterized as deltaUAp(1-->3)-beta-D-Galp(1-->3)-beta-D-Galp(1-->4)-beta-Xylp -(1-->O-Ser-NDBP (deltaUAp is 4-deoxy-alpha-L-threo-hex-4-enopyranosyl uronic acid). Several NDBP-octasaccharides were isolated when NDBP-heparin was treated with only heparin lyase I. The structure of one of these NDBP-octasaccharides, deltaUAp2S(1-->4)-alpha-D-GlcNpAc(1-->4)-alpha-L-IdoAp (1-->4)-alpha-D-GlcNpAc6S(1-->4)-beta-D-GlcAp(1-->3)-beta-D- Galp(1-->3)-beta-D-Galp(1-->4)-beta-Xylp-(1-->O-Ser NDBP (S is sulphate, Ac is acetate), was determined by 1H-NMR and enzymatic methods. Enriched NDBP-heparin was treated with lithium hydroxide to release heparin, and the GAG chain was then labelled at xylose with 7-amino-1,3-naphthalene disulphonic acid (AGA). The resulting AGA-Xyl-heparin was sequenced on gradient PAGE using heparin lyase I and heparin lyase III. 
    A predominant sequence in heparin at the protein core attachment site was deduced to be -D-GlcNp2S6S(or 6OH)(1-->4)-alpha-L-IdoAp2S-(1-->4)-alpha-D-GlcNp2S6S(or 6OH)(1-->4)-alpha-L-IdoAp2S(1-->4)-alpha-D-GlcNp2S6S(or 6OH)(1-->4)-alpha-L-IdoAp2S(1-->4)-alpha-D-GlcNpAc (1

  13. RNA Sequencing Analysis of Salivary Extracellular RNA.

    Science.gov (United States)

    Majem, Blanca; Li, Feng; Sun, Jie; Wong, David T W

    2017-01-01

    Salivary biomarkers for disease detection, diagnostic and prognostic assessments have become increasingly well established in recent years. In this chapter we explain the current leading technology that has been used to characterize salivary non-coding RNAs (ncRNAs) from the extracellular RNA (exRNA) fraction: HiSeq from Illumina® platform for RNA sequencing. Therefore, the chapter is divided into two main sections regarding the type of the library constructed (small and long ncRNA libraries), from saliva collection, RNA extraction and quantification to cDNA library generation and corresponding QCs. Using these invaluable technical tools, one can identify thousands of ncRNA species in saliva. These methods indicate that salivary exRNA provides an efficient medium for biomarker discovery of oral and systemic diseases.

  14. Real analysis via sequences and series

    CERN Document Server

    Little, Charles H C; van Brunt, Bruce

    2015-01-01

    This text gives a rigorous treatment of the foundations of calculus. In contrast to more traditional approaches, infinite sequences and series are placed at the forefront. The approach taken has not only the merit of simplicity, but students are well placed to understand and appreciate more sophisticated concepts in advanced mathematics. The authors mitigate potential difficulties in mastering the material by motivating definitions, results, and proofs. Simple examples are provided to illustrate new material and exercises are included at the end of most sections. Noteworthy topics include: an extensive discussion of convergence tests for infinite series, Wallis’s formula and Stirling’s formula, proofs of the irrationality of π and e, and a treatment of Newton’s method as a special instance of finding fixed points of iterated functions.
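    The last topic mentioned, Newton's method viewed as fixed-point iteration, can be illustrated with a short sketch. It assumes the standard formulation in which a root of f is a fixed point of g(x) = x - f(x)/f'(x); the helper names and the example function are illustrative and are not taken from the book:

```python
def fixed_point(g, x0, tol=1e-12, max_iter=100):
    """Iterate x -> g(x) from x0 until successive values differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        nxt = g(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    raise RuntimeError("fixed-point iteration did not converge")


def newton_map(f, df):
    """Return the Newton iteration function g(x) = x - f(x) / f'(x)."""
    return lambda x: x - f(x) / df(x)


# Example: the positive root of f(x) = x^2 - 2 is a fixed point of its Newton map.
root = fixed_point(newton_map(lambda x: x * x - 2, lambda x: 2 * x), x0=1.0)
```

    Here `root` approximates the square root of 2; the same `fixed_point` routine works for any contraction g, which is the sense in which Newton's method is a special case.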

  15. Mesoscopic Model for Free Energy Landscape Analysis of DNA sequences

    CERN Document Server

    Tapia-Rojo, R; Mazo, J J; Falo, F; 10.1103/PhysRevE.86.021908

    2012-01-01

    A mesoscopic model which allows us to identify and quantify the strength of binding sites in DNA sequences is proposed. The model is based on the Peyrard-Bishop-Dauxois model for the DNA chain coupled to a Brownian particle which explores the sequence interacting more importantly with open base pairs of the DNA chain. We apply the model to promoter sequences of different organisms. The free energy landscape obtained for these promoters shows a complex structure that is strongly connected to their biological behavior. The analysis method used is able to quantify free energy differences of sites within genome sequences.

  16. The complete genome sequence of Xanthomonas albilineans provides new insights into the reductive genome evolution of the xylem-limited Xanthomonadaceae

    Directory of Open Access Journals (Sweden)

    Szurek Boris

    2009-12-01

    Background: The Xanthomonadaceae family contains two xylem-limited plant pathogenic bacterial species, Xanthomonas albilineans and Xylella fastidiosa. X. fastidiosa was the first completely sequenced plant pathogen. It is insect-vectored, has a reduced genome and does not possess hrp genes which encode a Type III secretion system found in most plant pathogenic bacteria. X. fastidiosa was excluded from the Xanthomonas group based on phylogenetic analyses with rRNA sequences. Results: The complete genome of X. albilineans was sequenced and annotated. X. albilineans, which is not known to be insect-vectored, also has a reduced genome and does not possess hrp genes. Phylogenetic analysis using X. albilineans genomic sequences showed that X. fastidiosa belongs to the Xanthomonas group. Order of divergence of the Xanthomonadaceae revealed that X. albilineans and X. fastidiosa experienced a convergent reductive genome evolution during their descent from the progenitor of the Xanthomonas genus. Reductive genome evolutions of the two xylem-limited Xanthomonadaceae were compared in light of their genome characteristics and those of obligate animal symbionts and pathogens. Conclusion: The two xylem-limited Xanthomonadaceae, during their descent from a common ancestral parent, experienced a convergent reductive genome evolution. Adaptation to the nutrient-poor xylem elements and to the cloistered environmental niche of xylem vessels probably favoured this convergent evolution. However, genome characteristics of X. albilineans differ from those of X. fastidiosa and obligate animal symbionts and pathogens, indicating that a distinctive process was responsible for the reductive genome evolution in this pathogen. The possible role in genome reduction of the unique toxin albicidin, produced by X. albilineans, is discussed.

  17. The complete genome sequence of Xanthomonas albilineans provides new insights into the reductive genome evolution of the xylem-limited Xanthomonadaceae.

    Science.gov (United States)

    Pieretti, Isabelle; Royer, Monique; Barbe, Valérie; Carrere, Sébastien; Koebnik, Ralf; Cociancich, Stéphane; Couloux, Arnaud; Darrasse, Armelle; Gouzy, Jérôme; Jacques, Marie-Agnès; Lauber, Emmanuelle; Manceau, Charles; Mangenot, Sophie; Poussier, Stéphane; Segurens, Béatrice; Szurek, Boris; Verdier, Valérie; Arlat, Matthieu; Rott, Philippe

    2009-12-17

    The Xanthomonadaceae family contains two xylem-limited plant pathogenic bacterial species, Xanthomonas albilineans and Xylella fastidiosa. X. fastidiosa was the first completely sequenced plant pathogen. It is insect-vectored, has a reduced genome and does not possess hrp genes which encode a Type III secretion system found in most plant pathogenic bacteria. X. fastidiosa was excluded from the Xanthomonas group based on phylogenetic analyses with rRNA sequences. The complete genome of X. albilineans was sequenced and annotated. X. albilineans, which is not known to be insect-vectored, also has a reduced genome and does not possess hrp genes. Phylogenetic analysis using X. albilineans genomic sequences showed that X. fastidiosa belongs to the Xanthomonas group. Order of divergence of the Xanthomonadaceae revealed that X. albilineans and X. fastidiosa experienced a convergent reductive genome evolution during their descent from the progenitor of the Xanthomonas genus. Reductive genome evolutions of the two xylem-limited Xanthomonadaceae were compared in light of their genome characteristics and those of obligate animal symbionts and pathogens. The two xylem-limited Xanthomonadaceae, during their descent from a common ancestral parent, experienced a convergent reductive genome evolution. Adaptation to the nutrient-poor xylem elements and to the cloistered environmental niche of xylem vessels probably favoured this convergent evolution. However, genome characteristics of X. albilineans differ from those of X. fastidiosa and obligate animal symbionts and pathogens, indicating that a distinctive process was responsible for the reductive genome evolution in this pathogen. The possible role in genome reduction of the unique toxin albicidin, produced by X. albilineans, is discussed.

  18. Editorial: Special Issue on Algorithms for Sequence Analysis and Storage

    Directory of Open Access Journals (Sweden)

    Veli Mäkinen

    2014-03-01

    This special issue of Algorithms is dedicated to approaches to biological sequence analysis that have algorithmic novelty and potential for fundamental impact in methods used for genome research.

  19. Functional and comparative analysis of expressed sequences from ...

    African Journals Online (AJOL)

    Functional and comparative analysis of expressed sequences from Diuraphis noxia infested wheat obtained utilizing the conserved Nucleotide Binding Site. Lynelle Lacock, Chantal van Niekerk, Shilo Loots, Franco du Preez, Anna-Maria Botha ...

  20. Sequencing and Analysis of Neanderthal Genomic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo, Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2006-06-13

    Recovery and analysis of multiple Neanderthal autosomal sequences using a metagenomic approach reveals that modern humans and Neanderthals split ~400,000 years ago, without significant evidence of subsequent admixture.

  1. ECHO Project: a series of tools for studying and characterizing seismic sequences evolution

    Science.gov (United States)

    Falcone, Giuseppe; De Santis, Angelo; Di Giovambattista, Rita; Cianchini, Gianfranco; Murru, Maura; Calderoni, Giovanna; Lucente, Pio Francesco; De Gori, Pasquale; Frepoli, Alberto; Signanini, Patrizio; Rainone, Mario; Vessia, Giovanna

    2016-04-01

    One of the most ubiquitous problems in seismology is to discriminate between seismic sequences (a series of small-to-moderate earthquakes that culminate with a mainshock) and swarms (diffuse seismicity without a mainshock), a distinction that can be made easily only after a certain class of earthquakes has occurred. We propose to put these phenomena under the same framework provided by geosystemics (De Santis, 2009, 2014), where the planet Earth and its processes are seen from a holistic point of view, and the New Geophysics (Crampin et al., 2013), where fluid-saturated microcracks in almost all crustal rocks are so closely spaced that they verge on failure and hence are highly compliant critical systems (Signanini and De Santis, 2012). In this context, nonlinear concepts typical of Chaos and Information theories are fundamental to study and characterize the various features of the series of seismic events and, eventually, to discriminate between seismic sequences and swarms. The two theories imply the use of non-linear techniques which are innovative in seismology. The project ECHO ("Entropy and CHaOs: tools for studying and characterizing seismic sequences evolution"), a recent INGV-funded project, aims to apply the above approaches in a more integrated way, mainly to establish a suite of effective tools to disclose and characterise the principal features of the series of earthquakes which are of interest. In our view this will represent the very first step before facing the more challenging (but longer-term) problem of discriminating between the two kinds of series of seismic events. This poster will describe these kinds of preliminary activities and relative results in the framework of the project.

  2. Tools for integrated sequence-structure analysis with UCSF Chimera

    Directory of Open Access Journals (Sweden)

    Huang Conrad C

    2006-07-01

    Background: Comparing related structures and viewing the structures in the context of sequence alignments are important tasks in protein structure-function research. While many programs exist for individual aspects of such work, there is a need for interactive visualization tools that: (a) provide a deep integration of sequence and structure, far beyond mapping where a sequence region falls in the structure and vice versa; (b) facilitate changing data of one type based on the other (for example, using only sequence-conserved residues to match structures, or adjusting a sequence alignment based on spatial fit); (c) can be used with a researcher's own data, including arbitrary sequence alignments and annotations, closely or distantly related sets of proteins, etc.; and (d) interoperate with each other and with a full complement of molecular graphics features. We describe enhancements to UCSF Chimera to achieve these goals. Results: The molecular graphics program UCSF Chimera includes a suite of tools for interactive analyses of sequences and structures. Structures automatically associate with sequences in imported alignments, allowing many kinds of crosstalk. A novel method is provided to superimpose structures in the absence of a pre-existing sequence alignment. The method uses both sequence and secondary structure, and can match even structures with very low sequence identity. Another tool constructs structure-based sequence alignments from superpositions of two or more proteins. Chimera is designed to be extensible, and mechanisms for incorporating user-specific data without Chimera code development are also provided. Conclusion: The tools described here apply to many problems involving comparison and analysis of protein structures and their sequences. Chimera includes complete documentation and is intended for use by a wide range of scientists, not just those in the computational disciplines. UCSF Chimera is free for non-commercial use and is

  3. Tools for integrated sequence-structure analysis with UCSF Chimera.

    Science.gov (United States)

    Meng, Elaine C; Pettersen, Eric F; Couch, Gregory S; Huang, Conrad C; Ferrin, Thomas E

    2006-07-12

    Comparing related structures and viewing the structures in the context of sequence alignments are important tasks in protein structure-function research. While many programs exist for individual aspects of such work, there is a need for interactive visualization tools that: (a) provide a deep integration of sequence and structure, far beyond mapping where a sequence region falls in the structure and vice versa; (b) facilitate changing data of one type based on the other (for example, using only sequence-conserved residues to match structures, or adjusting a sequence alignment based on spatial fit); (c) can be used with a researcher's own data, including arbitrary sequence alignments and annotations, closely or distantly related sets of proteins, etc.; and (d) interoperate with each other and with a full complement of molecular graphics features. We describe enhancements to UCSF Chimera to achieve these goals. The molecular graphics program UCSF Chimera includes a suite of tools for interactive analyses of sequences and structures. Structures automatically associate with sequences in imported alignments, allowing many kinds of crosstalk. A novel method is provided to superimpose structures in the absence of a pre-existing sequence alignment. The method uses both sequence and secondary structure, and can match even structures with very low sequence identity. Another tool constructs structure-based sequence alignments from superpositions of two or more proteins. Chimera is designed to be extensible, and mechanisms for incorporating user-specific data without Chimera code development are also provided. The tools described here apply to many problems involving comparison and analysis of protein structures and their sequences. Chimera includes complete documentation and is intended for use by a wide range of scientists, not just those in the computational disciplines. UCSF Chimera is free for non-commercial use and is available for Microsoft Windows, Apple Mac OS X

  4. Evolution of EF-hand calcium-modulated proteins. III. Exon sequences confirm most dendrograms based on protein sequences: calmodulin dendrograms show significant lack of parallelism

    Science.gov (United States)

    Nakayama, S.; Kretsinger, R. H.

    1993-01-01

    In the first report in this series we presented dendrograms based on 152 individual proteins of the EF-hand family. In the second we used sequences from 228 proteins, containing 835 domains, and showed that eight of the 29 subfamilies are congruent and that the EF-hand domains of the remaining 21 subfamilies have diverse evolutionary histories. In this study we have computed dendrograms within and among the EF-hand subfamilies using the encoding DNA sequences. In most instances the dendrograms based on protein and on DNA sequences are very similar. Significant differences between protein and DNA trees for calmodulin remain unexplained. In our fourth report we evaluate the sequences and the distribution of introns within the EF-hand family and conclude that exon shuffling did not play a significant role in its evolution.

  5. Single-virion sequencing of lamivudine-treated HBV populations reveal population evolution dynamics and demographic history.

    Science.gov (United States)

    Zhu, Yuan O; Aw, Pauline P K; de Sessions, Paola Florez; Hong, Shuzhen; See, Lee Xian; Hong, Lewis Z; Wilm, Andreas; Li, Chen Hao; Hue, Stephane; Lim, Seng Gee; Nagarajan, Niranjan; Burkholder, William F; Hibberd, Martin

    2017-10-27

    Viral populations are complex, dynamic, and fast evolving. The evolution of groups of closely related viruses in a competitive environment is termed quasispecies. To fully understand the role that quasispecies play in viral evolution, characterizing the trajectories of viral genotypes in an evolving population is the key. In particular, long-range haplotype information for thousands of individual viruses is critical; yet generating this information is non-trivial. Popular deep sequencing methods generate relatively short reads that do not preserve linkage information, while third generation sequencing methods have higher error rates that make detection of low frequency mutations a bioinformatics challenge. Here we applied BAsE-Seq, an Illumina-based single-virion sequencing technology, to eight samples from four chronic hepatitis B (CHB) patients - once before antiviral treatment and once after viral rebound due to resistance. With single-virion sequencing, we obtained 248-8796 single-virion sequences per sample, which allowed us to find evidence for both hard and soft selective sweeps. We were able to reconstruct population demographic history that was independently verified by clinically collected data. We further verified four of the samples independently through PacBio SMRT and Illumina Pooled deep sequencing. Overall, we showed that single-virion sequencing yields insight into viral evolution and population dynamics in an efficient and high throughput manner. We believe that single-virion sequencing is widely applicable to the study of viral evolution in the context of drug resistance and host adaptation, allows differentiation between soft or hard selective sweeps, and may be useful in the reconstruction of intra-host viral population demographic history.

  6. Quantiprot - a Python package for quantitative analysis of protein sequences.

    Science.gov (United States)

    Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold

    2017-07-17

    The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties define a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates the distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where a large number of sequences generated by the model can be compared to actually observed sequences.
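
    The n-gram distribution and Zipf's law coefficient mentioned in the abstract can be illustrated with a minimal sketch in plain Python. This does not use the Quantiprot API; the function names `ngram_counts` and `zipf_slope` are hypothetical, and the slope is estimated here by an ordinary least-squares fit of log frequency against log rank, which is one common convention and an assumption about how such a coefficient might be computed.

```python
import math
from collections import Counter


def ngram_counts(sequence, n=2):
    """Count overlapping n-grams in a sequence string (hypothetical helper)."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def zipf_slope(counts):
    """Least-squares slope of log(frequency) versus log(rank).

    A slope near -1 corresponds to the classic Zipf's law shape.
    """
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct n-grams")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Toy peptide, chosen only for illustration.
    counts = ngram_counts("MKVLAAGIVGLLLAAGKVLA", n=2)
    print(counts.most_common(3))
    print(zipf_slope(counts))
```

    Each sequence then maps to a point in a feature space (for example, one slope per n-gram size), which is the kind of representation the abstract proposes for alignment-free comparison.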

  7. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
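
    The first two analysis steps described above (cleanup and clustering) can be sketched as follows. This is not DSAP's implementation: the adaptor sequence, the homopolymer threshold of six bases, the minimum tag length, and the function names `clean_tag` and `cluster_tags` are all assumptions made for illustration. The input is assumed to be tab-delimited lines of the form `sequence<TAB>count`, as the abstract describes.

```python
import re
from collections import defaultdict

# Illustrative 3' adaptor sequence; an assumption, not taken from DSAP.
ADAPTOR = "TCGTATGCCGTCTTCTGCTTG"
# Trailing run of six or more identical A/T/C/G/N bases (threshold is an assumption).
HOMOPOLYMER_TAIL = re.compile(r"(A{6,}|T{6,}|C{6,}|G{6,}|N{6,})$")


def clean_tag(seq, adaptor=ADAPTOR, min_len=15):
    """Step (i): strip the adaptor and any trailing homopolymer run.

    Returns None when the cleaned tag is shorter than min_len.
    """
    seq = seq.upper()
    pos = seq.find(adaptor)
    if pos != -1:
        seq = seq[:pos]
    seq = HOMOPOLYMER_TAIL.sub("", seq)
    return seq if len(seq) >= min_len else None


def cluster_tags(lines):
    """Step (ii): group identical cleaned tags and sum their copy numbers."""
    clusters = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        seq, count = line.split("\t")
        cleaned = clean_tag(seq)
        if cleaned is not None:
            clusters[cleaned] += int(count)
    return dict(clusters)


if __name__ == "__main__":
    example = [
        "TGAGGTAGTAGGTTGTATAGTT" + ADAPTOR + "\t10",
        "TGAGGTAGTAGGTTGTATAGTTAAAAAAAA\t5",
        "ACGT\t3",
    ]
    print(cluster_tags(example))
```

    Steps (iii) and (iv), matching against Rfam and miRBase, would take the resulting cluster dictionary as input to a homology search and are not sketched here.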

  8. EVOLUTION OF GROUP GALAXIES FROM THE FIRST RED-SEQUENCE CLUSTER SURVEY

    Energy Technology Data Exchange (ETDEWEB)

    Li, I. H. [Centre for Astrophysics and Supercomputing, Swinburne University of Technology, P.O. Box 218, Hawthorn, Victoria 3122 (Australia); Yee, H. K. C. [Department of Astronomy and Astrophysics, University of Toronto, 50 St. George Street, Toronto, ON M5S 3H4 (Canada); Hsieh, B. C. [Institute of Astronomy and Astrophysics, Academia Sinica, P.O. Box 23-141, Taipei 106, Taiwan (China); Gladders, M., E-mail: tli@astro.swin.edu.au, E-mail: hyee@astro.utoronto.ca, E-mail: bchsieh@asiaa.sinica.edu.tw, E-mail: gladders@oddjob.uchicago.edu [Department of Astronomy and Astrophysics, University of Chicago, 5640 S. Ellis Ave, Chicago, IL 60637 (United States)

    2012-04-20

    We study the evolution of the red-galaxy fraction ($f_{\rm red}$) in 905 galaxy groups with $0.15 \le z < 0.52$. The galaxy groups are identified by the 'probability friends-of-friends' algorithm from the first Red-Sequence Cluster Survey (RCS1) photometric-redshift sample. There is a high degree of uniformity in the properties of the red sequence of the group galaxies, indicating that the luminous red-sequence galaxies in the groups are already in place by $z \approx 0.5$ and that they have a formation epoch of $z \gtrsim 2$. In general, groups at lower redshifts exhibit larger $f_{\rm red}$ than those at higher redshifts, showing a group Butcher-Oemler effect. We investigate the evolution of $f_{\rm red}$ by examining its dependence on four parameters, one of which can be classified as intrinsic and three of which can be classified as environmental: galaxy stellar mass ($M_*$), total group stellar mass ($M_{*,\rm grp}$, a proxy for group halo mass), normalized group-centric radius ($r_{\rm grp}$), and local galaxy density ($\Sigma_5$). We find that $M_*$ is the dominant parameter such that there is a strong correlation between $f_{\rm red}$ and galaxy stellar mass. Furthermore, the dependence of $f_{\rm red}$ on the environmental parameters is also a strong function of $M_*$. Massive galaxies ($M_* \gtrsim 10^{11}\,M_\odot$) show little dependence of $f_{\rm red}$ on $r_{\rm grp}$, $M_{*,\rm grp}$, and $\Sigma_5$ over the redshift range. The dependence of $f_{\rm red}$ on these parameters is primarily seen for galaxies with lower masses, especially for $M_* \lesssim 10^{10.6}\,M_\odot$. We observe an apparent 'group down-sizing' effect, in that galaxies in lower-mass halos, after controlling for galaxy stellar mass, have lower $f_{\rm red}$. We find a dependence of $f_{\rm red}$ on both $r_{\rm grp}$ and $\Sigma_5$ after the other parameters are controlled. At a fixed $r_{\rm grp}$, there is a significant dependence of $f_{\rm red}$ on $\Sigma_5$

  9. Sequencing and analysis of the Mediterranean amphioxus (Branchiostoma lanceolatum) transcriptome.

    Directory of Open Access Journals (Sweden)

    Silvan Oulion

    Full Text Available BACKGROUND: The basally divergent phylogenetic position of amphioxus (Cephalochordata), as well as its conserved morphology, development and genetics, make it the best proxy for the chordate ancestor. In particular, studies using the amphioxus model help our understanding of vertebrate evolution and development. Thus, interest in the amphioxus model led to the characterization of both the transcriptome and complete genome sequence of the American species, Branchiostoma floridae. However, recent technical improvements allowing induction of spawning in the laboratory during the breeding season on a daily basis with the Mediterranean species Branchiostoma lanceolatum have encouraged European Evo-Devo researchers to adopt this species as a model even though no genomic or transcriptomic data have been available. To fill this need we used the pyrosequencing method to characterize the B. lanceolatum transcriptome and then compared our results with the published transcriptome of B. floridae. RESULTS: Starting with total RNA from nine different developmental stages of B. lanceolatum, a normalized cDNA library was constructed and sequenced on Roche GS FLX (Titanium mode). Around 1.4 million reads were produced and assembled into 70,530 contigs (average length of 490 bp). Overall 37% of the assembled sequences were annotated by BlastX and their Gene Ontology terms were determined. These results were then compared to genomic and transcriptomic data of B. floridae to assess similarities and specificities of each species. CONCLUSION: We obtained a high-quality amphioxus (B. lanceolatum) reference transcriptome using a high throughput sequencing approach. We found that 83% of the predicted genes in the B. floridae complete genome sequence are also found in the B. lanceolatum transcriptome, while only 41% were found in the B. floridae transcriptome obtained with traditional Sanger based sequencing. Therefore, given the high degree of sequence conservation

  10. Whole-genome sequencing reveals the mechanisms for evolution of streptomycin resistance in Lactobacillus plantarum.

    Science.gov (United States)

    Zhang, Fuxin; Gao, Jiayuan; Wang, Bini; Huo, Dongxue; Wang, Zhaoxia; Zhang, Jiachao; Shao, Yuyu

    2018-01-31

    In this research, we investigated the evolution of streptomycin resistance in Lactobacillus plantarum ATCC14917, which was passaged in medium containing a gradually increasing concentration of streptomycin. After 25 d, the minimum inhibitory concentration (MIC) of L. plantarum ATCC14917 had reached 131,072 µg/mL, which was 8,192-fold higher than the MIC of the original parent isolate. The highly resistant L. plantarum ATCC14917 isolate was then passaged in antibiotic-free medium to determine the stability of resistance. The MIC value of the L. plantarum ATCC14917 isolate decreased to 2,048 µg/mL after 35 d but remained constant thereafter, indicating that resistance was irreversible even in the absence of selection pressure. Whole-genome sequencing of parent isolates, control isolates, and isolates following passage was used to study the resistance mechanism of L. plantarum ATCC14917 to streptomycin and adaptation in the presence and absence of selection pressure. Five mutated genes (single nucleotide polymorphisms and structural variants) were verified in highly resistant L. plantarum ATCC14917 isolates, which were related to ribosomal protein S12, LPXTG-motif cell wall anchor domain protein, LrgA family protein, Ser/Thr phosphatase family protein, and a hypothetical protein that may correlate with resistance to streptomycin. After passage in streptomycin-free medium, only the mutant gene encoding ribosomal protein S12 remained; the other 4 mutant genes had reverted to the wild type as found in the parent isolate. Although the MIC value of L. plantarum ATCC14917 was reduced in the absence of selection pressure, it remained 128-fold higher than the MIC value of the parent isolate, indicating that ribosomal protein S12 may play an important role in streptomycin resistance. Using the mobile elements database, we demonstrated that streptomycin resistance-related genes in L. plantarum ATCC14917 were not located on mobile elements. This research offers a way of

  11. The evolution of proteins from random amino acid sequences: II. Evidence from the statistical distributions of the lengths of modern protein sequences.

    Science.gov (United States)

    White, S H

    1994-04-01

    entirely consistent with the observations of Brown et al. (1990a,b, Nucleic Acids Res 18:2079-2086 and 18:6339-6345) which show that tetra-nucleotides (stop codon plus following nucleotide) are the actual signals for termination of translation in both prokaryotes and eukaryotes. Second, the strong dependence of statistical length distributions on sequence-termination signaling codes implies that the evolution of stop codons and translation-termination processes was as important as gene splicing in early evolution. Third, because the theory is based upon a simple no-exon stochastic model, it provides a plausible alternative to a limited universe of exons from which all proteins evolved by gene duplication and exon splicing (Dorit et al. 1990, Science 250:1377-1382).

  12. Mammalian comparative sequence analysis of the Agrp locus.

    Directory of Open Access Journals (Sweden)

    Christopher B Kaelin

    2007-08-01

    Full Text Available Agouti-related protein encodes a neuropeptide that stimulates food intake. Agrp expression in the brain is restricted to neurons in the arcuate nucleus of the hypothalamus and is elevated by states of negative energy balance. The molecular mechanisms underlying Agrp regulation, however, remain poorly defined. Using a combination of transgenic and comparative sequence analysis, we have previously identified a 760 bp conserved region upstream of Agrp which contains STAT binding elements that participate in Agrp transcriptional regulation. In this study, we attempt to improve the specificity for detecting conserved elements in this region by comparing genomic sequences from 10 mammalian species. Our analysis reveals a symmetrical organization of conserved sequences upstream of Agrp, which cluster into two inverted repeat elements. Conserved sequences within these elements suggest a role for homeodomain proteins in the regulation of Agrp and provide additional targets for functional evaluation.

  13. Deep Sequencing Analysis of Nucleolar Small RNAs: Bioinformatics.

    Science.gov (United States)

    Bai, Baoyan; Laiho, Marikki

    2016-01-01

    Small RNAs (size 20-30 nt) of various types have been actively investigated in recent years, and their subcellular compartmentalization and relative concentrations are likely to be of importance to their cellular and physiological functions. Comprehensive data on this subset of the transcriptome can only be obtained by application of high-throughput sequencing, which yields data that are inherently complex and multidimensional, as sequence composition, length, and abundance all bear on small RNA function. Subsequent data analysis, hypothesis testing, and presentation/visualization of the results are correspondingly challenging. We have constructed small RNA libraries derived from different cellular compartments, including the nucleolus, and asked whether small RNAs exist in the nucleolus and whether they are distinct from cytoplasmic and nuclear small RNAs, the miRNAs. Here, we present a workflow for analysis of small RNA sequencing data generated by the Ion Torrent PGM sequencer from samples derived from different cellular compartments.

  14. DNA shotgun sequencing analysis of Garcinia mangostana L. variety Mesta

    Directory of Open Access Journals (Sweden)

    Syuhaidah Abu Bakar

    2017-06-01

    Full Text Available Mangosteen (Garcinia mangostana Linn.) is an ultra-tropical tree characterized by its unique dark purple fruits with white flesh. The xanthone-rich purple pericarp tissue contains valuable compounds with medicinal properties. Following previously reported genome sequencing of a common variety of mangosteen [1], we performed another whole genome sequencing of a commercially popular variety of this fruit species (var. Mesta) for comparative analysis of its genome composition. Raw reads of the DNA sequencing project were deposited to SRA database with the accession number SRX2709728.

  15. Pleiotropy constrains the evolution of protein but not regulatory sequences in a transcription regulatory network influencing complex social behaviours

    Directory of Open Access Journals (Sweden)

    Daria Molodtsova

    2014-12-01

    Full Text Available It is increasingly apparent that genes and networks that influence complex behaviour are evolutionarily conserved, which is paradoxical considering that behaviour is labile over evolutionary timescales. How does adaptive change in behaviour arise if behaviour is controlled by conserved, pleiotropic, and likely evolutionarily constrained genes? Pleiotropy and connectedness are known to constrain the general rate of protein evolution, prompting some to suggest that the evolution of complex traits, including behaviour, is fuelled by regulatory sequence evolution. However, we seldom have data on the strength of selection on mutations in coding and regulatory sequences, and this hinders our ability to study how pleiotropy influences coding and regulatory sequence evolution. Here we use population genomics to estimate the strength of selection on coding and regulatory mutations for a transcriptional regulatory network that influences complex behaviour of honey bees. We found that replacement mutations in highly connected transcription factors and target genes experience significantly stronger negative selection relative to weakly connected transcription factors and targets. Adaptively evolving proteins were significantly more likely to reside at the periphery of the regulatory network, while proteins with signs of negative selection were near the core of the network. Interestingly, connectedness and network structure had minimal influence on the strength of selection on putative regulatory sequences for both transcription factors and their targets. Our study indicates that adaptive evolution of complex behaviour can arise because of positive selection on protein-coding mutations in peripheral genes, and on regulatory sequence mutations in both transcription factors and their targets throughout the network.

  16. The sequence and analysis of duplication rich human chromosome 16

    Energy Technology Data Exchange (ETDEWEB)

    Martin, J; Han, C; Gordon, L A; Terry, A; Prabhakar, S; She, X; Xie, G; Hellsten, U; Chan, Y M; Altherr, M; Couronne, O; Aerts, A; Bajorek, E; Black, S; Blumer, H; Branscomb, E; Brown, N; Bruno, W J; Buckingham, J; Callen, D F; Campbell, C S; Campbell, M L; Campbell, E W; Caoile, C; Challacombe, J F; Chasteen, L A; Chertkov, O; Chi, H C; Christensen, M; Clark, L M; Cohn, J D; Denys, M; Detter, J C; Dickson, M; Dimitrijevic-Bussod, M; Escobar, J; Fawcett, J J; Flowers, D; Fotopulos, D; Glavina, T; Gomez, M; Gonzales, E; Goodstein, D; Goodwin, L A; Grady, D L; Grigoriev, I; Groza, M; Hammon, N; Hawkins, T; Haydu, L; Hildebrand, C E; Huang, W; Israni, S; Jett, J; Jewett, P B; Kadner, K; Kimball, H; Kobayashi, A; Krawczyk, M; Leyba, T; Longmire, J L; Lopez, F; Lou, Y; Lowry, S; Ludeman, T; Manohar, C F; Mark, G A; McMurray, K L; Meincke, L J; Morgan, J; Moyzis, R K; Mundt, M O; Munk, A C; Nandkeshwar, R D; Pitluck, S; Pollard, M; Predki, P; Parson-Quintana, B; Ramirez, L; Rash, S; Retterer, J; Ricke, D O; Robinson, D; Rodriguez, A; Salamov, A; Saunders, E H; Scott, D; Shough, T; Stallings, R L; Stalvey, M; Sutherland, R D; Tapia, R; Tesmer, J G; Thayer, N; Thompson, L S; Tice, H; Torney, D C; Tran-Gyamfi, M; Tsai, M; Ulanovsky, L E; Ustaszewska, A; Vo, N; White, P S; Williams, A L; Wills, P L; Wu, J; Wu, K; Yang, J; DeJong, P; Bruce, D; Doggett, N A; Deaven, L; Schmutz, J; Grimwood, J; Richardson, P; Rokhsar, D S; Eichler, E E; Gilna, P; Lucas, S M; Myers, R M; Rubin, E M; Pennacchio, L A

    2005-04-06

    Human chromosome 16 features one of the highest levels of segmentally duplicated sequence among the human autosomes. We report here the 78,884,754 base pairs of finished chromosome 16 sequence, representing over 99.9% of its euchromatin. Manual annotation revealed 880 protein-coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes, and 3 RNA pseudogenes. These genes include metallothionein, cadherin, and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobase pairs were identified and result in gene content differences among humans. While the segmental duplications of chromosome 16 are enriched in the relatively gene poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events likely to have had an impact on the evolution of primates and human disease susceptibility.

  17. DNA sequence and analysis of human chromosome 9

    OpenAIRE

    Humphray, S. J.; Oliver, K.; Hunt, A. R.; Plumb, R. W.; Loveland, J. E.; Howe, K. L.; Andrews, T. D.; Searle, S.; Hunt, S. E.; Scott, C. E.; Jones, M. C.; Ainscough, R.; Almeida, J. P.; Ambrose, K. D.; Ashwell, R. I. S.

    2004-01-01

    Chromosome 9 is highly structurally polymorphic. It contains the largest autosomal block of heterochromatin, which is heteromorphic in 6–8% of humans, whereas pericentric inversions occur in more than 1% of the population. The finished euchromatic sequence of chromosome 9 comprises 109,044,351 base pairs and represents >99.6% of the region. Analysis of the sequence reveals many intra- and interchromosomal duplications, including segmental duplications adjacent to both the centromere and the l...

  18. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed third- and fourth-year undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  19. Molecular cloning and sequence analysis of the cat myostatin gene ...

    African Journals Online (AJOL)

    ... MEF3, MTBF, PAX3, SMAD, HBOX, HOMF and TEAF motifs. Comparative analysis for some motifs showed both conservations and differences among cat, horse, porcine and human. Key words: Cat, myostatin 5'-regulatory region, molecular cloning, sequence analysis and comparison, transcription factor binding sites.

  20. An optimum analysis sequence for environmental gamma-ray spectrometry

    Energy Technology Data Exchange (ETDEWEB)

    De la Torre, F.; Rios M, C.; Ruvalcaba A, M. G.; Mireles G, F.; Saucedo A, S.; Davila R, I.; Pinedo, J. L., E-mail: fta777@hotmail.co [Universidad Autonoma de Zacatecas, Centro Regional de Estudios Nucleares, Calle Cipres No. 10, Fracc. La Penuela, 98068 Zacatecas (Mexico)

    2010-10-15

    This work aims to obtain an optimum analysis sequence for environmental gamma-ray spectroscopy by means of Genie 2000 (Canberra). Twenty different analysis sequences were customized using different peak area percentages and different algorithms for (1) peak finding and (2) peak area determination, with or without the use of a library of common gamma-ray emitters in environmental samples (based on evaluated nuclear data). The use of an optimum analysis sequence with certified nuclear information avoids the problems caused by significant variations in the out-of-date nuclear parameters of commercial software libraries. Interference-free gamma-ray energies with absolute emission probabilities greater than 3.75% were included in the customized library. The gamma-ray spectroscopy system (based on a Ge Re-3522 Canberra detector) was calibrated both in energy and shape by means of the IAEA-2002 reference spectra for software intercomparison. To test the performance of the analysis sequences, the IAEA-2002 reference spectrum was used. The z-score and the reduced $\chi^2$ criteria were used to determine the optimum analysis sequence. The results show an appreciable variation in the peak area determinations and their corresponding uncertainties. In particular, the combination of second derivative peak locate with simple peak area integration algorithms provides the greatest accuracy. Lower accuracy comes from the combination of library directed peak locate algorithm and Genie's Gamma-M peak area determination. (Author)
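
    The two ranking criteria named in the abstract, the z-score and the reduced chi-square, can be sketched as follows. The abstract does not give the exact formulas used, so the definitions here are assumptions based on common practice: each measured peak area is compared with its reference value, with the two standard uncertainties combined in quadrature, and the reduced chi-square is the mean of the squared z-scores. The function names are hypothetical.

```python
import math


def z_score(measured, u_measured, reference, u_reference):
    """Difference from the reference in units of the combined uncertainty."""
    return (measured - reference) / math.hypot(u_measured, u_reference)


def reduced_chi_square(measured, u_measured, reference, u_reference, dof=None):
    """Sum of squared z-scores divided by the degrees of freedom.

    dof defaults to the number of peaks (an assumption; some conventions
    subtract the number of fitted parameters).
    """
    zs = [
        z_score(m, um, r, ur)
        for m, um, r, ur in zip(measured, u_measured, reference, u_reference)
    ]
    n = dof if dof is not None else len(zs)
    return sum(z * z for z in zs) / n


if __name__ == "__main__":
    # Made-up peak areas, for illustration only.
    areas = [103.0, 95.0]
    u_areas = [3.0, 3.0]
    ref = [100.0, 100.0]
    u_ref = [4.0, 4.0]
    print(reduced_chi_square(areas, u_areas, ref, u_ref))
```

    Under these definitions, an analysis sequence whose reduced chi-square is close to 1 reports peak areas and uncertainties that are statistically consistent with the reference values.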

  1. Sequencing and Comparative Analysis of a Conserved Syntenic Segment in the Solanaceae

    Science.gov (United States)

    Wang, Ying; Diehl, Adam; Wu, Feinan; Vrebalov, Julia; Giovannoni, James; Siepel, Adam; Tanksley, Steven D.

    2008-01-01

    Comparative genomics is a powerful tool for gaining insight into genomic function and evolution. However, in plants, sequence data that would enable detailed comparisons of both coding and noncoding regions have been limited in availability. Here we report the generation and analysis of sequences for an unduplicated conserved syntenic segment (CSS) in the genomes of five members of the agriculturally important plant family Solanaceae. This CSS includes a 105-kb region of tomato chromosome 2 and orthologous regions of the potato, eggplant, pepper, and petunia genomes. With a total neutral divergence of 0.73–0.78 substitutions/site, these sequences are similar enough that most noncoding regions can be aligned, yet divergent enough to be informative about evolutionary dynamics and selective pressures. The CSS contains 17 distinct genes with generally conserved order and orientation, but with numerous small-scale differences between species. Our analysis indicates that the last common ancestor of these species lived ∼27–36 million years ago, that more than one-third of short genomic segments (5–15 bp) are under selection, and that more than two-thirds of selected bases fall in noncoding regions. In addition, we identify genes under positive selection and analyze hundreds of conserved noncoding elements. This analysis provides a window into 30 million years of plant evolution in the absence of polyploidization. PMID:18723883

  2. Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of mammalian reproduction and development.

    Science.gov (United States)

    Renfree, Marilyn B; Papenfuss, Anthony T; Deakin, Janine E; Lindsay, James; Heider, Thomas; Belov, Katherine; Rens, Willem; Waters, Paul D; Pharo, Elizabeth A; Shaw, Geoff; Wong, Emily S W; Lefèvre, Christophe M; Nicholas, Kevin R; Kuroki, Yoko; Wakefield, Matthew J; Zenger, Kyall R; Wang, Chenwei; Ferguson-Smith, Malcolm; Nicholas, Frank W; Hickford, Danielle; Yu, Hongshi; Short, Kirsty R; Siddle, Hannah V; Frankenberg, Stephen R; Chew, Keng Yih; Menzies, Brandon R; Stringer, Jessica M; Suzuki, Shunsuke; Hore, Timothy A; Delbridge, Margaret L; Patel, Hardip R; Mohammadi, Amir; Schneider, Nanette Y; Hu, Yanqiu; O'Hara, William; Al Nadaf, Shafagh; Wu, Chen; Feng, Zhi-Ping; Cocks, Benjamin G; Wang, Jianghui; Flicek, Paul; Searle, Stephen M J; Fairley, Susan; Beal, Kathryn; Herrero, Javier; Carone, Dawn M; Suzuki, Yutaka; Sugano, Sumio; Toyoda, Atsushi; Sakaki, Yoshiyuki; Kondo, Shinji; Nishida, Yuichiro; Tatsumoto, Shoji; Mandiou, Ion; Hsu, Arthur; McColl, Kaighin A; Lansdell, Benjamin; Weinstock, George; Kuczek, Elizabeth; McGrath, Annette; Wilson, Peter; Men, Artem; Hazar-Rethinam, Mehlika; Hall, Allison; Davis, John; Wood, David; Williams, Sarah; Sundaravadanam, Yogi; Muzny, Donna M; Jhangiani, Shalini N; Lewis, Lora R; Morgan, Margaret B; Okwuonu, Geoffrey O; Ruiz, San Juana; Santibanez, Jireh; Nazareth, Lynne; Cree, Andrew; Fowler, Gerald; Kovar, Christie L; Dinh, Huyen H; Joshi, Vandita; Jing, Chyn; Lara, Fremiet; Thornton, Rebecca; Chen, Lei; Deng, Jixin; Liu, Yue; Shen, Joshua Y; Song, Xing-Zhi; Edson, Janette; Troon, Carmen; Thomas, Daniel; Stephens, Amber; Yapa, Lankesha; Levchenko, Tanya; Gibbs, Richard A; Cooper, Desmond W; Speed, Terence P; Fujiyama, Asao; Graves, Jennifer A M; O'Neill, Rachel J; Pask, Andrew J; Forrest, Susan M; Worley, Kim C

    2011-08-29

    We present the genome sequence of the tammar wallaby, Macropus eugenii, which is a member of the kangaroo family and the first representative of the iconic hopping mammals that symbolize Australia to be sequenced. The tammar has many unusual biological characteristics, including the longest period of embryonic diapause of any mammal, extremely synchronized seasonal breeding and prolonged and sophisticated lactation within a well-defined pouch. Like other marsupials, it gives birth to highly altricial young, and has a small number of very large chromosomes, making it a valuable model for genomics, reproduction and development. The genome has been sequenced to 2 × coverage using Sanger sequencing, enhanced with additional next generation sequencing and the integration of extensive physical and linkage maps to build the genome assembly. We also sequenced the tammar transcriptome across many tissues and developmental time points. Our analyses of these data shed light on mammalian reproduction, development and genome evolution: there is innovation in reproductive and lactational genes, rapid evolution of germ cell genes, and incomplete, locus-specific X inactivation. We also observe novel retrotransposons and a highly rearranged major histocompatibility complex, with many class I genes located outside the complex. Novel microRNAs in the tammar HOX clusters uncover new potential mammalian HOX regulatory elements. Analyses of these resources enhance our understanding of marsupial gene evolution, identify marsupial-specific conserved non-coding elements and critical genes across a range of biological systems, including reproduction, development and immunity, and provide new insight into marsupial and mammalian biology and genome evolution.

  3. Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of mammalian reproduction and development

    Science.gov (United States)

    2011-01-01

    Background: We present the genome sequence of the tammar wallaby, Macropus eugenii, which is a member of the kangaroo family and the first representative of the iconic hopping mammals that symbolize Australia to be sequenced. The tammar has many unusual biological characteristics, including the longest period of embryonic diapause of any mammal, extremely synchronized seasonal breeding and prolonged and sophisticated lactation within a well-defined pouch. Like other marsupials, it gives birth to highly altricial young, and has a small number of very large chromosomes, making it a valuable model for genomics, reproduction and development. Results: The genome has been sequenced to 2 × coverage using Sanger sequencing, enhanced with additional next generation sequencing and the integration of extensive physical and linkage maps to build the genome assembly. We also sequenced the tammar transcriptome across many tissues and developmental time points. Our analyses of these data shed light on mammalian reproduction, development and genome evolution: there is innovation in reproductive and lactational genes, rapid evolution of germ cell genes, and incomplete, locus-specific X inactivation. We also observe novel retrotransposons and a highly rearranged major histocompatibility complex, with many class I genes located outside the complex. Novel microRNAs in the tammar HOX clusters uncover new potential mammalian HOX regulatory elements. Conclusions: Analyses of these resources enhance our understanding of marsupial gene evolution, identify marsupial-specific conserved non-coding elements and critical genes across a range of biological systems, including reproduction, development and immunity, and provide new insight into marsupial and mammalian biology and genome evolution. PMID:21854559

  4. Phylogenetic Analysis of the Bifidobacterium Genus Using Glycolysis Enzyme Sequences

    Science.gov (United States)

    Brandt, Katelyn; Barrangou, Rodolphe

    2016-01-01

    Bifidobacteria are important members of the human gastrointestinal tract that promote the establishment of a healthy microbial consortium in the gut of infants. Recent studies have established that the Bifidobacterium genus is a polymorphic phylogenetic clade, which encompasses a diversity of species and subspecies that encode a broad range of proteins implicated in complex and non-digestible carbohydrate uptake and catabolism, ranging from human breast milk oligosaccharides to plant fibers. Recent genomic studies have created a need to properly place Bifidobacterium species in a phylogenetic tree. Current approaches, based on core-genome analyses, come at the cost of intensive sequencing and demanding analytical processes. Here, we propose a typing method based on sequences of glycolysis genes and the proteins they encode, to provide insights into diversity, typing, and phylogeny in this complex and broad genus. We show that glycolysis genes occur broadly in these genomes, to encode the machinery necessary for the biochemical spine of the cell, and provide a robust phylogenetic marker. Glycolytic sequence-based trees are congruent with both the classical 16S rRNA phylogeny and core genome-based strain clustering. Furthermore, these glycolysis markers can also be used to provide insights into the adaptive evolution of this genus, especially with regard to trends toward a high GC content. This streamlined method may open new avenues for phylogenetic studies on a broad scale, given the widespread occurrence of the glycolysis pathway in bacteria, and the diversity of the sequences they encode. PMID:27242688
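
    The GC-content trend mentioned in this abstract rests on a simple per-sequence statistic. The following is a minimal sketch of that statistic only, not the authors' pipeline; the function name gc_content and the choice to ignore ambiguous bases such as N are assumptions made for illustration.

```python
def gc_content(seq):
    """Return the fraction of G and C among the unambiguous bases of seq.

    Hypothetical helper: ambiguous symbols (e.g. N) are ignored rather
    than counted in the denominator.
    """
    seq = seq.upper()
    # Count only A, C, G and T so that gaps or N do not dilute the ratio.
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / total


if __name__ == "__main__":
    # Toy sequence, not real Bifidobacterium data.
    print(round(gc_content("GGCCAT"), 3))
```

    Comparing this value across the glycolysis genes of different species would be one way to look at such a trend, under the stated assumptions.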

  5. Characterization of shale gas enrichment in the Wufeng Formation–Longmaxi Formation in the Sichuan Basin of China and evaluation of its geological construction–transformation evolution sequence

    Directory of Open Access Journals (Sweden)

    Zhiliang He

    2017-02-01

    Shale gas in Upper Ordovician Wufeng Formation–Lower Silurian Longmaxi Formation in the Sichuan Basin is one of the key strata being explored and developed in China, where shale gas reservoirs have been found in Fuling, Weiyuan, Changning and Zhaotong. Characteristics of shale gas enrichment in the formation shown by detailed profiling and analysis are summarized as “high, handsome and rich”. “High” mainly refers to the high quality of original materials for the formation of shale with excellent key parameters, including the good type and high abundance of organic matters, high content of brittle minerals and moderate thermal evolution. “Handsome” means late and weak deformation, favorable deformation mode and structure, and appropriate uplift and current burial depth. “Rich” includes high gas content, high formation pressure coefficient, good reservoir property, favorable reservoir scale transformation and high initial and final output, with relative ease of development and obvious economic benefit. For shale gas enrichment and high yield, it is important that the combination of shale was deposited and formed in excellent conditions (geological construction), and then underwent appropriate tectonic deformation, uplift, and erosion (geological transformation). Evaluation based on geological construction (evolution sequence from formation to the reservoir) includes sequence stratigraphy and sediment, hydrocarbon generation and formation of reservoir pores. Based on geological transformation (evolution sequence from the reservoir to preservation), the strata should be evaluated for structural deformation, the formation of reservoir fracture and preservation of shale gas. The evaluation of the “construction - transformation” sequence is to cover the whole process of shale gas formation and preservation. This way, both positive and negative effects of the formation-transformation sequence on shale gas are assessed. The evaluation

  6. Bioinformatics analysis of circulating cell-free DNA sequencing data.

    Science.gov (United States)

    Chan, Landon L; Jiang, Peiyong

    2015-10-01

    The discovery of cell-free DNA molecules in plasma has opened up numerous opportunities in noninvasive diagnosis. Cell-free DNA molecules have become increasingly recognized as promising biomarkers for detection and management of many diseases. The advent of next generation sequencing has provided unprecedented opportunities to scrutinize the characteristics of cell-free DNA molecules in plasma in a genome-wide fashion and at single-base resolution. Consequently, clinical applications of circulating cell-free DNA analysis have not only revolutionized noninvasive prenatal diagnosis but also facilitated cancer detection and monitoring toward an era of blood-based personalized medicine. With the remarkably increasing throughput and lowering cost of next generation sequencing, bioinformatics analysis becomes increasingly demanding to understand the large amount of data generated by these sequencing platforms. In this Review, we highlight the major bioinformatics algorithms involved in the analysis of cell-free DNA sequencing data. Firstly, we briefly describe the biological properties of these molecules and provide an overview of the general bioinformatics approach for the analysis of cell-free DNA. Then, we discuss the specific upstream bioinformatics considerations concerning the analysis of sequencing data of circulating cell-free DNA, followed by further detailed elaboration on each key clinical situation in noninvasive prenatal diagnosis and cancer management where downstream bioinformatics analysis is heavily involved. We also discuss bioinformatics analysis as well as clinical applications of the newly developed massively parallel bisulfite sequencing of cell-free DNA. Finally, we offer our perspectives on the future development of bioinformatics in noninvasive diagnosis. Copyright © 2015 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.

  7. Whole genome sequencing reveals complex evolution patterns of multidrug-resistant Mycobacterium tuberculosis Beijing strains in patients.

    Directory of Open Access Journals (Sweden)

    Matthias Merker

    Multidrug-resistant (MDR) Mycobacterium tuberculosis complex (MTBC) strains represent a major threat for tuberculosis (TB) control. Treatment of MDR-TB patients is long and less effective, resulting in a significant number of treatment failures. The development of further resistances leads to extensively drug-resistant (XDR) variants. However, data on the individual reasons for treatment failure, e.g. an induced mutational burst, and on the evolution of bacteria in the patient are only sparsely available. To address this question, we investigated the intra-patient evolution of serial MTBC isolates obtained from three MDR-TB patients undergoing longitudinal treatment, finally leading to XDR-TB. Sequential isolates displayed identical IS6110 fingerprint patterns, suggesting the absence of exogenous re-infection. We utilized whole genome sequencing (WGS) to screen for variations in three isolates from Patient A and four isolates each from Patients B and C. Acquired polymorphisms were subsequently validated in up to 15 serial isolates by Sanger sequencing. We determined eight (Patient A) and nine (Patient B) polymorphisms, which occurred in a stepwise manner during the course of the therapy and were linked to resistance or a potential compensatory mechanism. For both patients, our analysis revealed the long-term co-existence of clonal subpopulations that displayed different drug resistance allele combinations. Out of these, the most resistant clone was fixed in the population. In contrast, baseline and follow-up isolates of Patient C were distinguished each by eleven unique polymorphisms, indicating an exogenous re-infection with an XDR strain not detected by IS6110 RFLP typing. Our study demonstrates that intra-patient microevolution of MDR-MTBC strains under longitudinal treatment is more complex than previously anticipated. However, a mutator phenotype was not detected. The presence of different subpopulations might confound phenotypic and

  8. Transcriptome sequencing and De Novo analysis of Youngia japonica using the illumina platform.

    Directory of Open Access Journals (Sweden)

    Yulan Peng

    Youngia japonica, a weed species distributed worldwide, has been widely used in traditional Chinese medicine. It is an ideal plant for studying the evolution of Asteraceae plants because of its short life history and abundant source. However, little is known about its evolution and genetic diversity. In this study, de novo transcriptome sequencing was conducted for the first time for the comprehensive analysis of the genetic diversity of Y. japonica. The Y. japonica transcriptome was sequenced using Illumina paired-end sequencing technology. We produced 21,847,909 high-quality reads for Y. japonica and assembled them into contigs. A total of 51,850 unigenes were identified, among which 46,087 were annotated in the NCBI non-redundant protein database and 41,752 were annotated in the Swiss-Prot database. We mapped 9,125 unigenes onto 163 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway database. In addition, 3,648 simple sequence repeats (SSRs) were detected. Our data provide the most comprehensive transcriptome resource currently available for Y. japonica. C4 photosynthesis unigenes were found in the biological process of Y. japonica. There were 5,596 unigenes related to defense response and 1,344 unigenes related to signal transduction mechanisms (10.95%). These data provide insights into the genetic diversity of Y. japonica. Numerous SSRs contributed to the development of novel markers. These data may serve as a new valuable resource for genomic studies on Youngia and, more generally, Cichoraceae.
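
    The abstract reports SSR counts without saying how they were detected. As an illustration of what SSR detection involves, the following is a minimal regular-expression sketch for perfect repeats with motifs of 1 to 3 nucleotides. It is not the tool used in the study; the function name find_ssrs and the minimum repeat counts (10, 6 and 5) are hypothetical choices.

```python
import re

# Hypothetical thresholds: minimum number of repeats per motif length.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Only motifs of the lengths listed in min_repeats are searched, and
    start is a 0-based offset into seq.
    """
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs such as "AA" or "AAA": those are mononucleotide runs.
            if motif_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence built for the example, not transcriptome data.
    toy = "GG" + "A" * 12 + "CC" + "AT" * 7 + "T" + "CAG" * 5 + "TT"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

    Dedicated SSR finders also handle compound and imperfect repeats, which this sketch deliberately leaves out.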

  9. Analysis and Visualization Tool for Targeted Amplicon Bisulfite Sequencing on Ion Torrent Sequencers

    Science.gov (United States)

    Pabinger, Stephan; Ernst, Karina; Pulverer, Walter; Kallmeyer, Rainer; Valdes, Ana M.; Metrustry, Sarah; Katic, Denis; Nuzzo, Angelo; Kriegner, Albert; Vierlinger, Klemens; Weinhaeusel, Andreas

    2016-01-01

    Targeted sequencing of PCR amplicons generated from bisulfite deaminated DNA is a flexible, cost-effective way to study methylation of a sample at single CpG resolution and perform subsequent multi-target, multi-sample comparisons. Currently, no platform specific protocol, support, or analysis solution is provided to perform targeted bisulfite sequencing on a Personal Genome Machine (PGM). Here, we present a novel tool, called TABSAT, for analyzing targeted bisulfite sequencing data generated on Ion Torrent sequencers. The workflow starts with raw sequencing data, performs quality assessment, and uses a tailored version of Bismark to map the reads to a reference genome. The pipeline visualizes results as lollipop plots and is able to deduce specific methylation-patterns present in a sample. The obtained profiles are then summarized and compared between samples. In order to assess the performance of the targeted bisulfite sequencing workflow, 48 samples were used to generate 53 different Bisulfite-Sequencing PCR amplicons from each sample, resulting in 2,544 amplicon targets. We obtained a mean coverage of 282X using 1,196,822 aligned reads. Next, we compared the sequencing results of these targets to the methylation level of the corresponding sites on an Illumina 450k methylation chip. The calculated average Pearson correlation coefficient of 0.91 confirms the sequencing results with one of the industry-leading CpG methylation platforms and shows that targeted amplicon bisulfite sequencing provides an accurate and cost-efficient method for DNA methylation studies, e.g., to provide platform-independent confirmation of Illumina Infinium 450k methylation data. TABSAT offers a novel way to analyze data generated by Ion Torrent instruments and can also be used with data from the Illumina MiSeq platform. It can be easily accessed via the Platomics platform, which offers a web-based graphical user interface along with sample and parameter storage. TABSAT is freely

  10. Analysis and Visualization Tool for Targeted Amplicon Bisulfite Sequencing on Ion Torrent Sequencers.

    Science.gov (United States)

    Pabinger, Stephan; Ernst, Karina; Pulverer, Walter; Kallmeyer, Rainer; Valdes, Ana M; Metrustry, Sarah; Katic, Denis; Nuzzo, Angelo; Kriegner, Albert; Vierlinger, Klemens; Weinhaeusel, Andreas

    2016-01-01

    Targeted sequencing of PCR amplicons generated from bisulfite deaminated DNA is a flexible, cost-effective way to study methylation of a sample at single CpG resolution and perform subsequent multi-target, multi-sample comparisons. Currently, no platform specific protocol, support, or analysis solution is provided to perform targeted bisulfite sequencing on a Personal Genome Machine (PGM). Here, we present a novel tool, called TABSAT, for analyzing targeted bisulfite sequencing data generated on Ion Torrent sequencers. The workflow starts with raw sequencing data, performs quality assessment, and uses a tailored version of Bismark to map the reads to a reference genome. The pipeline visualizes results as lollipop plots and is able to deduce specific methylation-patterns present in a sample. The obtained profiles are then summarized and compared between samples. In order to assess the performance of the targeted bisulfite sequencing workflow, 48 samples were used to generate 53 different Bisulfite-Sequencing PCR amplicons from each sample, resulting in 2,544 amplicon targets. We obtained a mean coverage of 282X using 1,196,822 aligned reads. Next, we compared the sequencing results of these targets to the methylation level of the corresponding sites on an Illumina 450k methylation chip. The calculated average Pearson correlation coefficient of 0.91 confirms the sequencing results with one of the industry-leading CpG methylation platforms and shows that targeted amplicon bisulfite sequencing provides an accurate and cost-efficient method for DNA methylation studies, e.g., to provide platform-independent confirmation of Illumina Infinium 450k methylation data. TABSAT offers a novel way to analyze data generated by Ion Torrent instruments and can also be used with data from the Illumina MiSeq platform. It can be easily accessed via the Platomics platform, which offers a web-based graphical user interface along with sample and parameter storage. TABSAT is freely

  11. General continuous-time Markov model of sequence evolution via insertions/deletions: are alignment probabilities factorable?

    Science.gov (United States)

    Ezawa, Kiyoshi

    2016-08-11

    Insertions and deletions (indels) account for more nucleotide differences between two related DNA sequences than substitutions do, and thus it is imperative to develop a stochastic evolutionary model that enables us to reliably calculate the probability of the sequence evolution through indel processes. Recent indel probabilistic models are mostly based on either hidden Markov models (HMMs) or transducer theories, both of which give the indel component of the probability of a given sequence alignment as a product of either probabilities of column-to-column transitions or block-wise contributions along the alignment. However, it is not a priori clear how these models are related to any genuine stochastic evolutionary model, which describes the stochastic evolution of an entire sequence along the time-axis. Moreover, currently none of these models can fully accommodate biologically realistic features, such as overlapping indels, power-law indel-length distributions, and indel rate variation across regions. Here, we theoretically dissect the ab initio calculation of the probability of a given sequence alignment under a genuine stochastic evolutionary model, more specifically, a general continuous-time Markov model of the evolution of an entire sequence via insertions and deletions. Our model is a simple extension of the general "substitution/insertion/deletion (SID) model". Using the operator representation of indels and the technique of time-dependent perturbation theory, we express the ab initio probability as a summation over all alignment-consistent indel histories. Exploiting the equivalence relations between different indel histories, we find a "sufficient and nearly necessary" set of conditions under which the probability can be factorized into the product of an overall factor and the contributions from regions separated by gapless columns of the alignment, thus providing a sort of generalized HMM. The conditions distinguish evolutionary models with

  12. Validation of Genotyping-By-Sequencing Analysis in Populations of Tetraploid Alfalfa by 454 Sequencing

    Science.gov (United States)

    Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François

    2015-01-01

    Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486
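
    The abstract distinguishes allelic ratios from genotype calls based on diploid allelic dosage. The following sketch shows one way such a call could be derived from read counts at a single SNP. It is not the UNEAK pipeline or the authors' filtering; the function name call_diploid_genotype, the depth cut-off of 11 reads and the 0.1/0.9 ratio band are all hypothetical values chosen for illustration.

```python
def call_diploid_genotype(ref_reads, alt_reads, min_depth=11, het_band=(0.1, 0.9)):
    """Return (alt_ratio, call) for one SNP in one sample.

    call is "ref/ref", "het", "alt/alt", or None when the depth is below
    min_depth. All thresholds are illustrative assumptions, not published
    settings.
    """
    depth = ref_reads + alt_reads
    if depth < min_depth:
        # Too few reads to call the genotype; the ratio itself is still reported.
        ratio = alt_reads / depth if depth else None
        return ratio, None
    ratio = alt_reads / depth
    low, high = het_band
    if ratio <= low:
        return ratio, "ref/ref"
    if ratio >= high:
        return ratio, "alt/alt"
    # Any intermediate ratio is pooled into one heterozygous class, which is
    # why this gives diploid-style dosage rather than tetraploid dosage.
    return ratio, "het"


if __name__ == "__main__":
    # Made-up (ref, alt) read counts for four samples.
    for ref, alt in [(20, 0), (10, 10), (0, 15), (3, 2)]:
        print(ref, alt, call_diploid_genotype(ref, alt))
```

    Resolving the three heterozygous dosage classes of a tetraploid would need ratios precise enough to separate roughly 0.25, 0.5 and 0.75, which is consistent with the abstract's remark that sequencing depth was insufficient for tetraploid dosage.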

  13. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high

  14. The Tangshan Earthquake Sequence and its implications for the evolution of the North China Basin

    Science.gov (United States)

    Nábělek, John; Chen, Wang-Ping; Ye, Hong

    1987-11-01

    The 1976 Tangshan earthquake sequence that occurred on the northern margin of the North China sedimentary basin is one of the strongest and most destructive intracontinental earthquake sequences ever recorded by worldwide long-period seismograph networks. We have studied the source process of the six largest events of this earthquake sequence by formally inverting long-period P and SH wave seismograms from the World-Wide Standardized Seismographic Network. Our analysis shows that the main shock was caused by motion on at least three fault segments, each with a different orientation. The initial, and dominant, faulting occurred on two segments of a north-northeast trending right-lateral strike-slip fault system. The last stage of faulting had a significant component of thrusting and occurred on an east-northeast trending subsidiary fault at the southern end of the main fault. The largest aftershock occurred at the northeastern end of the main fault and had a nearly pure normal faulting mechanism with east-west striking nodal planes. This event occurred in a pull-apart region where the north-northeast trending main strike-slip fault system steps to the right. This region subsided by more than 1 m during the earthquake sequence. To the north of this step the main fault is delineated by numerous aftershocks whose mechanisms are by-and-large consistent with right-lateral slip on this segment. The large event in 1945 (M = 6.3) probably occurred on this part of the fault. A region of substantial subsidence (up to 1.5 m) associated with the earthquake sequence was also observed in the south, where the main fault system takes another step to the right. Two of the large aftershocks occurred nearby and had strike-slip mechanisms. The overall faulting process of the Tangshan sequence is well characterized by right-lateral slip on a set of right-stepping faults. The mechanisms and the points of nucleation of the strongest shocks, and the extent of the ruptures, appear to be

  15. A primary sequence analysis of the ARGONAUTE protein family in plants.

    Directory of Open Access Journals (Sweden)

    Daniel Rodriguez-Leal

    2016-08-01

    Small RNA (sRNA)-mediated gene silencing represents a conserved regulatory mechanism controlling a wide diversity of developmental processes through interactions of sRNAs with proteins of the ARGONAUTE (AGO) family. On the basis of a large phylogenetic analysis that includes 206 AGO genes belonging to 23 plant species, AGO genes group into four clades corresponding to the phylogenetic distribution proposed for the ten family members of Arabidopsis thaliana. A primary analysis of the corresponding protein sequences resulted in 50 sequences of amino acids (blocks) conserved across their linear length. Protein members of the AGO4/6/8/9 and AGO1/10 clades are more conserved than members of the AGO5 and AGO2/3/7 clades. In addition to blocks containing components of the PIWI, PAZ, and DUF1785 domains, members of the AGO2/3/7 and AGO4/6/8/9 clades possess other consensus block sequences that are exclusive of members within these clades, suggesting unforeseen functional specialization revealed by their primary sequence. We also show that AGO proteins of animal and plant kingdoms share linear sequences of blocks that include motifs involved in posttranslational modifications such as those regulating AGO2 in humans and the PIWI protein AUBERGINE in Drosophila. Our results open possibilities for exploring new structural and functional aspects related to the evolution of AGO proteins within the plant kingdom, and their convergence with analogous proteins in mammals and invertebrates.

  16. Evolutive and regressive soil sequences for characterization of soils in laurel forest (Tenerife, Canary Islands)

    Directory of Open Access Journals (Sweden)

    José Asterio Guerra-García

    2014-03-01

    Soil degradation processes have been recognized as a global environmental problem in recent years. Various international forums and organizations have suggested that, in order to adequately establish methods to combat land degradation, it is necessary to evaluate this degradation locally and at a detailed scale. The evaluation of soil degradation of natural ecosystems at a detailed scale requires the definition of standards against which to compare this degradation. To define these standards and properly handle the processes that give rise to variations in soil quality and degradation, it is necessary to establish in some detail the pedogenic processes that have or have not taken place in a particular area and which lead to the formation of a mature soil. A mature soil should be considered as the standard in these situations and, therefore, as a non-degraded soil. This paper presents the possible evolutive and regressive sequences of soil, and provides some examples of using this methodology to evaluate soil degradation in the Monteverde of the island of Tenerife. It also presents some physical, chemical and mineralogical properties of climacic mature soils, degraded soils and low quality soils, and examines their similarities and differences in this bioclimatic environment and on different parent materials. Thus it is observed that the main processes of degradation in these areas are related to plant cover modifications that lead to decreasing protection of the soil surface, which results, in the long term, in the onset of degradation processes such as water erosion, biological degradation, loss of andic properties, compaction, surface sealing and crusting, loss of water retention capacity, illuviation, etc. Climacic soils that can be found in areas of steep lava flows are Leptosols, while gently sloping areas are Cambisols and Andosols. On pyroclastic materials there are vitric Andosols and andic Andosols according to

  17. Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes

    Directory of Open Access Journals (Sweden)

    Rebecca M. Davidson

    2011-11-01

    Transcriptome sequencing is a powerful method for studying global expression patterns in large, complex genomes. Evaluation of sequence-based expression profiles during reproductive development would provide functional annotation to genes underlying agronomic traits. We generated transcriptome profiles for 12 diverse maize (Zea mays L.) reproductive tissues representing male, female, developing seed, and leaf tissues using high throughput transcriptome sequencing. Overall, ∼80% of annotated genes were expressed. Comparative analysis between sequence and hybridization-based methods demonstrated the utility of ribonucleic acid sequencing (RNA-seq) for expression determination and differentiation of paralogous genes (∼85% of maize genes). Analysis of 4975 gene families across reproductive tissues revealed expression divergence is proportional to family size. In all pairwise comparisons between tissues, 7% (pre- vs. postemergence cobs) to 48% (pollen vs. ovule) of genes were differentially expressed. Genes with expression restricted to a single tissue within this study were identified with the highest numbers observed in leaves, endosperm, and pollen. Coexpression network analysis identified 17 gene modules with complex and shared expression patterns containing many previously described maize genes. The data and analyses in this study provide valuable tools through improved gene annotation, gene family characterization, and a core set of candidate genes to further characterize maize reproductive development and improve grain yield potential.

  18. Untangling Heteroplasmy, Structure, and Evolution of an Atypical Mitochondrial Genome by PacBio Sequencing.

    Science.gov (United States)

    Peccoud, Jean; Chebbi, Mohamed Amine; Cormier, Alexandre; Moumen, Bouziane; Gilbert, Clément; Marcadé, Isabelle; Chandler, Christopher; Cordaux, Richard

    2017-09-01

    The highly compact mitochondrial (mt) genome of terrestrial isopods (Oniscidae) presents two unusual features. First, several loci can individually encode two tRNAs, thanks to single nucleotide polymorphisms at anticodon sites. Within-individual variation (heteroplasmy) at these loci is thought to have been maintained for millions of years because individuals that do not carry all tRNA genes die, resulting in strong balancing selection. Second, the oniscid mtDNA genome comes in two conformations: a ∼14 kb linear monomer and a ∼28 kb circular dimer comprising two monomer units fused in palindrome. We hypothesized that heteroplasmy actually results from two genome units of the same dimeric molecule carrying different tRNA genes at mirrored loci. This hypothesis, however, contradicts the earlier proposition that dimeric molecules result from the replication of linear monomers, a process that should yield totally identical genome units within a dimer. To solve this contradiction, we used the SMRT (PacBio) technology to sequence mirrored tRNA loci in single dimeric molecules. We show that dimers do present different tRNA genes at mirrored loci; thus covalent linkage, rather than balancing selection, maintains vital variation at anticodons. We also leveraged unique features of the SMRT technology to detect linear monomers closed by hairpins and carrying noncomplementary bases at anticodons. These molecules contain the necessary information to encode two tRNAs at the same locus, and suggest new mechanisms of transition between linear and circular mtDNA. Overall, our analyses clarify the evolution of an atypical mt genome where dimerization counterintuitively enabled further mtDNA compaction. Copyright © 2017 by the Genetics Society of America.

  19. Structural and Sequence Similarities of Hydra Xeroderma Pigmentosum A Protein to Human Homolog Suggest Early Evolution and Conservation

    Directory of Open Access Journals (Sweden)

    Apurva Barve

    2013-01-01

    Xeroderma pigmentosum group A (XPA) is a protein that binds to damaged DNA, verifies presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER) pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time presence of XPA gene in hydra. Putative protein sequence of hydra XPA contains nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins like regions for binding to excision repair cross-complementing protein-1 (ERCC1) and replication protein A 70 kDa subunit (RPA70) proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla.

  20. Structural and sequence similarities of hydra xeroderma pigmentosum A protein to human homolog suggest early evolution and conservation.

    Science.gov (United States)

    Barve, Apurva; Ghaskadbi, Saroj; Ghaskadbi, Surendra

    2013-01-01

    Xeroderma pigmentosum group A (XPA) is a protein that binds to damaged DNA, verifies presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER) pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time presence of XPA gene in hydra. Putative protein sequence of hydra XPA contains nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins like regions for binding to excision repair cross-complementing protein-1 (ERCC1) and replication protein A 70 kDa subunit (RPA70) proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla.

  1. Research on the Compression Algorithm of the Infrared Thermal Image Sequence Based on Differential Evolution and Double Exponential Decay Model

    Directory of Open Access Journals (Sweden)

    Jin-Yu Zhang

    2014-01-01

    This paper proposes a new thermal wave image sequence compression algorithm that combines a double exponential decay fitting model with a differential evolution algorithm. The fitting compression results and precision of the proposed method were benchmarked against those of traditional methods by experiment; the study also investigated fitting compression performance under long time series and an improved model, and validated the algorithm by practical thermal image sequence compression and reconstruction. The results show that the proposed algorithm is a fast and highly precise infrared image data processing method.

  2. Sequence analysis of cereal sucrose synthase genes and isolation ...

    African Journals Online (AJOL)

    2007-10-18

    Oct 18, 2007 ... Full Length Research Paper. Sequence analysis of cereal sucrose synthase genes and isolation of sorghum sucrose synthase gene fragment. T. Sivasudha1* and P. A. Kumar2. 1Department of Environmental Biotechnology, Bharathidasan University, Tiruchy-620 024, India. 2NRC on Plant Biotechnology, ...

  3. Molecular cloning, sequence analysis and structure prediction of the ...

    African Journals Online (AJOL)

    Molecular cloning, sequence analysis and structure prediction of the related to b 0,+ amino acid transporter (rBAT) in Cyprinus carpio L. ... The amplified product was 2370 bp, including a 42 bp 5'-untranslated region, a 288 bp 3'-untranslated region, and a 2040 bp open reading frame (ORF), which encoded 679 amino acids ...

  4. BIOLOG - a DNA sequence analysis system in PROLOG.

    Science.gov (United States)

    Lyall, A; Hammond, P; Brough, D; Glover, D

    1984-01-11

    BIOLOG contains facilities for the analysis of nucleic acid sequences. These facilities are available through queries and commands of the underlying implementation language PROLOG. Familiarity with PROLOG is gained by using the built-in BIOLOG functions. This experience should enable the user to extend the current system and define new facilities.

  5. Inter simple sequence repeat analysis of genetic diversity of five ...

    African Journals Online (AJOL)

    This paper studied the genetic diversity of five cultivated pepper species using inter simple sequence repeat (ISSR) analysis. The amplicons of 13 out of 15 designed primers were stable polymorphic and therefore were used as genetic biomarkers. 135 total clear bands were obtained, of which 102 were polymorphic bands ...

  6. sequence stratigraphy and structural analysis of the emi field ...

    African Journals Online (AJOL)

    Timothy Ademakinwa

    Sequence stratigraphy and structural analysis of the Emi Field, offshore depobelt, eastern Niger Delta Basin, Nigeria. Oresajo, B. S. (1*), Adekeye, A. O. (2) and Haruna, K. A. (2). (1) Dept. of Geology, Federal University Birnin Kebbi, Birnin Kebbi. (2) Dept. of Geology, University of Ilorin, Ilorin.

  7. Culture-independent analysis of liver abscess using nanopore sequencing.

    Science.gov (United States)

    Gong, Liang; Huang, Yao-Ting; Wong, Chee-Hong; Chao, Wen-Cheng; Wu, Zong-Yen; Wei, Chia-Lin; Liu, Po-Yu

    2018-01-01

    The identification of microbial species has depended predominantly upon culture-based techniques. However, the difficulty with which types of organisms are cultured implies that the grown species may be overrepresented by both cultivation and plate counts. In recent years, culture-independent analysis using high-throughput sequencing has been advocated for use as a point-of-care diagnostic tool. Although it offers a rapid and unbiased survey to characterize the pathogens in clinical specimens, its accuracy is reduced by the high level of contamination of human DNA. In this paper, we propose using a culture-independent analysis for a Klebsiella pneumoniae clinical strain within a liver abscess using nanopore sequencing. Owing to the highly-contaminated cell population within a liver abscess, we managed to reduce the confounding effects of human DNA through the use of DNase and differential centrifugation. Genomic DNA was sequenced through the use of Nanopore MinION sequencer and analyzed using a suite of bioinformatics approaches. K. pneumoniae was successfully identified along with antibiotic-resistant genes. Our results indicate that, by integrating real-time nanopore sequencing and bioinformatics software, real-time pathogen identification in a liver abscess can be achieved.

  8. Improved Algorithm for Analysis of DNA Sequences Using Multiresolution Transformation

    Directory of Open Access Journals (Sweden)

    T. M. Inbamalar

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solution with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP) representation. Then discrete wavelet transformation is taken. Absolute value of the energy is found followed by proper threshold. The test is conducted using the data bases available in the National Centre for Biotechnology Information (NCBI) site. The comparative analysis is done and it ensures the efficiency of the proposed system.

  9. Improved algorithm for analysis of DNA sequences using multiresolution transformation.

    Science.gov (United States)

    Inbamalar, T M; Sivakumar, R

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solution with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP) representation. Then discrete wavelet transformation is taken. Absolute value of the energy is found followed by proper threshold. The test is conducted using the data bases available in the National Centre for Biotechnology Information (NCBI) site. The comparative analysis is done and it ensures the efficiency of the proposed system.
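
    The pipeline described above (EIIP mapping, discrete wavelet transform, energy, threshold) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the wavelet, so a single-level Haar transform is assumed here; the function names and the threshold value are hypothetical; the EIIP values are the ones commonly quoted for the four nucleotides.

```python
# Commonly quoted EIIP (electron-ion interaction potential) values per nucleotide.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def to_eiip(seq):
    """Map a DNA string (A, C, G, T only) to its numeric EIIP sequence."""
    return [EIIP[base] for base in seq.upper()]


def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform (assumed wavelet).

    Returns (approximation, detail) coefficient lists.
    """
    if len(signal) % 2:
        # Pad odd-length input by repeating the last sample.
        signal = list(signal) + [signal[-1]]
    scale = 2 ** 0.5
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def flag_high_energy(coeffs, threshold):
    """Mark coefficients whose energy |c|^2 reaches a hypothetical threshold."""
    return [abs(c) ** 2 >= threshold for c in coeffs]


if __name__ == "__main__":
    numeric = to_eiip("ATGGCGTACGTTAGC")
    approx, detail = haar_dwt(numeric)
    # The threshold below is an arbitrary illustrative value, not a tuned one.
    print(flag_high_energy(detail, threshold=1e-3))
```

    In practice the threshold and the wavelet family would have to be chosen and validated against annotated sequences; the sketch only shows how the stages connect.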

  10. Interactions of chromatin context, binding site sequence content, and sequence evolution in stress-induced p53 occupancy and transactivation.

    Directory of Open Access Journals (Sweden)

    Dan Su

    2015-01-01

    Cellular stresses activate the tumor suppressor p53 protein leading to selective binding to DNA response elements (REs) and gene transactivation from a large pool of potential p53 REs (p53REs). To elucidate how p53RE sequences and local chromatin context interact to affect p53 binding and gene transactivation, we mapped genome-wide binding localizations of p53 and H3K4me3 in untreated and doxorubicin (DXR)-treated human lymphoblastoid cells. We examined the relationships among p53 occupancy, gene expression, H3K4me3, chromatin accessibility (DNase 1 hypersensitivity, DHS), ENCODE chromatin states, p53RE sequence, and evolutionary conservation. We observed that the inducible expression of p53-regulated genes was associated with the steady-state chromatin status of the cell. Most highly inducible p53-regulated genes were suppressed at baseline and marked by repressive histone modifications or displayed CTCF binding. Comparison of p53RE sequences residing in different chromatin contexts demonstrated that weaker p53REs resided in open promoters, while stronger p53REs were located within enhancers and repressed chromatin. p53 occupancy was strongly correlated with similarity of the target DNA sequences to the p53RE consensus, but surprisingly, inversely correlated with pre-existing nucleosome accessibility (DHS) and evolutionary conservation at the p53RE. Occupancy by p53 of REs that overlapped transposable element (TE) repeats was significantly higher (p<10^-7) and correlated with stronger p53RE sequences (p<10^-110) relative to nonTE-associated p53REs, particularly for MLT1H, LTR10B, and Mer61 TEs. However, binding at these elements was generally not associated with transactivation of adjacent genes. Occupied p53REs located in L2-like TEs were unique in displaying highly negative PhyloP scores (predicted fast-evolving) and being associated with altered H3K4me3 and DHS levels. These results underscore the systematic interaction between chromatin status and p53

  11. Complete genome sequence analysis of chicken astrovirus isolate from India.

    Science.gov (United States)

    Patel, Amrutlal K; Pandit, Ramesh J; Thakkar, Jalpa R; Hinsu, Ankit T; Pandey, Vinod C; Pal, Joy K; Prajapati, Kantilal S; Jakhesara, Subhash J; Joshi, Chaitanya G

    2017-03-01

    Chicken astroviruses have been known to cause severe disease in chickens leading to increased mortality and "white chicks" condition. Here we aim to characterize the causative agent of visceral gout suspected for astrovirus infection in broiler breeder chickens. Total RNA isolated from allantoic fluid of SPF embryo passaged with infected chicken sample was sequenced by whole genome shotgun sequencing using ion-torrent PGM platform. The sequence was analysed for the presence of coding and non-coding features, its similarity with reported isolates and epitope analysis of capsid structural protein. The consensus length of 7513 bp genome sequence of Indian isolate of chicken astrovirus was obtained after assembly of 14,121 high quality reads. The genome was comprised of 13 bp 5'-UTR, three open reading frames (ORFs) including ORF1a encoding serine protease, ORF1b encoding RNA dependent RNA polymerase (RdRp) and ORF2 encoding capsid protein, and 298 bp of 3'-UTR which harboured two corona virus stem loop II like "s2m" motifs and a poly A stretch of 19 nucleotides. The genetic analysis of CAstV/INDIA/ANAND/2016 suggested highest sequence similarity of 86.94% with the chicken astrovirus isolate CAstV/GA2011 followed by 84.76% with CAstV/4175 and 74.48% with CAstV/Poland/G059/2014 isolates. The capsid structural protein of CAstV/INDIA/ANAND/2016 showed 84.67% similarity with chicken astrovirus isolate CAstV/GA2011, 81.06% with CAstV/4175 and 41.18% with CAstV/Poland/G059/2014 isolates. However, the capsid protein sequence showed high degree of sequence identity at nucleotide level (98.64-99.32%) and at amino acids level (97.74-98.69%) with reported sequences of Indian isolates suggesting their common origin and limited sequence divergence. The epitope analysis by SVMTriP identified two unique epitopes in our isolate, seven shared epitopes among Indian isolates and two shared epitopes among all isolates except Poland isolate which carried all distinct epitopes.

  12. Community evolution mining and analysis in social network

    Science.gov (United States)

    Liu, Hongtao; Tian, Yuan; Liu, Xueyan; Jian, Jie

    2017-03-01

    With the development of digital and network technology, various social platforms emerge. These social platforms have greatly facilitated access to information, attracting more and more users. They use these social platforms every day to work, study and communicate, so every moment social platforms are generating massive amounts of data. These data can often be modeled as complex networks, making large-scale social network analysis possible. In this paper, the existing evolution classification model of community has been improved based on community evolution relationship over time in dynamic social network, and the Evolution-Tree structure is proposed which can show the whole life cycle of the community more clearly. The comparative test result shows that the improved model can excavate the evolution relationship of the community well.

  13. Comparative Genomic Sequence Analysis of the Human Chromosome 21 Down Syndrome Critical Region

    Science.gov (United States)

    Toyoda, Atsushi; Noguchi, Hideki; Taylor, Todd D.; Ito, Takehiko; Pletcher, Mathew T.; Sakaki, Yoshiyuki; Reeves, Roger H.; Hattori, Masahira

    2002-01-01

    Comprehensive knowledge of the gene content of human chromosome 21 (HSA21) is essential for understanding the etiology of Down syndrome (DS). Here we report the largest comparison of finished mouse and human sequence to date for a 1.35-Mb region of mouse chromosome 16 (MMU16) that corresponds to human chromosome 21q22.2. This includes a portion of the commonly described “DS critical region,” thought to contain a gene or genes whose dosage imbalance contributes to a number of phenotypes associated with DS. We used comparative sequence analysis to construct a DNA feature map of this region that includes all known genes, plus 144 conserved sequences ≥100 bp long that show ≥80% identity between mouse and human but do not match known exons. Twenty of these have matches to expressed sequence tag and cDNA databases, indicating that they may be transcribed sequences from chromosome 21. Eight putative CpG islands are found at conserved positions. Models for two human genes, DSCR4 and DSCR8, are not supported by conserved sequence, and close examination indicates that low-level transcripts from these loci are unlikely to encode proteins. Gene prediction programs give different results when used to analyze the well-conserved regions between mouse and human sequences. Our findings have implications for evolution and for modeling the genetic basis of DS in mice. [Sequence data described in this paper have been submitted to the DDBJ/GenBank under accession nos. AP003148 through AP003158, and AB066227. Supplemental material is available at http://www.genome.org.] PMID:12213769
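
    The conservation criterion quoted above (sequences at least 100 bp long with at least 80% identity between mouse and human) can be illustrated with a simple sliding-window scan over a pairwise alignment. This is only a sketch under stated assumptions: the input is taken to be two already-aligned, equal-length strings with "-" for gaps, the function names are hypothetical, and the abstract does not say which alignment or scanning procedure the authors actually used.

```python
def window_identity(aln_a, aln_b, start, width):
    """Fraction of columns in [start, start + width) where both rows carry
    the same non-gap character."""
    columns = zip(aln_a[start:start + width], aln_b[start:start + width])
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return matches / width


def conserved_windows(aln_a, aln_b, width=100, min_identity=0.80):
    """Return start positions of all windows meeting the identity cutoff.

    Defaults mirror the >=100 bp and >=80% figures quoted in the abstract.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    last_start = len(aln_a) - width
    return [
        start
        for start in range(last_start + 1)
        if window_identity(aln_a, aln_b, start, width) >= min_identity
    ]


if __name__ == "__main__":
    human = "ACGT" * 30  # toy 120-column alignment row
    mouse = "ACGT" * 30
    print(len(conserved_windows(human, mouse)))
```

    Overlapping windows that pass the cutoff would normally be merged into a single conserved element and then checked against known exons; that step is left out here.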

  14. Phylogeny and classification of Dickeya based on multilocus sequence analysis.

    Science.gov (United States)

    Marrero, Glorimar; Schneider, Kevin L; Jenkins, Daniel M; Alvarez, Anne M

    2013-09-01

    Bacterial heart rot of pineapple reported in Hawaii in 2003 and reoccurring in 2006 was caused by an undetermined species of Dickeya. Classification of the bacterial strains isolated from infected pineapple to one of the recognized Dickeya species and their phylogenetic relationships with Dickeya were determined by a multilocus sequence analysis (MLSA), based on the partial gene sequences of dnaA, dnaJ, dnaX, gyrB and recN. Individual and concatenated gene phylogenies revealed that the strains form a clade with reference Dickeya sp. isolated from pineapple in Malaysia and are closely related to D. zeae; however, previous DNA-DNA reassociation values suggest that these strains do not meet the genomic threshold for consideration in D. zeae, and require further taxonomic analysis. An analysis of the markers used in this MLSA determined that recN was the best overall marker for resolution of species within Dickeya. Differential intraspecies resolution was observed with the other markers, suggesting that marker selection is important for defining relationships within a clade. Phylogenies produced with gene sequences from the sequenced genomes of strains D. dadantii Ech586, D. dadantii Ech703 and D. zeae Ech1591 did not place the sequenced strains with members of other well-characterized members of their respective species. The average nucleotide identity (ANI) and tetranucleotide frequencies determined for the sequenced strains corroborated the results of the MLSA that D. dadantii Ech586 and D. dadantii Ech703 should be reclassified as Dickeya zeae Ech586 and Dickeya paradisiaca Ech703, respectively, whereas D. zeae Ech1591 should be reclassified as Dickeya chrysanthemi Ech1591.
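
    Tetranucleotide frequencies, mentioned above as one of the genome-level signatures, amount to counting overlapping 4-mers and comparing the resulting profiles between genomes. The sketch below assumes plain relative frequencies on one strand and a Pearson correlation as the comparison; the function names are hypothetical, and published tools typically add normalisation steps that are omitted here.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freqs(seq):
    """Relative frequency of each overlapping 4-mer (A/C/G/T only) in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)  # ignores 4-mers with other symbols
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if undefined)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = sqrt(var_x * var_y)
    return cov / denom if denom else 0.0


if __name__ == "__main__":
    profile_a = tetra_freqs("ACGTACGTACGTTTGACA")
    profile_b = tetra_freqs("ACGTACGAACGTTTGACA")
    print(round(pearson(profile_a, profile_b), 3))
```

    A correlation close to 1 between two genomes' profiles would be read as similar oligonucleotide usage; any cutoff for species assignment is outside the scope of this sketch.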

  15. Origin and evolution of the transcribed repeated sequences of the Y chromosome lampbrush loops of Drosophila hydei

    OpenAIRE

    Hareven, Dana; Zuckerman, Mathi; Lifschytz, Eliezer

    1986-01-01

    The molecular evolution and patterns of conservation of clones from four Y chromosome lampbrush loops of Drosophila hydei were investigated. Each loop contains a discrete family of transcribed repeats that are only slightly conserved even in the hydei subgroup species. Sequencing of clones from the four D. hydei loops indicates that all transcribed repeats evolved from A+T-rich elements of the genome. Evidence is presented that suggests a Y-specific family evolved as a result of the transposi...

  16. Analysis of Genotyping-by-Sequencing (GBS) Data.

    Science.gov (United States)

    Kagale, Sateesh; Koh, Chushin; Clarke, Wayne E; Bollina, Venkatesh; Parkin, Isobel A P; Sharpe, Andrew G

    2016-01-01

    The development of genotyping-by-sequencing (GBS) to rapidly detect nucleotide variation at the whole genome level, in many individuals simultaneously, has provided a transformative genetic profiling technique. GBS can be carried out in species with or without reference genome sequences and yields huge amounts of potentially informative data. One limitation with the approach is the paucity of tools to transform the raw data into a format that can be easily interrogated at the genetic level. In this chapter we describe bioinformatics tools developed to address this shortfall together with experimental design considerations to fully leverage the power of GBS for genetic analysis.

  17. Construction of an integrated database to support genomic sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  18. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences over all chromosomes of H. vulgare and the wild barley species H. bulbosum, H. marinum and H. murinum. Southern blot hybridization revealed different levels of polymorphism among barley species and the RFLP data were used to generate a phylogenetic tree for the genus Hordeum. Our data are in a good...

  19. Post-main Sequence Evolution of Icy Minor Planets: Implications for Water Retention and White Dwarf Pollution

    Science.gov (United States)

    Malamud, Uri; Perets, Hagai B.

    2016-12-01

    Most observations of polluted white dwarf atmospheres are consistent with accretion of water-depleted planetary material. Among tens of known cases, merely two involve accretion of objects that contain a considerable mass fraction of water. The purpose of this study is to investigate the relative scarcity of these detections. Based on a new and highly detailed model, we evaluate the retention of water inside icy minor planets during the high-luminosity stellar evolution that follows the main sequence. Our model fully considers the thermal, physical, and chemical evolution of icy bodies, following their internal differentiation as well as water depletion, from the moment of their birth and through all stellar evolution phases preceding the formation of the white dwarf. We also account for different initial compositions and formation times. Our results differ from previous studies, which have either underestimated or overestimated water retention. We show that water can survive in a variety of circumstances and in great quantities, and therefore other possibilities are discussed in order to explain the infrequency of water detection. We predict that the sequence of accretion is such that water accretes earlier, and more rapidly, than the rest of the silicate disk, considerably reducing the chance of its detection in H-dominated atmospheres. In He-dominated atmospheres, the scarcity of water detections could be observationally biased. It implies that the accreted material is typically intrinsically dry, which may be the result of the inside-out depopulation sequence of minor planets.

  20. Multilocus sequence analysis of phytopathogenic species of the genus Streptomyces.

    Science.gov (United States)

    Labeda, David P

    2011-10-01

    The identification and classification of species within the genus Streptomyces is difficult because there are presently 576 species with validly published names and this number increases every year. The value of multilocus sequence analysis applied to the systematics of Streptomyces species has been well demonstrated in several recently published papers. In this study the sequence fragments of four housekeeping genes, atpD, recA, rpoB and trpB, were determined for the type strains of 10 known phytopathogenic species of the genus Streptomyces, including Streptomyces scabiei, Streptomyces acidiscabies, Streptomyces europaeiscabiei, Streptomyces luridiscabiei, Streptomyces niveiscabiei, Streptomyces puniciscabiei, Streptomyces reticuliscabiei, Streptomyces stelliscabiei, Streptomyces turgidiscabies and Streptomyces ipomoeae, as well as six uncharacterized phytopathogenic Streptomyces isolates. The type strains of 52 other species, including 19 species observed to be phylogenetically closely related to these, based on 16S rRNA gene sequence analysis, were also included in the study. Phylogenetic analysis of single gene alignments and a concatenated four-gene alignment demonstrated that the phytopathogenic species are taxonomically distinct from each other in spite of high 16S rRNA gene sequence similarities and provided a tool for the identification of unknown putative phytopathogenic Streptomyces strains at the species level.

  1. Galaxy Workflows for Web-based Bioinformatics Analysis of Aptamer High-throughput Sequencing Data

    Directory of Open Access Journals (Sweden)

    William H Thiel

    2016-01-01

    Development of RNA and DNA aptamers for diagnostic and therapeutic applications is a rapidly growing field. Aptamers are identified through iterative rounds of selection in a process termed SELEX (Systematic Evolution of Ligands by EXponential enrichment). High-throughput sequencing (HTS) revolutionized the modern SELEX process by identifying millions of aptamer sequences across multiple rounds of aptamer selection. However, these vast aptamer HTS datasets necessitated bioinformatics techniques. Herein, we describe a semiautomated approach to analyze aptamer HTS datasets using the Galaxy Project, a web-based open source collection of bioinformatics tools that were originally developed to analyze genome, exome, and transcriptome HTS data. Using a series of Workflows created in the Galaxy webserver, we demonstrate efficient processing of aptamer HTS data and compilation of a database of unique aptamer sequences. Additional Workflows were created to characterize the abundance and persistence of aptamer sequences within a selection and to filter sequences based on these parameters. A key advantage of this approach is that the online nature of the Galaxy webserver and its graphical interface allow for the analysis of HTS data without the need to compile code or install multiple programs.

  2. Galaxy Workflows for Web-based Bioinformatics Analysis of Aptamer High-throughput Sequencing Data.

    Science.gov (United States)

    Thiel, William H

    2016-01-01

    Development of RNA and DNA aptamers for diagnostic and therapeutic applications is a rapidly growing field. Aptamers are identified through iterative rounds of selection in a process termed SELEX (Systematic Evolution of Ligands by EXponential enrichment). High-throughput sequencing (HTS) revolutionized the modern SELEX process by identifying millions of aptamer sequences across multiple rounds of aptamer selection. However, these vast aptamer HTS datasets necessitated bioinformatics techniques. Herein, we describe a semiautomated approach to analyze aptamer HTS datasets using the Galaxy Project, a web-based open source collection of bioinformatics tools that were originally developed to analyze genome, exome, and transcriptome HTS data. Using a series of Workflows created in the Galaxy webserver, we demonstrate efficient processing of aptamer HTS data and compilation of a database of unique aptamer sequences. Additional Workflows were created to characterize the abundance and persistence of aptamer sequences within a selection and to filter sequences based on these parameters. A key advantage of this approach is that the online nature of the Galaxy webserver and its graphical interface allow for the analysis of HTS data without the need to compile code or install multiple programs. Copyright © 2016 Official journal of the American Society of Gene & Cell Therapy. Published by Elsevier Inc. All rights reserved.
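
    The abundance and persistence filtering described above is performed in Galaxy Workflows, but the underlying bookkeeping can be sketched in a few lines. The definitions used here are assumptions made for illustration: abundance is taken as the total read count of a sequence across all selection rounds, and persistence as the number of rounds in which it appears at least once. The function names and cutoffs are hypothetical and do not correspond to any Galaxy tool.

```python
from collections import Counter


def summarize(rounds):
    """Map each unique sequence to (abundance, persistence).

    rounds: list of selection rounds, each a list of read sequences.
    abundance: total reads across rounds (assumed definition).
    persistence: number of rounds containing the sequence (assumed definition).
    """
    summary = {}
    for reads in rounds:
        for seq, count in Counter(reads).items():
            abundance, persistence = summary.get(seq, (0, 0))
            summary[seq] = (abundance + count, persistence + 1)
    return summary


def filter_sequences(summary, min_abundance=1, min_persistence=1):
    """Return the sorted sequences meeting both hypothetical cutoffs."""
    return sorted(
        seq
        for seq, (abundance, persistence) in summary.items()
        if abundance >= min_abundance and persistence >= min_persistence
    )


if __name__ == "__main__":
    rounds = [
        ["AAA", "AAA", "CCC"],  # round 1 reads (toy data)
        ["AAA", "GGG"],         # round 2
        ["AAA", "CCC"],         # round 3
    ]
    table = summarize(rounds)
    print(filter_sequences(table, min_abundance=2, min_persistence=2))
```

    Sequences that are both frequent and present in many rounds are the usual candidates for follow-up; the dictionary returned by summarize plays the role of the database of unique sequences mentioned in the abstract.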

  3. Importance of purine and pyrimidine content of local nucleotide sequences (six bases long) for evolution of the human immunodeficiency virus type 1.

    Science.gov (United States)

    Doi, H

    1991-10-15

    Human immunodeficiency virus type 1 evolves rapidly, and random base change is thought to act as a major factor in this evolution. However, segments of the viral genome differ in their variability: there is the highly variable env gene, particularly hypervariable regions located within env, and, in contrast, the conservative gag and pol genes. Computer analysis of the nucleotide sequences of human immunodeficiency virus type 1 isolates reveals that base substitution in this virus is nonrandom and affected by local nucleotide sequences. Certain local sequences 6 base pairs long are excessively frequent in the hypervariable regions. These sequences exhibit base-substitution hotspots at specific positions in their 6 bases. The hotspots tend to be nonsilent letters of codons in the hypervariable regions--thus leading to marked amino acid substitutions there. Conversely, in the conservative gag and pol genes the hotspots tend to be silent letters because of a difference in codon frame from the hypervariable regions. Furthermore, base substitutions in the local sequences that frequently appear in the conservative genes occurred at a low level, even within the variable env. Thus, despite the high variability of this virus, the conservative genes and their products could be conserved. These may be some of the strategies evolved in human immunodeficiency virus type 1 to allow for positive-selection pressures, such as the host immune system, and negative-selection pressures on the conservative gene products.
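
    The kind of local-sequence analysis described above can be illustrated by reducing a nucleotide sequence to purine (R) and pyrimidine (Y) classes and counting overlapping six-base patterns. This is a sketch only: the abstract does not describe the original program, the function names are hypothetical, and input is assumed to contain only A, C, G and T.

```python
from collections import Counter

# A and G are purines (R); C and T are pyrimidines (Y).
BASE_CLASS = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def to_ry(seq):
    """Translate a DNA string into its purine/pyrimidine (R/Y) pattern.

    Raises KeyError for symbols other than A, C, G, T.
    """
    return "".join(BASE_CLASS[base] for base in seq.upper())


def hexamer_counts(seq):
    """Count overlapping 6-base R/Y patterns along the sequence."""
    pattern = to_ry(seq)
    return Counter(pattern[i:i + 6] for i in range(len(pattern) - 5))


if __name__ == "__main__":
    counts = hexamer_counts("AGAGAGCTTCAGGA")
    for pattern, n in counts.most_common(3):
        print(pattern, n)
```

    Comparing such counts between a hypervariable region and a conserved gene would show which R/Y patterns are over-represented in each, which is the type of contrast the abstract reports.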

  4. Repetitive sequence analysis and karyotyping reveals centromere-associated DNA sequences in radish (Raphanus sativus L.).

    Science.gov (United States)

    He, Qunyan; Cai, Zexi; Hu, Tianhua; Liu, Huijun; Bao, Chonglai; Mao, Weihai; Jin, Weiwei

    2015-04-18

    Radish (Raphanus sativus L., 2n = 2x = 18) is a major root vegetable crop especially in eastern Asia. Radish root contains various nutrients which play an important role in strengthening immunity. Repetitive elements are primary components of the genomic sequence and the most important factors in genome size variations in higher eukaryotes. To date, studies about repetitive elements of radish are still limited. To better understand genome structure of radish, we undertook a study to evaluate the proportion of repetitive elements and their distribution in radish. We conducted genome-wide characterization of repetitive elements in radish with low coverage genome sequencing followed by similarity-based cluster analysis. Results showed that about 31% of the genome was composed of repetitive sequences. Satellite repeats were the most dominating elements of the genome. The distribution pattern of three satellite repeat sequences (CL1, CL25, and CL43) on radish chromosomes was characterized using fluorescence in situ hybridization (FISH). CL1 was predominantly located at the centromeric region of all chromosomes, CL25 located at the subtelomeric region, and CL43 was a telomeric satellite. FISH signals of two satellite repeats, CL1 and CL25, together with 5S rDNA and 45S rDNA, provide useful cytogenetic markers to identify each individual somatic metaphase chromosome. The centromere-specific histone H3 (CENH3) has been used as a marker to identify centromere DNA sequences. One putative CENH3 (RsCENH3) was characterized and cloned from radish. Its deduced amino acid sequence shares high similarities to those of the CENH3s in Brassica species. An antibody against B. rapa CENH3 specifically stained radish centromeres. Immunostaining and chromatin immunoprecipitation (ChIP) tests with anti-BrCENH3 antibody demonstrated that both the centromere-specific retrotransposon (CR-Radish) and satellite repeat (CL1) are directly associated with RsCENH3 in radish. Proportions

  5. The sequence and analysis of duplication rich human chromosome 16

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Joel; Han, Cliff; Gordon, Laurie A.; Terry, Astrid; Prabhakar, Shyam; She, Xinwei; Xie, Gary; Hellsten, Uffe; Man Chan, Yee; Altherr, Michael; Couronne, Olivier; Aerts, Andrea; Bajorek, Eva; Black, Stacey; Blumer, Heather; Branscomb, Elbert; Brown, Nancy C.; Bruno, William J.; Buckingham, Judith M.; Callen, David F.; Campbell, Connie S.; Campbell, Mary L.; Campbell, Evelyn W.; Caoile, Chenier; Challacombe, Jean F.; Chasteen, Leslie A.; Chertkov, Olga; Chi, Han C.; Christensen, Mari; Clark, Lynn M.; Cohn, Judith D.; Denys, Mirian; Detter, John C.; Dickson, Mark; Dimitrijevic-Bussod, Mira; Escobar, Julio; Fawcett, Joseph J.; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Goodwin, Lynne A.; Grady, Deborah L.; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Hildebrand, Carl E.; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Jewett, Phillip E.; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Krawczyk, Marie-Claude; Leyba, Tina; Longmire, Jonathan L.; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Ludeman, Thom; Mark, Graham A.; Mcmurray, Kimberly L.; Meincke, Linda J.; Morgan, Jenna; Moyzis, Robert K.; Mundt, Mark O.; Munk, A. Christine; Nandkeshwar, Richard D.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Parson-Quintana, Beverly; Ramirez, Lucia; Rash, Sam; Retterer, James; Ricke, Darryl O.; Robinson, Donna L.; Rodriguez, Alex; Salamov, Asaf; Saunders, Elizabeth H.; Scott, Duncan; Shough, Timothy; Stallings, Raymond L.; Stalvey, Malinda; Sutherland, Robert D.; Tapia, Roxanne; Tesmer, Judith G.; Thayer, Nina; Thompson, Linda S.; Tice, Hope; Torney, David C.; Tran-Gyamfi, Mary; Tsai, Ming; Ulanovsky, Levy E.; Ustaszewska, Anna; Vo, Nu; White, P. Scott; Williams, Albert L.; Wills, Patricia L.; Wu, Jung-Rung; Wu, Kevin; Yang, Joan; DeJong, Pieter; Bruce, David; Doggett, Norman; Deaven, Larry; Schmutz, Jeremy; Grimwood, Jane; Richardson, Paul; et al.

    2004-08-01

    We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobasepairs were identified and result in gene content differences across humans. One of the unique features of chromosome 16 is its high level of segmental duplication, ranked among the highest of the human autosomes. While the segmental duplications are enriched in the relatively gene poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events which are likely to have had an impact on the evolution of primates and human disease susceptibility.

  6. The Sequence and Analysis of Duplication Rich Human Chromosome 16

    Science.gov (United States)

    Martin, Joel; Han, Cliff; Gordon, Laurie A.; Terry, Astrid; Prabhakar, Shyam; She, Xinwei; Xie, Gary; Hellsten, Uffe; Man Chan, Yee; Altherr, Michael; Couronne, Olivier; Aerts, Andrea; Bajorek, Eva; Black, Stacey; Blumer, Heather; Branscomb, Elbert; Brown, Nancy C.; Bruno, William J.; Buckingham, Judith M.; Callen, David F.; Campbell, Connie S.; Campbell, Mary L.; Campbell, Evelyn W.; Caoile, Chenier; Challacombe, Jean F.; Chasteen, Leslie A.; Chertkov, Olga; Chi, Han C.; Christensen, Mari; Clark, Lynn M.; Cohn, Judith D.; Denys, Mirian; Detter, John C.; Dickson, Mark; Dimitrijevic-Bussod, Mira; Escobar, Julio; Fawcett, Joseph J.; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Goodwin, Lynne A.; Grady, Deborah L.; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Hildebrand, Carl E.; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Jewett, Phillip E.; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Krawczyk, Marie-Claude; Leyba, Tina; Longmire, Jonathan L.; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Ludeman, Thom; Mark, Graham A.; Mcmurray, Kimberly L.; Meincke, Linda J.; Morgan, Jenna; Moyzis, Robert K.; Mundt, Mark O.; Munk, A. Christine; Nandkeshwar, Richard D.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Parson-Quintana, Beverly; Ramirez, Lucia; Rash, Sam; Retterer, James; Ricke, Darryl O.; Robinson, Donna L.; Rodriguez, Alex; Salamov, Asaf; Saunders, Elizabeth H.; Scott, Duncan; Shough, Timothy; Stallings, Raymond L.; Stalvey, Malinda; Sutherland, Robert D.; Tapia, Roxanne; Tesmer, Judith G.; Thayer, Nina; Thompson, Linda S.; Tice, Hope; Torney, David C.; Tran-Gyamfi, Mary; Tsai, Ming; Ulanovsky, Levy E.; Ustaszewska, Anna; Vo, Nu; White, P. Scott; Williams, Albert L.; Wills, Patricia L.; Wu, Jung-Rung; Wu, Kevin; Yang, Joan; DeJong, Pieter; Bruce, David; Doggett, Norman; Deaven, Larry; Schmutz, Jeremy; Grimwood, Jane; Richardson, Paul; et al.

    2004-01-01

    We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobasepairs were identified and result in gene content differences across humans. One of the unique features of chromosome 16 is its high level of segmental duplication, ranked among the highest of the human autosomes. While the segmental duplications are enriched in the relatively gene poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events which are likely to have had an impact on the evolution of primates and human disease susceptibility.

  7. DNA sequence analysis using hierarchical ART-based classification networks

    Energy Technology Data Exchange (ETDEWEB)

    LeBlanc, C.; Hruska, S.I. [Florida State Univ., Tallahassee, FL (United States)]; Katholi, C.R.; Unnasch, T.R. [Univ. of Alabama, Birmingham, AL (United States)]

    1994-12-31

    Adaptive resonance theory (ART) describes a class of artificial neural network architectures that act as classification tools which self-organize, work in real-time, and require no retraining to classify novel sequences. We have adapted ART networks to provide support to scientists attempting to categorize tandem repeat DNA fragments from Onchocerca volvulus. In this approach, sequences of DNA fragments are presented to multiple ART-based networks which are linked together into two (or more) tiers; the first provides coarse sequence classification while the subsequent tiers refine the classifications as needed. The overall rating of the resulting classification of fragments is measured using statistical techniques based on those introduced to validate results from traditional phylogenetic analysis. Tests of the Hierarchical ART-based Classification Network, or HABclass network, indicate its value as a fast, easy-to-use classification tool which adapts to new data without retraining on previously classified data.

  8. Analysis of Sequence Diagram Layout in Advanced UML Modelling Tools

    Directory of Open Access Journals (Sweden)

    Ņikiforova Oksana

    2016-05-01

    System modelling using the Unified Modelling Language (UML) is a task that must be solved in software development. The more complex software becomes, the higher the requirements for representing the system to be developed, especially its dynamic aspect, which in UML is expressed by a sequence diagram. To solve this task, the main attention is devoted to the graphical presentation of the system, where diagram layout plays the central role in information perception. Because of its specific structure, the UML sequence diagram is selected for a deeper analysis of the layout of its elements. The authors' research examines the abilities of modern UML modelling tools to offer automatic layout of the UML sequence diagram and analyses them according to criteria required for diagram perception.

  9. Sequence analysis reveals mosaic genome of Aichi virus

    Directory of Open Access Journals (Sweden)

    Han Xiaohong

    2011-08-01

    Aichi virus is a positive-sense, single-stranded RNA virus that has been shown to be associated with diarrhea in children. In the present study, phylogenetic and recombination analysis based on the Aichi virus complete genomes available in GenBank reveals a mosaic genome sequence [GenBank: FJ890523], of which the nt 261-852 region (the nt position was based on the aligned sequence file) shows a close relationship with AB010145/Japan with 97.9% sequence identity, while the other genomic regions show a close relationship with AY747174/German with 90.1% sequence identity. Our results will provide valuable hints for future research on Aichi virus diversity. Aichi virus is a member of the Kobuvirus genus of the Picornaviridae family [1,2] and is a positive-sense, single-stranded RNA virus. Its presence in fecal specimens of children suffering from diarrhea has been demonstrated in several Asian countries [3-6], in Brazil and Germany [7], in France [8] and in Tunisia [9]. Some reports showed a high level of seroprevalence in adults [7,10], suggesting widespread exposure to Aichi virus during childhood. The genome of Aichi virus contains 8,280 nucleotides and a poly(A) tail. The single large open reading frame (nt 713-8014 according to the strain AB010145) encodes a polyprotein of 2,432 amino acids that is cleaved into the typical picornavirus structural proteins VP0, VP3, VP1, and nonstructural proteins 2A, 2B, 2C, 3A, 3B, 3C and 3D [2,11]. Based on the phylogenetic analysis of 519-bp sequences at the 3C-3D (3CD) junction, Aichi viruses can be divided into two genotypes, A and B, with approximately 90% sequence homology [12]. Although only six complete genomes of Aichi virus have been deposited in GenBank at present, mosaic genomes can be found in strains from different countries.

  10. Sequence analysis reveals mosaic genome of Aichi virus.

    Science.gov (United States)

    Han, Xiaohong; Zhang, Wen; Xue, Yanjun; Shao, Shihe

    2011-08-05

    Aichi virus is a positive-sense, single-stranded RNA virus that has been shown to be associated with diarrhea in children. In the present study, phylogenetic and recombination analysis based on the Aichi virus complete genomes available in GenBank reveals a mosaic genome sequence [GenBank: FJ890523], of which the nt 261-852 region (the nt position was based on the aligned sequence file) shows a close relationship with AB010145/Japan with 97.9% sequence identity, while the other genomic regions show a close relationship with AY747174/German with 90.1% sequence identity. Our results will provide valuable hints for future research on Aichi virus diversity. Aichi virus is a member of the Kobuvirus genus of the Picornaviridae family [1,2] and is a positive-sense, single-stranded RNA virus. Its presence in fecal specimens of children suffering from diarrhea has been demonstrated in several Asian countries [3-6], in Brazil and Germany [7], in France [8] and in Tunisia [9]. Some reports showed a high level of seroprevalence in adults [7,10], suggesting widespread exposure to Aichi virus during childhood. The genome of Aichi virus contains 8,280 nucleotides and a poly(A) tail. The single large open reading frame (nt 713-8014 according to the strain AB010145) encodes a polyprotein of 2,432 amino acids that is cleaved into the typical picornavirus structural proteins VP0, VP3, VP1, and nonstructural proteins 2A, 2B, 2C, 3A, 3B, 3C and 3D [2,11]. Based on the phylogenetic analysis of 519-bp sequences at the 3C-3D (3CD) junction, Aichi viruses can be divided into two genotypes, A and B, with approximately 90% sequence homology [12]. Although only six complete genomes of Aichi virus have been deposited in GenBank at present, mosaic genomes can be found in strains from different countries.
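
    The identity figures quoted in this record (97.9% and 90.1%) are percentages of matching positions between aligned sequences. A minimal sketch of that calculation follows, assuming identity is counted over alignment columns in which neither sequence has a gap; the function name, the gap convention and the toy sequences are illustrative assumptions and are not taken from the study.

```python
from typing import Optional


def percent_identity(a: str, b: str, start: int = 0, end: Optional[int] = None) -> float:
    """Percent identity of two aligned sequences over alignment columns [start, end).

    Assumption: columns with a gap ('-') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if end is None:
        end = len(a)
    matches = 0
    compared = 0
    for x, y in zip(a[start:end], b[start:end]):
        if x == "-" or y == "-":
            continue  # gap column: not counted
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the requested region")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences (hypothetical, not Aichi virus data).
    query = "ACGTACGT-ACG"
    ref = "ACGTTCGTAACG"
    print(percent_identity(query, ref))        # whole alignment
    print(percent_identity(query, ref, 0, 4))  # one region only
```

    Computing identity separately for different regions of the same alignment, as in the second call, is the kind of comparison that can reveal a mosaic pattern: one region matches one reference closely while the rest matches another.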

  11. Molecular cloning, sequence characteristics, and tissue expression analysis of ECE1 gene in Tibetan pig.

    Science.gov (United States)

    Wang, Yan-Dong; Zhang, Jian; Li, Chuan-Hao; Xu, Hai-Peng; Chen, Wei; Zeng, Yong-Qing; Wang, Hui

    2015-10-25

    Low air pressure and low oxygen partial pressure at high altitude seriously affect the survival and development of human beings and animals. ECE1 is a recently discovered gene that is involved in anti-hypoxia, but the full-length cDNA sequence has not been obtained. For a better understanding of the structure and function of the ECE1 gene and to study its effect in Tibetan pig, the cDNA of the ECE1 gene from the muscle of Tibetan pig was cloned, sequenced and characterized. The ECE1 full-length cDNA sequence consists of a 2262 bp coding sequence (CDS) that encodes 753 amino acids with a molecular mass of 85.449 kDa, a 2 bp 5'UTR and a 1507 bp 3'UTR. In addition, the phylogenetic tree analysis revealed that the Tibetan pig ECE1 is more closely related, in genetic relationship and evolutionary distance, to the ECE1 of land mammals. Furthermore, analysis by qPCR showed that the ECE1 transcript is constitutively expressed in the 10 tissues tested: the liver, subcutaneous fat, kidney, muscle, stomach, heart, brain, spleen, pancreas, and lung. These results serve as a foundation for further insight into the Tibetan pig ECE1 gene.
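
    The lengths reported in this record can be cross-checked with simple arithmetic: a 2262 bp CDS is 754 codons, and if the last codon is a stop codon (an assumption; the abstract does not say so explicitly) the protein has 753 residues, which matches the stated figure. The sketch below shows the check; the function names are hypothetical.

```python
def protein_length_from_cds(cds_bp: int, includes_stop_codon: bool = True) -> int:
    """Number of amino acids encoded by a CDS of the given length in base pairs."""
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_bp // 3
    # Assumption: the terminal stop codon is counted in the CDS length
    # but does not add a residue to the protein.
    return codons - 1 if includes_stop_codon else codons


def full_cdna_length(utr5_bp: int, cds_bp: int, utr3_bp: int) -> int:
    """Total cDNA length as 5'UTR + CDS + 3'UTR."""
    return utr5_bp + cds_bp + utr3_bp


if __name__ == "__main__":
    print(protein_length_from_cds(2262))      # amino acids encoded by the CDS
    print(full_cdna_length(2, 2262, 1507))    # total length in bp
```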

  12. The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    Directory of Open Access Journals (Sweden)

    Nicholas R Thomson

    2006-12-01

    The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype O:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common

  13. Distribution and evolution of repeated sequences in genomes of Triatominae (Hemiptera-Reduviidae) inferred from genomic in situ hybridization.

    Directory of Open Access Journals (Sweden)

    Sebastian Pita

    The subfamily Triatominae, vectors of Chagas disease, comprises 140 species characterized by a highly homogeneous chromosome number. We analyzed the chromosomal distribution and evolution of repeated sequences in Triatominae genomes by genomic in situ hybridization (GISH) using Triatoma delpontei and Triatoma infestans genomic DNAs as probes. Hybridizations were performed on their own chromosomes and on nine species included in six genera from the two main tribes: Triatomini and Rhodniini. Genomic probes clearly generate two different hybridization patterns, dispersed or accumulated in specific regions or chromosomes. The three probes used generate the same hybridization pattern in each species. However, these patterns are species-specific. In closely related species, the probes strongly hybridized in the autosomal heterochromatic regions, resembling C-banding and DAPI patterns. However, in more distant species these co-localizations are not observed. The heterochromatic Y chromosome is constituted by highly repeated sequences, which are conserved among 10 species of the Triatomini tribe, suggesting that this is an ancestral character for this group. However, the Y chromosome in the Rhodniini tribe is markedly different, supporting the early evolutionary dichotomy between both tribes. In some species, sex chromosomes and autosomes shared repeated sequences, suggesting meiotic chromatin exchanges among these heterologous chromosomes. Our GISH analyses enabled us to acquire not only reliable information about the distribution of autosomal repeated sequences but also an insight into sex chromosome evolution in Triatominae. Furthermore, the differentiation obtained by GISH might be a valuable marker to establish phylogenetic relationships and to test the controversial origin of the Triatominae subfamily.

  14. Reconstructing SALMFamide Neuropeptide Precursor Evolution in the Phylum Echinodermata: Ophiuroid and Crinoid Sequence Data Provide New Insights.

    Science.gov (United States)

    Elphick, Maurice R; Semmens, Dean C; Blowes, Liisa M; Levine, Judith; Lowe, Christopher J; Arnone, Maria I; Clark, Melody S

    2015-01-01

    The SALMFamides are a family of neuropeptides that act as muscle relaxants in echinoderms. Analysis of genome/transcriptome sequence data from the sea urchin Strongylocentrotus purpuratus (Echinoidea), the sea cucumber Apostichopus japonicus (Holothuroidea), and the starfish Patiria miniata (Asteroidea) reveals that in each species there are two types of SALMFamide precursor: an L-type precursor comprising peptides with a C-terminal LxFamide-type motif and an F-type precursor solely or largely comprising peptides with a C-terminal FxFamide-type motif. Here, we have identified transcripts encoding SALMFamide precursors in the brittle star Ophionotus victoriae (Ophiuroidea) and the feather star Antedon mediterranea (Crinoidea). We have also identified SALMFamide precursors in other species belonging to each of the five echinoderm classes. As in S. purpuratus, A. japonicus, and P. miniata, in O. victoriae there is one L-type precursor and one F-type precursor. However, in A. mediterranea only a single SALMFamide precursor was found, comprising two peptides with a LxFamide-type motif, one with a FxFamide-type motif, five with a FxLamide-type motif, and four with a LxLamide-type motif. As crinoids are basal to the Echinozoa (Holothuroidea + Echinoidea) and Asterozoa (Asteroidea + Ophiuroidea) in echinoderm phylogeny, one model of SALMFamide precursor evolution would be that ancestrally there was a single SALMFamide gene encoding a variety of SALMFamides (as in crinoids), which duplicated in a common ancestor of the Echinozoa and Asterozoa and then specialized to encode L-type SALMFamides or F-type SALMFamides. Alternatively, a second SALMFamide precursor may remain to be discovered or may have been lost in crinoids. Further insights will be obtained if SALMFamide receptors are identified, which would provide a molecular basis for experimental analysis of the functional significance of the "cocktails" of SALMFamides that exist in echinoderms.

  15. Reconstructing SALMFamide neuropeptide precursor evolution in the phylum Echinodermata: ophiuroid and crinoid sequence data provide new insights

    Directory of Open Access Journals (Sweden)

    Maurice R Elphick

    2015-02-01

    The SALMFamides are a family of neuropeptides that act as muscle relaxants in echinoderms. Analysis of genome/transcriptome sequence data from the sea urchin Strongylocentrotus purpuratus (Echinoidea), the sea cucumber Apostichopus japonicus (Holothuroidea) and the starfish Patiria miniata (Asteroidea) reveals that in each species there are two types of SALMFamide precursor: an L-type precursor comprising peptides with a C-terminal LxFamide-type motif and an F-type precursor solely or largely comprising peptides with a C-terminal FxFamide-type motif. Here we have identified transcripts encoding SALMFamide precursors in the brittle star Ophionotus victoriae (Ophiuroidea) and the feather star Antedon mediterranea (Crinoidea). We have also identified SALMFamide precursors in other species belonging to each of the five echinoderm classes. As in S. purpuratus, A. japonicus and P. miniata, in O. victoriae there is one L-type precursor and one F-type precursor. However, in A. mediterranea only a single SALMFamide precursor was found, comprising two peptides with a LxFamide-type motif, one with a FxFamide-type motif, five with a FxLamide-type motif and four with a LxLamide-type motif. As crinoids are basal to the Echinozoa (Holothuroidea + Echinoidea) and Asterozoa (Asteroidea + Ophiuroidea) in echinoderm phylogeny, one model of SALMFamide precursor evolution would be that ancestrally there was a single SALMFamide gene encoding a variety of SALMFamides (as in crinoids), which duplicated in a common ancestor of the Echinozoa and Asterozoa and then specialised to encode L-type SALMFamides or F-type SALMFamides. Alternatively, a second SALMFamide precursor may remain to be discovered or may have been lost in crinoids. Further insights will be obtained if SALMFamide receptors are identified, which would provide a molecular basis for experimental analysis of the functional significance of the cocktails of SALMFamides that exist in echinoderms.

  16. Multilocus sequence analysis of Treponema denticola strains of diverse origin

    Directory of Open Access Journals (Sweden)

    Mo Sisu

    2013-02-01

    Background: The oral spirochete bacterium Treponema denticola is associated with both the incidence and severity of periodontal disease. Although the biological or phenotypic properties of a significant number of T. denticola isolates have been reported in the literature, their genetic diversity or phylogeny has never been systematically investigated. Here, we describe a multilocus sequence analysis (MLSA) of 20 of the most highly studied reference strains and clinical isolates of T. denticola, which were originally isolated from subgingival plaque samples taken from subjects from China, Japan, the Netherlands, Canada and the USA. Results: The sequences of the 16S ribosomal RNA gene and 7 conserved protein-encoding genes (flaA, recA, pyrH, ppnK, dnaN, era and radC) were successfully determined for each strain. Sequence data were analyzed using a variety of bioinformatic and phylogenetic software tools. We found no evidence of positive selection or DNA recombination within the protein-encoding genes, where levels of intraspecific sequence polymorphism varied from 18.8% (flaA) to 8.9% (dnaN). Phylogenetic analysis of the concatenated protein-encoding gene sequence data (ca. 6,513 nucleotides for each strain) using Bayesian and maximum likelihood approaches indicated that the T. denticola strains were monophyletic and formed 6 well-defined clades. All analyzed T. denticola strains appeared to have a genetic origin distinct from that of ‘Treponema vincentii’ or Treponema pallidum. No specific geographical relationships could be established, but several strains isolated from different continents appear to be closely related at the genetic level. Conclusions: Our analyses indicate that previous biological and biophysical investigations have predominantly focused on a subset of T. denticola strains with a relatively narrow range of genetic diversity. Our methodology and results establish a genetic framework for the discrimination and phylogenetic

  17. An Imaging And Graphics Workstation For Image Sequence Analysis

    Science.gov (United States)

    Mostafavi, Hassan

    1990-01-01

    This paper describes an application-specific engineering workstation designed and developed to analyze imagery sequences from a variety of sources. The system combines the software and hardware environment of the modern graphic-oriented workstations with the digital image acquisition, processing and display techniques. The objective is to achieve automation and high throughput for many data reduction tasks involving metric studies of image sequences. The applications of such an automated data reduction tool include analysis of the trajectory and attitude of aircraft, missile, stores and other flying objects in various flight regimes including launch and separation as well as regular flight maneuvers. The workstation can also be used in an on-line or off-line mode to study three-dimensional motion of aircraft models in simulated flight conditions such as wind tunnels. The system's key features are: 1) Acquisition and storage of image sequences by digitizing real-time video or frames from a film strip; 2) computer-controlled movie loop playback, slow motion and freeze frame display combined with digital image sharpening, noise reduction, contrast enhancement and interactive image magnification; 3) multiple leading edge tracking in addition to object centroids at up to 60 fields per second from both live input video or a stored image sequence; 4) automatic and manual field-of-view and spatial calibration; 5) image sequence data base generation and management, including the measurement data products; 6) off-line analysis software for trajectory plotting and statistical analysis; 7) model-based estimation and tracking of object attitude angles; and 8) interface to a variety of video players and film transport sub-systems.

  18. Castor bean organelle genome sequencing and worldwide genetic diversity analysis.

    Directory of Open Access Journals (Sweden)

    Maximo Rivarola

    Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes by extracting selected reads from the available whole genome shotgun reads. With the chloroplast reference genome in hand, we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers, and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade, and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade.

  19. Infrared thermal facial image sequence registration analysis and verification

    Science.gov (United States)

    Chen, Chieh-Li; Jian, Bo-Lin

    2015-03-01

    To study the emotional responses of subjects to the International Affective Picture System (IAPS), an infrared thermal facial image sequence is preprocessed for registration before further analysis, so that the variance caused by minor and irregular subject movements is reduced. Without affecting subject comfort and while inducing minimal harm, this study proposes an infrared thermal facial image sequence registration process that reduces the deviations caused by the unconscious head shaking of the subjects. A fixed image for registration is produced through the localization of the centroid of the eye region as well as image translation and rotation processes. The thermal image sequence is then automatically registered using the proposed two-stage genetic algorithm. The deviation before and after image registration is demonstrated by image quality indices. The results show that the infrared thermal image sequence registration process proposed in this study is effective in localizing facial images accurately, which will be beneficial to the correlation analysis of psychological information related to the facial area.

  20. Congruence analysis of point clouds from unstable stereo image sequences

    Directory of Open Access Journals (Sweden)

    C. Jepping

    2014-06-01

    This paper deals with the correction of exterior orientation parameters of stereo image sequences over deformed free-form surfaces without control points. Such an imaging situation can occur, for example, during photogrammetric car crash test recordings where onboard high-speed stereo cameras are used to measure 3D surfaces. As a result of such measurements, 3D point clouds of deformed surfaces are generated for a complete stereo sequence. The first objective of this research focusses on the development and investigation of methods for the detection of corresponding spatial and temporal tie points within the stereo image sequences (by stereo image matching and 3D point tracking) that are robust enough for a reliable handling of occlusions and other disturbances that may occur. The second objective of this research is the analysis of object deformations in order to detect stable areas (congruence analysis). For this purpose a RANSAC-based method for congruence analysis has been developed. This process is based on the sequential transformation of randomly selected point groups from one epoch to another by using a 3D similarity transformation. The paper gives a detailed description of the congruence analysis. The approach has been tested successfully on synthetic and real image data.

  1. A stochastic model for EEG microstate sequence analysis.

    Science.gov (United States)

    Gärtner, Matthias; Brodbeck, Verena; Laufs, Helmut; Schneider, Gaby

    2015-01-01

    The analysis of spontaneous resting state neuronal activity is assumed to give insight into brain function. One noninvasive technique to study resting state activity is electroencephalography (EEG) with a subsequent microstate analysis. This technique reduces the recorded EEG signal to a sequence of prototypical topographical maps, which is hypothesized to capture important spatio-temporal properties of the signal. In a statistical EEG microstate analysis of healthy subjects in wakefulness and three stages of sleep, we observed a simple structure in the microstate transition matrix. It can be described with a first order Markov chain in which the transition probability from the current state (i.e., map) to a different map does not depend on the current map. The resulting transition matrix shows a high agreement with the observed transition matrix, requiring only about 2% of mass transport (1/2 L1-distance). In the second part, we introduce an extended framework in which the simple Markov chain is used to make inferences on a potential underlying time continuous process. This process cannot be directly observed and is therefore usually estimated from discrete sampling points of the EEG signal given by the local maxima of the global field power. Therefore, we propose a simple stochastic model called the sampled marked intervals (SMI) model that relates the observed sequence of microstates to an assumed underlying process of background intervals and thus complements approaches that focus on the analysis of observable microstate sequences.
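
    The simplified Markov chain described in this record can be illustrated with a small sketch. It estimates a first-order transition matrix from a label sequence, builds a simplified matrix in which the probability of moving to a different map j is a single value q[j] regardless of the current map, and measures disagreement as half the L1 distance. The averaging estimator for q, the row-wise comparison, the function names and the toy sequence are all assumptions made for illustration; the abstract does not specify how the authors fitted or compared the matrices.

```python
from collections import Counter


def transition_matrix(seq, states):
    """Empirical first-order transition probabilities P[i][j] from a label sequence."""
    counts = Counter(zip(seq, seq[1:]))
    matrix = {}
    for i in states:
        row_total = sum(counts[(i, j)] for j in states)
        matrix[i] = {
            j: (counts[(i, j)] / row_total if row_total else 0.0) for j in states
        }
    return matrix


def simple_model(matrix, states):
    """Simplified chain: P(i -> j) = q[j] for all i != j (assumed naive estimator).

    q[j] is taken as the mean of the observed P[i][j] over all i != j.
    """
    n_other = len(states) - 1
    q = {j: sum(matrix[i][j] for i in states if i != j) / n_other for j in states}
    model = {}
    for i in states:
        row = {j: q[j] for j in states if j != i}
        stay = 1.0 - sum(row.values())
        if stay < -1e-12:
            raise ValueError("estimated leave probabilities exceed 1 for state %r" % i)
        row[i] = max(stay, 0.0)
        model[i] = row
    return model


def half_l1(p, q):
    """Half the L1 distance between two discrete distributions over the same keys."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)


if __name__ == "__main__":
    states = "ABCD"                       # four hypothetical microstate maps
    seq = "AABACADBBCBDCCADCBDDABCD" * 5  # toy sequence, not EEG data
    observed = transition_matrix(seq, states)
    model = simple_model(observed, states)
    for i in states:
        print(i, round(half_l1(observed[i], model[i]), 3))
```

    A small half-L1 value for every row would indicate that the simplified chain reproduces the observed transitions well, which is the sense in which the abstract reports a discrepancy of about 2%.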

  2. Now and next-generation sequencing techniques: future of sequence analysis using cloud computing.

    Science.gov (United States)

    Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav

    2012-01-01

    Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed "cloud computing") has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows.

  3. Now And Next Generation Sequencing Techniques: Future of Sequence Analysis using Cloud Computing

    Directory of Open Access Journals (Sweden)

    Radhe Shyam Thakur

    2012-12-01

    Advances in sequencing techniques have led to huge volumes of sequence data being produced at an ever faster rate. Maintaining the resulting databases is becoming cumbersome for datacenters. Data mining and sequence analysis approaches need to analyze the databases several times to reach any useful conclusion. To cope with this overburden on computer resources and to reach efficient and effective conclusions quickly, the virtualization of resources and computation on a pay-as-you-go basis was introduced and termed cloud computing. The datacenter's hardware and software are collectively known as a cloud, which is termed a public cloud when publicly available. The datacenter's resources are provided in a virtual mode to clients via a service provider such as Amazon, Google or Joyent, which charges on a pay-as-you-go basis. The workload is shifted to the provider, which maintains it through the required hardware and software upgrades. The service provider manages this by upgrading the requirements in the virtual mode. In essence, a virtual environment is created according to the needs of the user by obtaining permission from the datacenter via the internet, the task is performed, and the environment is deleted once the task is over. In this discussion, we focus on the basics of cloud computing and on the prerequisites and overall working of clouds. Furthermore, the applications of cloud computing in biological systems, especially in comparative genomics, genome informatics and SNP detection, are briefly discussed with reference to traditional workflows.

  4. SEQUENCING AND SEQUENCE ANALYSIS OF MYOSTATIN GENE IN THE EXON 1 OF THE CAMEL (CAMELUS DROMEDARIUS)

    Directory of Open Access Journals (Sweden)

    M. G. SHAH, A. S. QURESHI, M. REISSMANN AND H. J. SCHWARTZ

    2006-10-01

    Myostatin, also called growth differentiation factor-8 (GDF-8), is a member of the mammalian transforming growth factor-beta (TGF-beta) superfamily and is expressed specifically in developing and adult skeletal muscle. The muscular hypertrophy (mh) allele in double-muscled breeds involves a mutation within the myostatin gene. Genomic DNA was isolated from camel hair using the NucleoSpin Tissue kit. Two animals from each of six breeds, namely Marecha, Dhatti, Larri, Kohi, Sakrai and Cambelpuri, were used for sequencing. For PCR amplification of the gene, a primer pair was designed from homologous regions of already published sequences of farm animals from GenBank. Results showed that camel myostatin possessed more than 90% homology with that of cattle, sheep and pig. Camel formed a separate cluster from the pig in spite of high homology (98%) and showed 94% homology with cattle and sheep, as reported in the literature. The sequence of the PCR-amplified part of exon 1 (256 bp) of camel myostatin was identical among the six camel breeds.

  5. Complete plastome sequences of Equisetum arvense and Isoetes flaccida: implications for phylogeny and plastid genome evolution of early land plant lineages

    Directory of Open Access Journals (Sweden)

    Mandoli Dina F

    2010-10-01

    Background: Despite considerable progress in our understanding of land plant phylogeny, several nodes in the green tree of life remain poorly resolved. Furthermore, the bulk of currently available data come from only a subset of major land plant clades. Here we examine early land plant evolution using complete plastome sequences including two previously unexamined and phylogenetically critical lineages. To better understand the evolution of land plants and their plastomes, we examined aligned nucleotide sequences, indels, gene and nucleotide composition, inversions, and gene order at the boundaries of the inverted repeats. Results: We present the plastome sequences of Equisetum arvense, a horsetail, and of Isoetes flaccida, a heterosporous lycophyte. Phylogenetic analysis of aligned nucleotides from 49 plastome genes from 43 taxa supported monophyly for the following clades: embryophytes (land plants), lycophytes, monilophytes (leptosporangiate ferns + Angiopteris evecta + Psilotum nudum + Equisetum arvense), and seed plants. Resolution among the four monilophyte lineages remained moderate, although nucleotide analyses suggested that P. nudum and E. arvense form a clade sister to A. evecta + leptosporangiate ferns. Results from phylogenetic analyses of nucleotides were consistent with the distribution of plastome gene rearrangements and with analysis of sequence gaps resulting from insertions and deletions (indels). We found one new indel and an inversion of a block of genes that unites the monilophytes. Conclusions: Monophyly of monilophytes has been disputed on the basis of morphological and fossil evidence. In the context of a broad sampling of land plant data we find several new pieces of evidence for monilophyte monophyly. Results from this study demonstrate resolution among the four monilophyte lineages, albeit with moderate support; we posit a clade consisting of Equisetaceae and Psilotaceae that is sister to the "true ferns

  6. CISAPS: Complex Informational Spectrum for the Analysis of Protein Sequences

    Directory of Open Access Journals (Sweden)

    Charalambos Chrysostomou

    2015-01-01

    Complex informational spectrum analysis for protein sequences (CISAPS) and its web-based server are developed and presented. Recent studies show that using only the absolute spectrum in informational spectrum analysis of protein sequences is insufficient. Therefore, CISAPS is developed to consider and provide results in three forms: the absolute, real, and imaginary spectrum. Biologically related features, as shown in the case study of influenza A subtypes presented here, can also appear individually in either the real or the imaginary spectrum. As the results show, protein classes can present similarities or differences according to the features extracted from the CISAPS web server. These associations are likely to be related to the protein feature that the specific amino acid index represents. In addition, various technical issues that may affect the analysis, such as zero-padding and windowing, are also addressed. CISAPS uses an expanded list of 611 unique amino acid indices, each representing a different property, to perform the analysis. This web-based server enables researchers with little knowledge of signal processing methods to apply complex informational spectrum analysis in their work.

  7. Comparative analysis of full genomic sequences among different genotypes of dengue virus type 3

    Directory of Open Access Journals (Sweden)

    Lin Ting-Hsiang

    2008-05-01

    Background: Although a previous study demonstrated that the envelope protein of dengue viruses is under purifying selection pressure, little is known about the genetic differences among full-length viral genomes of DENV-3. In our study, complete genomic sequences of DENV-3 strains collected from different geographical locations and isolation years were determined, and the sequence diversity as well as selection pressure sites in the DENV genome other than within the E gene were also analyzed. Results: Using maximum likelihood and Bayesian approaches, our phylogenetic analysis revealed that Taiwan's indigenous DENV-3 isolated from the 1994 and 1998 dengue/DHF epidemics and one 1999 sporadic case belonged to three different genotypes (I, II, and III), associated with DENV-3 circulating in Indonesia, Thailand and Sri Lanka, respectively. Sequence diversity and selection pressure of different genomic regions among the DENV-3 genotypes were further examined to understand global DENV-3 evolution. The highest nucleotide sequence diversity among the fully sequenced DENV-3 strains was found in the nonstructural protein 2A (mean ± SD: 5.84 ± 0.54) and envelope protein gene regions (mean ± SD: 5.04 ± 0.32). Further analysis found that positive selection pressure on DENV-3 may occur in the non-structural protein 1 gene region, and a positive selection site was detected at position 178 of the NS1 gene. Conclusion: Our study confirmed that the envelope protein is under purifying selection pressure although it presented higher sequence diversity. The detection of positive selection pressure in the non-structural protein along genotype II indicates that DENV-3 originating from Southeast Asia should be monitored for the emergence of strains with epidemic potential, for better epidemic prevention and vaccine development.

  8. Analysis of the temporal evolution of total column nitrogen dioxide ...

    African Journals Online (AJOL)

    Concurrent measurement and analysis of nitrogen dioxide (NO2) and ozone (O3) are essential for improved understanding of ozone distribution. This study sought to analyse the temporal evolution of total column NO2 and O3 over Nairobi using satellite-derived daily data between 2009 and 2013. Seasonality is observed ...

  9. Evolution and expression analysis of the soybean glutamate ...

    Indian Academy of Sciences (India)

    Evolution and expression analysis of the soybean glutamate decarboxylase gene family. TAE KYUNG HYUN, SEUNG HEE EOM, XIAO HAN and JU-SUNG KIM http://www.ias.ac.in/jbiosci. J. Biosci. 39(5), December 2014, 899–907, © Indian Academy of Sciences. Supplementary material. Supplementary figure 1.

  10. Molecular characterization of Giardia psittaci by multilocus sequence analysis.

    Science.gov (United States)

    Abe, Niichiro; Makino, Ikuko; Kojima, Atsushi

    2012-12-01

    Multilocus sequence analyses targeting small subunit ribosomal DNA (SSU rDNA), elongation factor 1 alpha (ef1α), glutamate dehydrogenase (gdh), and beta giardin (β-giardin) were performed on Giardia psittaci isolates from three Budgerigars (Melopsittacus undulatus) and four Barred parakeets (Bolborhynchus lineola) kept in individual households or imported from overseas. Nucleotide differences and phylogenetic analyses at four loci indicate the distinction of G. psittaci from the other known Giardia species: Giardia muris, Giardia microti, Giardia ardeae, and Giardia duodenalis assemblages. Furthermore, G. psittaci was related more closely to G. duodenalis than to the other known Giardia species, except for G. microti. Conflicting signals regarded as "double peaks" were found at the same nucleotide positions of the ef1α in all isolates. However, the sequences of the other three loci, including gdh and β-giardin, which are known to be highly variable, from all isolates were also mutually identical at every locus. They showed no double peaks. These results suggest that double peaks found in the ef1α sequences are caused not by mixed infection with genetically different G. psittaci isolates but by allelic sequence heterogeneity (ASH), which is observed in diplomonad lineages including G. duodenalis. No sequence difference was found in any G. psittaci isolates at the gdh and β-giardin, suggesting that G. psittaci is indeed not more diverse genetically than other Giardia species. This report is the first to provide evidence related to the genetic characteristics of G. psittaci obtained using multilocus sequence analysis. Copyright © 2012 Elsevier B.V. All rights reserved.

  11. A 5.8S nuclear ribosomal RNA gene sequence database: applications to ecology and evolution

    Science.gov (United States)

    Cullings, K. W.; Vogler, D. R.

    1998-01-01

    We compiled a 5.8S nuclear ribosomal gene sequence database for animals, plants, and fungi using both newly generated and GenBank sequences. We demonstrate the utility of this database as an internal check to determine whether the target organism and not a contaminant has been sequenced, as a diagnostic tool for ecologists and evolutionary biologists to determine the placement of asexual fungi within larger taxonomic groups, and as a tool to help identify fungi that form ectomycorrhizae.

  12. The DNA sequence and analysis of human chromosome 13

    OpenAIRE

    Dunham, A.; Matthews, L. H.; Burton, J.; Ashurst, J. L.; Howe, K. L.; Ashcroft, K. J.; Beare, D. M.; Burford, D. C.; Hunt, S. E.; Griffiths-Jones, S.; Jones, M. C.; Keenan, S. J.; Oliver, K.; Scott, C. E.; Ainscough, R.

    2004-01-01

    Chromosome 13 is the largest acrocentric human chromosome. It carries genes involved in cancer including the breast cancer type 2 (BRCA2) and retinoblastoma (RB1) genes, is frequently rearranged in B-cell chronic lymphocytic leukaemia, and contains the DAOA locus associated with bipolar disorder and schizophrenia. We describe completion and analysis of 95.5 megabases (Mb) of sequence from chromosome 13, which contains 633 genes and 296 pseudogenes. We estimate that more than 95.4% of the prot...

  13. Frame to Frame Diffeomorphic Motion Analysis from Echocardiographic Sequences

    OpenAIRE

    Zhang, Zhijun; Sahn, David; Song, Xubo

    2011-01-01

    Quantitative motion analysis from echocardiography is an important yet challenging problem. We develop a motion estimation algorithm for echocardiographic image sequences based on diffeomorphic image registration in which the velocity field is spatiotemporally smooth. The novelty of this work is that instead of optimizing a functional of velocity field which consists of similarity metrics between a reference image and each of the following images (first-to-follow...

  14. CAFE: aCcelerated Alignment-FrEe sequence analysis.

    Science.gov (United States)

    Lu, Yang Young; Tang, Kujin; Ren, Jie; Fuhrman, Jed A; Waterman, Michael S; Sun, Fengzhu

    2017-07-03

    Alignment-free genome and metagenome comparisons are increasingly important with the development of next generation sequencing (NGS) technologies. Recently developed state-of-the-art k-mer based alignment-free dissimilarity measures including CVTree, $d_2^*$ and $d_2^S$ are more computationally expensive than measures based solely on the k-mer frequencies. Here, we report a standalone software, aCcelerated Alignment-FrEe sequence analysis (CAFE), for efficient calculation of 28 alignment-free dissimilarity measures. CAFE allows for both assembled genome sequences and unassembled NGS shotgun reads as input, and wraps the output in a standard PHYLIP format. In downstream analyses, CAFE can also be used to visualize the pairwise dissimilarity measures, including dendrograms, heatmap, principal coordinate analysis and network display. CAFE serves as a general k-mer based alignment-free analysis platform for studying the relationships among genomes and metagenomes, and is freely available at https://github.com/younglululu/CAFE. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
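
    The abstract contrasts CVTree, $d_2^*$ and $d_2^S$ with cheaper measures "based solely on the k-mer frequencies". A minimal sketch of that simpler family is given below. It is not CAFE's code or interface: the function names are hypothetical, and the Manhattan and Euclidean distances are chosen here only as two common examples of frequency-based dissimilarities.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k):
    """Relative frequency of every DNA k-mer, in lexicographic ACGT order.

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def manhattan(f, g):
    """L1 distance between two k-mer frequency vectors."""
    return sum(abs(a - b) for a, b in zip(f, g))


def euclidean(f, g):
    """L2 distance between two k-mer frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))
```

    A pairwise matrix of such values over a set of genomes is the kind of input that distance-based tree building expects; measures such as $d_2^*$ and $d_2^S$ additionally normalise counts against a background sequence model, which is why they cost more to compute.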

  15. Analysis of surge line MBLOCA sequences with HPSI failed

    Energy Technology Data Exchange (ETDEWEB)

    Queral, C.; Gonzalez-Cadelo, J.; Prada, D.; Montero, J. [Univ. Politecnica de Madrid, Madrid (Spain)]; Martinez-Murillo, J.C. [Almaraz-Trillo (Spain)]; Perez, J. [Consejo de Seguridad Nuclear, Madrid (Spain)]

    2011-07-01

    The main objective of the OECD/NEA ROSA-2 Project is to analyze thermal-hydraulic issues relevant to light water reactor (LWR) safety by using the Large Scale Test Facility (LSTF). As a part of this project, ROSA-2 test1 was performed in LSTF during 2009. This test consists of a double-ended guillotine break in the surge line, i.e. a surge line loss-of-coolant accident (SL-LOCA), with high pressure safety injection (HPSI) failed and the reactor coolant pumps tripped simultaneously with reactor scram. As part of the participation of the Universidad Politecnica de Madrid (UPM) group in the OECD/NEA ROSA-2 and Code Applications and Maintenance Program (CAMP) projects, two tasks related to this test have been performed. In the first task, ROSA-2 test1 was simulated with the TRAC/RELAP Advanced Computational Engine (TRACE) code. In the second, similar sequences in a Westinghouse pressurized water reactor (PWR) were analyzed over a wide range of break sizes, from 0.0254 m to 0.2794 m (1 to 11 inches), together with a sensitivity analysis of the delay before the start of manual depressurization in the secondary side. The results show that sequences with intermediate break sizes, from 0.0508 m to 0.1016 m (2 to 4 inches), have the worst consequences for this kind of sequence. (author)

  16. Molecular phylogeny, population genetics, and evolution of heterocystous cyanobacteria using nifH gene sequences

    Czech Academy of Sciences Publication Activity Database

    Singh, P.; Singh, S. S.; Elster, Josef; Mishra, A. K.

    2013-01-01

    Vol. 250, No. 3 (2013), pp. 751-764. ISSN 0033-183X. Institutional support: RVO:67985939. Keywords: evolution; heterocystous cyanobacteria; nifH gene. Subject RIV: EH - Ecology, Behaviour. Impact factor: 3.171, year: 2013

  17. Multilocus sequence analysis of nectar pseudomonads reveals high genetic diversity and contrasting recombination patterns.

    Directory of Open Access Journals (Sweden)

    Sergio Alvarez-Pérez

    The genetic and evolutionary relationships among floral nectar-dwelling Pseudomonas 'sensu stricto' isolates associated with South African and Mediterranean plants were investigated by multilocus sequence analysis (MLSA) of four core housekeeping genes (rrs, gyrB, rpoB and rpoD). A total of 35 different sequence types were found for the 38 nectar bacterial isolates characterised. Phylogenetic analyses resulted in the identification of three main clades [nectar groups (NGs) 1, 2 and 3] of nectar pseudomonads, which were closely related to five intrageneric groups: Pseudomonas oryzihabitans (NG 1); P. fluorescens, P. lutea and P. syringae (NG 2); and P. rhizosphaerae (NG 3). Linkage disequilibrium analysis pointed to a mostly clonal population structure, even when the analysis was restricted to isolates from the same floristic region or belonging to the same NG. Nevertheless, signatures of recombination were observed for NG 3, which exclusively included isolates retrieved from the floral nectar of insect-pollinated Mediterranean plants. In contrast, the other two NGs comprised both South African and Mediterranean isolates. Analyses relating diversification to floristic region and pollinator type revealed that there has been more unique evolution of the nectar pseudomonads within the Mediterranean region than would be expected by chance. This is the first work analysing the sequence of multiple loci to reveal geno- and ecotypes of nectar bacteria.

  18. Complete sequence and detailed analysis of the first indigenous plasmid from Xanthomonas oryzae pv. oryzicola.

    Science.gov (United States)

    Niu, Xiang-Na; Wei, Zhi-Qiong; Zou, Hai-Fan; Xie, Gui-Gang; Wu, Feng; Li, Kang-Jia; Jiang, Wei; Tang, Ji-Liang; He, Yong-Qiang

    2015-10-24

    Bacterial plasmids have a major impact on metabolic function and adaptation of their hosts. An indigenous plasmid was identified in a Chinese isolate (GX01) of the invasive phytopathogen Xanthomonas oryzae pv. oryzicola (Xoc), the causal agent of rice bacterial leaf streak (BLS). To elucidate the biological functions of the plasmid, we have sequenced and comprehensively annotated the plasmid. The plasmid DNA was extracted from Xoc strain GX01 by alkaline lysis and digested with restriction enzymes. The cloned and subcloned DNA fragments in pUC19 were sequenced by Sanger sequencing. Sequences were assembled by using Sequencher software. Gaps were closed by primer walking and sequencing, and multi-PCRs were conducted through the whole plasmid sequence for verification. BLAST, phylogenetic analysis and dinucleotide calculation were performed for gene annotation and DNA structure analysis. Transformation, transconjugation and stress tolerance tests were carried out for plasmid function assays. The indigenous plasmid from Xoc strain GX01, designated pXOCgx01, is 53,206-bp long and has been annotated to possess 64 open reading frames (ORFs), including genes encoding type IV secretion system, heavy metal exporter, plasmid stability factors, and DNA mobile factors, i.e., the Tn3-like transposon. Bioinformatics analysis showed that pXOCgx01 has a mosaic structure containing different genome contexts with distinct genomic heterogeneities. Phylogenetic analysis indicated that the closest relative of pXOCgx01 is pXAC64 from Xanthomonas axonopodis pv. citri str. 306. It was estimated that there are four copies of pXOCgx01 per cell of Xoc GX01 by PCR assay and the calculation of whole genome shotgun sequencing data. We demonstrate that pXOCgx01 is a self-transmissible plasmid and can replicate in some Xanthomonas spp. strains, but not in Escherichia coli DH5α. It could significantly enhance the tolerance of Xanthomonas oryzae pv. oryzae PXO99A to the stresses of heavy metal

  19. The Design and Analysis of Transposon-Insertion Sequencing Experiments

    Science.gov (United States)

    Chao, Michael C.; Abel, Sören; Davis, Brigid M.; Waldor, Matthew K.

    2016-01-01

    Preface Transposon-insertion sequencing (TIS) is a powerful approach that can be widely applied to genome-wide definition of loci that are required for growth in diverse conditions. However, experimental design choices and stochastic biological processes can heavily influence the results of TIS experiments and affect downstream statistical analysis. Here, we discuss TIS experimental parameters and how these factors relate to the benefits and limitations of the various statistical frameworks that can be applied to computational analysis of TIS data. PMID:26775926

  20. Convergent Evolution of Hemoglobin Function in High-Altitude Andean Waterfowl Involves Limited Parallelism at the Molecular Sequence Level.

    Science.gov (United States)

    Natarajan, Chandrasekhar; Projecto-Garcia, Joana; Moriyama, Hideaki; Weber, Roy E; Muñoz-Fuentes, Violeta; Green, Andy J; Kopuchian, Cecilia; Tubaro, Pablo L; Alza, Luis; Bulgarella, Mariana; Smith, Matthew M; Wilson, Robert E; Fago, Angela; McCracken, Kevin G; Storz, Jay F

    2015-12-01

    A fundamental question in evolutionary genetics concerns the extent to which adaptive phenotypic convergence is attributable to convergent or parallel changes at the molecular sequence level. Here we report a comparative analysis of hemoglobin (Hb) function in eight phylogenetically replicated pairs of high- and low-altitude waterfowl taxa to test for convergence in the oxygenation properties of Hb, and to assess the extent to which convergence in biochemical phenotype is attributable to repeated amino acid replacements. Functional experiments on native Hb variants and protein engineering experiments based on site-directed mutagenesis revealed the phenotypic effects of specific amino acid replacements that were responsible for convergent increases in Hb-O2 affinity in multiple high-altitude taxa. In six of the eight taxon pairs, high-altitude taxa evolved derived increases in Hb-O2 affinity that were caused by a combination of unique replacements, parallel replacements (involving identical-by-state variants with independent mutational origins in different lineages), and collateral replacements (involving shared, identical-by-descent variants derived via introgressive hybridization). In genome scans of nucleotide differentiation involving high- and low-altitude populations of three separate species, function-altering amino acid polymorphisms in the globin genes emerged as highly significant outliers, providing independent evidence for adaptive divergence in Hb function. The experimental results demonstrate that convergent changes in protein function can occur through multiple historical paths, and can involve multiple possible mutations. Most cases of convergence in Hb function did not involve parallel substitutions and most parallel substitutions did not affect Hb-O2 affinity, indicating that the repeatability of phenotypic evolution does not require parallelism at the molecular level.

  1. Genome sequence of the Brown Norway rat yields insights into mammalian evolution

    DEFF Research Database (Denmark)

    Gibbs, Richard A; Weinstock, George M; Metzker, Michael L

    2004-01-01

    The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering ove...

  2. Effect of accretion on the pre-main-sequence evolution of low-mass stars and brown dwarfs

    Science.gov (United States)

    Vorobyov, Eduard I.; Elbakyan, Vardan; Hosokawa, Takashi; Sakurai, Yuya; Guedel, Manuel; Yorke, Harold

    2017-09-01

    Aims: The pre-main-sequence evolution of low-mass stars and brown dwarfs is studied numerically starting from the formation of a protostellar or proto-brown dwarf seed and taking into account the mass accretion onto the central object during the initial several Myr of evolution. Methods: The stellar evolution was computed using the STELLAR evolution code with recent modifications. The mass accretion rates were taken from numerical hydrodynamics models by computing the circumstellar disk evolution starting from the gravitational collapse of prestellar cloud cores of various mass and angular momentum. The resulting stellar evolution tracks were compared with the isochrones and isomasses calculated using non-accreting models. Results: We find that mass accretion in the initial several Myr of protostellar evolution can have a strong effect on the subsequent evolution of young stars and brown dwarfs. The disagreement between accreting and non-accreting models in terms of the total stellar luminosity L∗, stellar radius R∗, and effective temperature Teff depends on the thermal efficiency of accretion, that is, on the fraction of accretion energy that is absorbed by the central object. The largest mismatch is found for the cold accretion case, in which essentially all accretion energy is radiated away. The relative deviations in L∗ and R∗ in this case can reach 50% for objects 1.0 Myr old, and they remain notable even for objects 10 Myr old. In the hot and hybrid accretion cases, in which a constant fraction of accretion energy is absorbed, the disagreement between accreting and non-accreting models becomes less pronounced, but still remains notable for objects 1.0 Myr old. These disagreements may lead to an incorrect age estimate for objects of (sub-)solar mass when using the isochrones that are based on non-accreting models, as has also been noted previously. We find that objects with strong luminosity bursts exhibit notable excursions in the L∗-Teff diagram

  3. Hardware Accelerator for the Multifractal Analysis of DNA Sequences.

    Science.gov (United States)

    Duarte-Sanchez, Jorge E; Velasco-Medina, Jaime; Moreno, Pedro A

    2017-07-24

    Multifractal analysis has made it possible to quantify the genetic variability and non-linear stability along the human genome sequence. It has implications for explaining several genetic diseases caused by chromosome abnormalities, among other genetic particularities. The multifractal analysis of a genome is carried out by dividing the complete DNA sequence into smaller fragments and calculating the generalized dimension spectrum of each fragment using the chaos game representation and the box-counting method. This is a time-consuming process because it involves processing large data sets in floating-point representation. In order to reduce the computation time, we designed an application-specific processor, here called the multifractal processor, which is based on our proposed hardware-oriented algorithm for efficiently calculating the generalized dimension spectrum of DNA sequences. The multifractal processor was implemented on a low-cost SoC-FPGA and was verified by processing a complete human genome. The execution time and numeric results of the multifractal processor were compared with those obtained from the software implementation executed on a 20-core workstation, achieving a speed-up of 2.6x and an average error of 0.0003%.
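
    The software side of the procedure named in this record (chaos game representation followed by box counting) can be sketched in a few lines. This is an illustrative software sketch, not the authors' hardware-oriented algorithm: the corner assignment of the four bases, the grid levels and the function names are assumptions, and the slope-based estimate below is only defined for q != 1.

```python
import numpy as np

# Assumed corner layout of the unit square for the four bases.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def chaos_game_points(seq):
    """Chaos game representation: move halfway to the corner of each base."""
    pos = np.array([0.5, 0.5])
    pts = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        pos = (pos + np.array(CORNERS[base])) / 2.0
        pts.append(pos)
    return np.array(pts).reshape(-1, 2)


def box_masses(points, level):
    """Probability mass of each non-empty box on a 2**level by 2**level grid."""
    n = 2 ** level
    idx = np.minimum((points * n).astype(int), n - 1)
    counts = np.zeros((n, n))
    np.add.at(counts, (idx[:, 0], idx[:, 1]), 1)
    return counts[counts > 0] / len(points)


def generalized_dimension(points, q, levels=(2, 3, 4, 5)):
    """Estimate D_q as the slope of log(sum p_i**q) against log(box size),
    divided by (q - 1). Not valid for q == 1."""
    log_eps = [np.log(2.0 ** -lv) for lv in levels]
    log_z = [np.log((box_masses(points, lv) ** q).sum()) for lv in levels]
    tau = np.polyfit(log_eps, log_z, 1)[0]
    return tau / (q - 1)
```

    Evaluating generalized_dimension over a range of q values for each genome fragment gives the generalized dimension spectrum; the repeated floating-point partition sums over many fragments are the workload the abstract describes as time consuming.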

  4. Targeted DNA methylation analysis by next-generation sequencing.

    Science.gov (United States)

    Masser, Dustin R; Stanford, David R; Freeman, Willard M

    2015-02-24

    The role of epigenetic processes in the control of gene expression has been known for a number of years. DNA methylation at cytosine residues is of particular interest for epigenetic studies as it has been demonstrated to be both a long lasting and a dynamic regulator of gene expression. Efforts to examine epigenetic changes in health and disease have been hindered by the lack of high-throughput, quantitatively accurate methods. With the advent and popularization of next-generation sequencing (NGS) technologies, these tools are now being applied to epigenomics in addition to existing genomic and transcriptomic methodologies. For epigenetic investigations of cytosine methylation where regions of interest, such as specific gene promoters or CpG islands, have been identified and there is a need to examine significant numbers of samples with high quantitative accuracy, we have developed a method called Bisulfite Amplicon Sequencing (BSAS). This method combines bisulfite conversion with targeted amplification of regions of interest, transposome-mediated library construction and benchtop NGS. BSAS offers a rapid and efficient method for analysis of up to 10 kb of targeted regions in up to 96 samples at a time that can be performed by most research groups with basic molecular biology skills. The results provide absolute quantitation of cytosine methylation with base specificity. BSAS can be applied to any genomic region from any DNA source. This method is useful for hypothesis testing studies of target regions of interest as well as confirmation of regions identified in genome-wide methylation analyses such as whole genome bisulfite sequencing, reduced representation bisulfite sequencing, and methylated DNA immunoprecipitation sequencing.

  5. Genome sequence and comparative analysis of a putative entomopathogenic Serratia isolated from Caenorhabditis briggsae.

    Science.gov (United States)

    Abebe-Akele, Feseha; Tisa, Louis S; Cooper, Vaughn S; Hatcher, Philip J; Abebe, Eyualem; Thomas, W Kelley

    2015-07-18

    Entomopathogenic associations between nematodes in the genera Steinernema and Heterorhabditis with their cognate bacteria from the bacterial genera Xenorhabdus and Photorhabdus, respectively, are extensively studied for their potential as biological control agents against invasive insect species. These two highly coevolved associations are the result of convergent evolution. Given the natural abundance of bacteria, nematodes and insects, it is surprising that only these two associations, with no intermediate forms, are widely studied in the entomopathogenic context. Discovering analogous systems involving novel bacterial and nematode species would shed light on the evolutionary processes involved in the transition from free-living organisms to obligatory partners in entomopathogenicity. We report the complete genome sequence of a new member of the enterobacterial genus Serratia that forms a putative entomopathogenic complex with Caenorhabditis briggsae. Analysis of the 5.04 Mb chromosomal genome predicts 4599 protein-coding genes, seven sets of ribosomal RNA genes, 84 tRNA genes and a 64.8 kb plasmid encoding 74 genes. Comparative genomic analysis with three previously sequenced Serratia strains, S. marcescens DB11, S. proteamaculans 568 and Serratia sp. AS12, revealed that these four representatives of the genus share a core set of ~3100 genes and extensive structural conservation. The newly identified species shares a more recent common ancestor with S. marcescens, with 99% sequence identity in rDNA sequence and orthology across 85.6% of predicted genes. Of the 39 genes/operons implicated in virulence, symbiosis, recolonization, immune evasion and bioconversion, 21 (53.8%) were present in Serratia, while 33 (84.6%) and 35 (89%) were present in Xenorhabdus and Photorhabdus EPN bacteria, respectively. The majority of unique sequences in Serratia sp. SCBI (South African Caenorhabditis briggsae Isolate) are found in ~29 genomic islands of 5 to 65 genes and are

  6. Ensemble analysis of adaptive compressed genome sequencing strategies

    Science.gov (United States)

    2014-01-01

    Background: Acquiring genomes at single-cell resolution has many applications, such as in the study of microbiota. However, deep sequencing and assembly of all of the millions of cells in a sample is prohibitively costly. A property that can come to the rescue is that deep sequencing of every cell should not be necessary to capture all distinct genomes, as the majority of cells are biological replicates. Biologically important samples are often sparse in that sense. In this paper, we propose an adaptive compressed method, also known as distilled sensing, to capture all distinct genomes in a sparse microbial community with reduced sequencing effort. As opposed to group testing, in which the number of distinct events is often constant and sparsity is equivalent to rarity of an event, sparsity in our case means scarcity of distinct events in comparison to the data size. Previously, we introduced the problem and proposed a distilled sensing solution based on the breadth-first search strategy. We simulated the whole process, which constrained our ability to study the behavior of the algorithm for the entire ensemble due to its computational intensity. Results: In this paper, we modify our previous breadth-first search strategy and introduce the depth-first search strategy. Instead of simulating the entire process, which is intractable for a large number of experiments, we provide a dynamic programming algorithm to analyze the behavior of the method for the entire ensemble. The ensemble analysis algorithm recursively calculates the probability of capturing every distinct genome and also the expected total sequenced nucleotides for a given population profile. Our results suggest that the expected total sequenced nucleotides grows in proportion to the logarithm of the number of cells and linearly with the number of distinct genomes. The probability of missing a genome depends on its abundance and the ratio of its size over the maximum genome size in the sample. The modified resource

  7. Sequence Comparisons of Odorant Receptors among Tortricid Moths Reveal Different Rates of Molecular Evolution among Family Members

    Science.gov (United States)

    Carraher, Colm; Authier, Astrid; Steinwender, Bernd; Newcomb, Richard D.

    2012-01-01

    In insects, odorant receptors detect volatile cues involved in behaviours such as mate recognition, food location and oviposition. We have investigated the evolution of three odorant receptors from five species within the moth genera Ctenopseustis and Planotortrix, family Tortricidae, which fall into distinct clades within the odorant receptor multigene family. One receptor is the orthologue of the co-receptor Or83b, now known as Orco (OR2), and encodes the obligate ion channel subunit of the receptor complex. In comparison, the other two receptors, OR1 and OR3, are ligand-binding receptor subunits, activated by volatile compounds produced by plants - methyl salicylate and citral, respectively. Rates of sequence evolution at non-synonymous sites were significantly higher in OR1 compared with OR2 and OR3. Within the dataset OR1 contains 109 variable amino acid positions that are distributed evenly across the entire protein including transmembrane helices, loop regions and termini, while OR2 and OR3 contain 18 and 16 variable sites, respectively. OR2 shows a high level of amino acid conservation as expected due to its essential role in odour detection; however we found unexpected differences in the rate of evolution between two ligand-binding odorant receptors, OR1 and OR3. OR3 shows high sequence conservation suggestive of a conserved role in odour reception, whereas the higher rate of evolution observed in OR1, particularly at non-synonymous sites, may be suggestive of relaxed constraint, perhaps associated with the loss of an ancestral role in sex pheromone reception. PMID:22701634
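
    Counts of variable amino acid positions, such as those reported above for OR1, OR2 and OR3, can be obtained from a protein alignment by looking for columns that contain more than one residue. The sketch below is one simple way to do this. The function name, the choice to ignore gap characters, and the toy alignment are assumptions, and the sequences shown are not data from the study.

```python
def variable_positions(aligned_seqs, gap="-"):
    """Return the 0-based alignment columns that contain more than one residue."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    variable = []
    for i in range(length):
        # Gaps are ignored, so a column with one residue plus gaps is not variable.
        residues = {s[i] for s in aligned_seqs if s[i] != gap}
        if len(residues) > 1:
            variable.append(i)
    return variable


# Toy alignment of three hypothetical protein fragments
toy = ["MKTLLV", "MKSLLV", "MRTLLI"]
toy_variable = variable_positions(toy)
```

    The length of the returned list is the number of variable sites; comparing it across receptors gives the kind of contrast described in the abstract.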

  8. Sequence comparisons of odorant receptors among tortricid moths reveal different rates of molecular evolution among family members.

    Directory of Open Access Journals (Sweden)

    Colm Carraher

    In insects, odorant receptors detect volatile cues involved in behaviours such as mate recognition, food location and oviposition. We have investigated the evolution of three odorant receptors from five species within the moth genera Ctenopseustis and Planotortrix, family Tortricidae, which fall into distinct clades within the odorant receptor multigene family. One receptor is the orthologue of the co-receptor Or83b, now known as Orco (OR2), and encodes the obligate ion channel subunit of the receptor complex. In comparison, the other two receptors, OR1 and OR3, are ligand-binding receptor subunits, activated by volatile compounds produced by plants - methyl salicylate and citral, respectively. Rates of sequence evolution at non-synonymous sites were significantly higher in OR1 compared with OR2 and OR3. Within the dataset OR1 contains 109 variable amino acid positions that are distributed evenly across the entire protein including transmembrane helices, loop regions and termini, while OR2 and OR3 contain 18 and 16 variable sites, respectively. OR2 shows a high level of amino acid conservation as expected due to its essential role in odour detection; however we found unexpected differences in the rate of evolution between two ligand-binding odorant receptors, OR1 and OR3. OR3 shows high sequence conservation suggestive of a conserved role in odour reception, whereas the higher rate of evolution observed in OR1, particularly at non-synonymous sites, may be suggestive of relaxed constraint, perhaps associated with the loss of an ancestral role in sex pheromone reception.

  9. Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species.

    Science.gov (United States)

    Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y

    2014-04-08

    The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.

  10. Genome sequence analysis of Solanum lycopersicum showing the phylogenetic relationship based on multiple sequence alignment and conserved domain proteins.

    OpenAIRE

    Uma kumari; Ashok kumar choudhary

    2016-01-01

    Phylogenetic analysis has become essential in researching evolutionary relationships based on sequence alignments and conserved domain proteins. Evolutionary relationships are identified from open reading frames rather than from complete sequences. A reading frame is a set of consecutive, non-overlapping triplets of nucleotides. The National Center for Biotechnology Information (NCBI) provides many tools for comparing database-stored nucleotide or protein sequences, i...
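
    The definition of a reading frame given above, consecutive non-overlapping triplets of nucleotides, can be made concrete with a short sketch. The function name and the example sequence are hypothetical and chosen only for illustration.

```python
def codons(seq, frame=0):
    """Split a nucleotide string into consecutive, non-overlapping triplets.

    `frame` is the 0-based offset (0, 1 or 2) of the first triplet.
    Trailing bases that do not fill a complete triplet are dropped.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


# The same sequence read in its three forward frames
example = "ATGGCCTAA"
frames = {f: codons(example, f) for f in (0, 1, 2)}
```

    Only frame 0 of this example starts with ATG and ends with the stop codon TAA, which is why the choice of frame matters when an open reading frame is identified.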

  11. Sequence Analysis of SSR-Flanking Regions Identifies Genome Affinities between Pasture Grass Fungal Endophyte Taxa

    Directory of Open Access Journals (Sweden)

    Eline van Zijll de Jong

    2011-01-01

    Fungal species of the Neotyphodium and Epichloë genera are endophytes of pasture grasses showing complex differences of life-cycle and genetic architecture. Simple sequence repeat (SSR) markers have been developed from endophyte-derived expressed sequence tag (EST) collections. Although SSR array size polymorphisms are appropriate for phenetic analysis to distinguish between taxa, the capacity to resolve phylogenetic relationships is limited by both homoplasy and heteroploidy effects. In contrast, nonrepetitive sequence regions that flank SSRs have been effectively implemented in this study to demonstrate a common evolutionary origin of grass fungal endophytes. Consistent patterns of relationships between specific taxa were apparent across multiple target loci, confirming previous studies of genome evolution based on variation of individual genes. Evidence was obtained for the definition of endophyte taxa not only through genomic affinities but also by relative gene content. Results were compatible with the current view that some asexual Neotyphodium species arose following interspecific hybridisation between sexual Epichloë ancestors. Phylogenetic analysis of SSR-flanking regions, in combination with the results of previous studies with other EST-derived SSR markers, further permitted characterisation of Neotyphodium isolates that could not be assigned to known taxa on the basis of morphological characteristics.

  12. De novo Ixodes ricinus salivary gland transcriptome analysis using two next-generation sequencing methodologies

    Science.gov (United States)

    Schwarz, Alexandra; von Reumont, Björn M.; Erhart, Jan; Chagas, Andrezza C.; Ribeiro, José M. C.; Kotsyfakis, Michalis

    2013-01-01

    Tick salivary gland (SG) proteins possess powerful pharmacologic properties that facilitate tick feeding and pathogen transmission. For the first time, SG transcriptomes of Ixodes ricinus, an important disease vector for humans and animals, were analyzed using next-generation sequencing. SGs were collected from different tick life stages fed on various animal species, including cofeeding of nymphs and adults on the same host. Four cDNA samples were sequenced, discriminating tick SG transcriptomes of early- and late-feeding nymphs or adults. In total, 441,381 454-pyrosequencing reads and 67,703,183 Illumina reads were assembled into 272,220 contigs, of which 34,560 extensively annotated coding sequences are disclosed; 8686 coding sequences were submitted to GenBank. Overall, 13% of contigs were classified as secreted proteins that showed significant differences in the transcript representation among the 4 SG samples, including high numbers of sample-specific transcripts. Detailed phylogenetic reconstructions of two relatively abundant SG-secreted protein families demonstrated how this study improves our understanding of the molecular evolution of hematophagy in arthropods. Our data significantly increase the available genomic information for I. ricinus and form a solid basis for future tick genome/transcriptome assemblies and the functional analysis of effectors that mediate the feeding physiology and parasite-vector interaction of I. ricinus. Schwarz, A., von Reumont, B.M., Erhart, J., Chagas, A.C., Ribeiro, J.M.C., Kotsyfakis, M. De novo Ixodes ricinus salivary gland transcriptome analysis using two next-generation sequencing methodologies. PMID:23964076

  13. Sequence and structural evolution of the KsgA/Dim1 methyltransferase family

    Directory of Open Access Journals (Sweden)

    Rife Jason P

    2008-10-01

    Background: One of the 60 or so genes conserved in all domains of life is the ksgA/dim1 orthologous group. Enzymes from this family perform the same post-transcriptional nucleotide modification in ribosome biogenesis, irrespective of organism. Despite this common function, divergence has enabled some family members to adopt new and sometimes radically different functions. For example, in S. cerevisiae Dim1 performs two distinct functions in ribosome biogenesis, while human mtTFB is not only an rRNA methyltransferase in the mitochondria but also a mitochondrial transcription factor. Thus, these proteins offer an unprecedented opportunity to study evolutionary aspects of structure/function relationships, especially with respect to our recently published work on the binding mode of a KsgA family member to its 30S subunit substrate. Here we compare and contrast KsgA orthologs from bacteria, eukaryotes, and mitochondria as well as the paralogous ErmC enzyme. Results: By using structure and sequence comparisons in concert with a unified ribosome binding model, we have identified regions of the orthologs that are likely related to gains of function beyond the common methyltransferase function. There are core regions common to the entire enzyme class that are associated with ribosome binding, an event required in rRNA methylation activity, and regions that are conserved in subgroups that are presumably related to non-methyltransferase functions. Conclusion: The ancient protein KsgA/Dim1 has adapted to cellular roles beyond that of merely an rRNA methyltransferase. These results provide a structural foundation for analysis of multiple aspects of ribosome biogenesis and mitochondrial transcription.

  14. Assembly and comparative analysis of complete mitochondrial genome sequence of an economic plant Salix suchowensis

    Directory of Open Access Journals (Sweden)

    Ning Ye

    2017-03-01

    Willow is a widely used dioecious woody plant of the Salicaceae family in China. Due to their high biomass yields, willows are promising bioenergy crops. In this study, we assembled the complete mitochondrial (mt) genome sequence of S. suchowensis, with a length of 644,437 bp, using Roche-454 GS FLX Titanium sequencing technology. The base composition of the S. suchowensis mt genome is A (27.43%), T (27.59%), C (22.34%), and G (22.64%), giving a GC content comparable to that of other angiosperms. This long circular mt genome encodes 58 unique genes (32 protein-coding genes, 23 tRNA genes and 3 rRNA genes), and 9 of the 32 protein-coding genes contain 17 introns. Phylogenetic analysis of 35 species based on 23 protein-coding genes supports Salix as sister to Populus. With the detailed phylogenetic information and the identification of phylogenetic position, some ribosomal protein genes and succinate dehydrogenase genes were found to be frequently lost during evolution. As S. suchowensis is a native shrub willow species, this study of its mt genome will provide valuable information for better understanding genomic breeding and the missing pieces of the evolution of sex determination in the future.
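
    The base composition figures above imply a GC content of 22.34% + 22.64% = 44.98%. The sketch below shows how such percentages are typically computed from a sequence and repeats that arithmetic on the reported values. The function names are assumptions, and ambiguous bases such as N are simply ignored here.

```python
from collections import Counter


def base_composition(seq):
    """Percentage of A, C, G and T among the unambiguous bases of a sequence."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def gc_content(seq):
    """GC content in percent."""
    comp = base_composition(seq)
    return comp["G"] + comp["C"]


# Values reported in the abstract for the S. suchowensis mt genome
reported = {"A": 27.43, "T": 27.59, "C": 22.34, "G": 22.64}
reported_gc = round(reported["G"] + reported["C"], 2)
```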

  15. Illuminating the evolution of equids and rodents with next-generation sequencing of ancient specimens

    DEFF Research Database (Denmark)

    Mouatt, Julia Thidamarth Vilstrup

    The sequencing of ancient DNA provides perspectives on the genetic history of past populations and extinct species. However, ancient DNA research presents specific limitations mostly due to DNA survival, damage and contamination. Yet with stringent laboratory procedures, the sensitivity of target...... enrichment methods and the massive throughput and latest advances within DNA sequencing, the field of ancient DNA has flourished in later years. Those advances have even enabled the sequencing of complete genomes from the past, moving the field into genomic sciences. In this thesis we have used these latest...... developments within ancient DNA research, including target enrichment capture and Next-Generation Sequencing, to address a range of evolutionary questions related to two major mammalian groups, equids and rodents. In particular we have resolved phylogenetic relationships within equids using complete mitochond...

  16. Mulan: multiple-sequence local alignment and visualization for studying function and evolution

    National Research Council Canada - National Science Library

    Ovcharenko, Ivan; Loots, Gabriela G; Giardine, Belinda M; Hou, Minmei; Ma, Jian; Hardison, Ross C; Stubbs, Lisa; Miller, Webb

    2005-01-01

    .... Here we introduce Mulan (http://mulan.dcode.org/), a novel method and a network server for comparing multiple draft and finished-quality sequences to identify functional elements conserved over evolutionary time...

  17. Evolution of germline-limited sequences in two populations of the ciliate Chilodonella uncinata.

    Science.gov (United States)

    Zufall, Rebecca A; Sturm, Mariel; Mahon, Brian C

    2012-04-01

    Ciliates are microbial eukaryotes that separate their nuclear functions into a germline micronucleus and a somatic macronucleus. During development of the macronucleus the genome undergoes a series of reorganization events that includes the precise excision of intervening DNA. Here, we determine the architecture of four loci in the micronuclear and macronuclear genomes of the ciliate Chilodonella uncinata and compare the levels of variation in micronuclear-limited sequences to macronuclear-destined sequences at two of these loci. We find that within a population, germline-limited sequences are evolving at the same rate as other putatively neutral sites, but between populations germline-limited sequences are accumulating mutations at a much faster rate than other sites. We also find evidence of macronuclear recombination and incomplete elimination of intervening DNA, which result in increased diversity in the macronuclear genome. Our results support the assertion that the unusual genomic features of ciliates can result in rapid and unpredicted patterns of diversification.

  18. Novel arenavirus sequences in Hylomyscus sp. and Mus (Nannomys) setulosus from Côte d'Ivoire: implications for evolution of arenaviruses in Africa.

    Directory of Open Access Journals (Sweden)

    David Coulibaly-N'Golo

    This study aimed to identify new arenaviruses and gather insights into the evolution of arenaviruses in Africa. During 2003 through 2005, 1,228 small mammals representing 14 different genera were trapped in 9 villages in south, east, and middle west of Côte d'Ivoire. Specimens were screened by pan-Old World arenavirus RT-PCRs targeting S and L RNA segments as well as immunofluorescence assay. Sequences of two novel tentative species of the family Arenaviridae, Menekre and Gbagroube virus, were detected in Hylomyscus sp. and Mus (Nannomys) setulosus, respectively. Arenavirus infection of Mus (Nannomys) setulosus was also demonstrated by serological testing. Lassa virus was not found, although 60% of the captured animals were Mastomys natalensis. Complete S RNA and partial L RNA sequences of the novel viruses were recovered from the rodent specimens and subjected to phylogenetic analysis. Gbagroube virus is a closely related sister taxon of Lassa virus, while Menekre virus clusters with the Ippy/Mobala/Mopeia virus complex. Reconstruction of possible virus-host co-phylogeny scenarios suggests that, within the African continent, signatures of co-evolution might have been obliterated by multiple host-switching events.

  19. Evolution of analysis of polyphenols from grapes, wines, and extracts.

    Science.gov (United States)

    Lorrain, Bénédicte; Ky, Isabelle; Pechamat, Laurent; Teissedre, Pierre-Louis

    2013-01-16

    Grape and wine phenolics are structurally diverse, from simple molecules to oligomers and polymers usually designated as tannins. They have an important impact on the organoleptic properties of wines, which is why their analysis and quantification are of primary importance. The extraction of phenolics from grapes and from wines is the first step in the analysis. Several analytical methods have then been developed for the determination of total phenolic content, while chromatographic and spectrophotometric analyses are continuously improved in order to achieve adequate separation of phenolic molecules and their subsequent identification and quantification. This review provides a summary of the evolution of the analysis of polyphenols from grapes, wines and extracts.

  20. Monophyly of clade III nematodes is not supported by phylogenetic analysis of complete mitochondrial genome sequences

    Directory of Open Access Journals (Sweden)

    Min Gi-Sik

    2011-08-01

    a distinct clade, however, in one case Oxyurida is sister to Spirurida. Ascaridida, Oxyurida, and Spirurida (the sampled clade III taxa) do not form a monophyletic group based on complete mitochondrial DNA sequences. Tree topology tests revealed that constraining clade III taxa to be monophyletic, given the mtDNA datasets analyzed, was a significantly worse result. Conclusion The phylogenetic hypotheses from comparative analysis of the complete mitochondrial genome data (analysis of nucleotide and amino acid datasets, and nucleotide data excluding 3rd positions) indicates that nematodes representing Ascaridida, Oxyurida and Spirurida do not share an exclusive most recent common ancestor, in contrast to published results based on nuclear ribosomal DNA. Overall, mtDNA genome data provides reliable support for nematode relationships that often corroborates findings based on nuclear rDNA. It is anticipated that additional taxonomic sampling will provide a wealth of information on mitochondrial genome evolution and sequence data for developing phylogenetic hypotheses for the phylum Nematoda.

  1. Monophyly of clade III nematodes is not supported by phylogenetic analysis of complete mitochondrial genome sequences

    Science.gov (United States)

    2011-01-01

    , however, in one case Oxyurida is sister to Spirurida. Ascaridida, Oxyurida, and Spirurida (the sampled clade III taxa) do not form a monophyletic group based on complete mitochondrial DNA sequences. Tree topology tests revealed that constraining clade III taxa to be monophyletic, given the mtDNA datasets analyzed, was a significantly worse result. Conclusion The phylogenetic hypotheses from comparative analysis of the complete mitochondrial genome data (analysis of nucleotide and amino acid datasets, and nucleotide data excluding 3rd positions) indicates that nematodes representing Ascaridida, Oxyurida and Spirurida do not share an exclusive most recent common ancestor, in contrast to published results based on nuclear ribosomal DNA. Overall, mtDNA genome data provides reliable support for nematode relationships that often corroborates findings based on nuclear rDNA. It is anticipated that additional taxonomic sampling will provide a wealth of information on mitochondrial genome evolution and sequence data for developing phylogenetic hypotheses for the phylum Nematoda. PMID:21813000

  2. Evolution and comparative analysis of the MHC Class III inflammatory region

    Directory of Open Access Journals (Sweden)

    Speed Terence P

    2006-11-01

    Background: The Major Histocompatibility Complex (MHC) is essential for immune function. Historically, it has been subdivided into three regions (Class I, II, and III), but a cluster of functionally related genes within the Class III region has also been referred to as the Class IV region or "inflammatory region". This group of genes is involved in the inflammatory response, and includes members of the tumour necrosis family. Here we report the sequencing, annotation and comparative analysis of a tammar wallaby BAC containing the inflammatory region. We also discuss the extent of sequence conservation across the entire region and identify elements conserved in evolution. Results: Fourteen Class III genes from the tammar wallaby inflammatory region were characterised and compared to their orthologues in other vertebrates. The organisation and sequence of genes in the inflammatory region of both the wallaby and South American opossum are highly conserved compared to known genes from eutherian ("placental") mammals. Some minor differences separate the two marsupial species. Eight genes within the inflammatory region have remained tightly clustered for at least 360 million years, predating the divergence of the amphibian lineage. Analysis of sequence conservation identified 354 elements that are conserved. These range in size from 7 to 431 bases and cover 15.6% of the inflammatory region, representing approximately a 4-fold increase compared to the average for vertebrate genomes. About 5.5% of this conserved sequence is marsupial-specific, including three cases of marsupial-specific repeats. Highly Conserved Elements were also characterised. Conclusion: Using comparative analysis, we show that a cluster of MHC genes involved in inflammation, including TNF, LTA (or its putative teleost homolog TNF-N), APOM, and BAT3, have remained together for over 450 million years, predating the divergence of mammals from fish. The observed enrichment in conserved

  3. Linear discriminant analysis of character sequences using occurrences of words

    KAUST Repository

    Dutta, Subhajit

    2014-02-01

    Classification of character sequences, where the characters come from a finite set, arises in disciplines such as molecular biology and computer science. For discriminant analysis of such character sequences, the Bayes classifier based on Markov models turns out to have class boundaries defined by linear functions of occurrences of words in the sequences. It is shown that for such classifiers based on Markov models with unknown orders, if the orders are estimated from the data using cross-validation, the resulting classifier has Bayes risk consistency under suitable conditions. Even when Markov models are not valid for the data, we develop methods for constructing classifiers based on linear functions of occurrences of words, where the word length is chosen by cross-validation. Such linear classifiers are constructed using ideas of support vector machines, regression depth, and distance weighted discrimination. We show that classifiers with linear class boundaries have certain optimal properties in terms of their asymptotic misclassification probabilities. The performance of these classifiers is demonstrated in various simulated and benchmark data sets.
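
    The statement that Markov-model classifiers have boundaries linear in word occurrences follows from the form of the likelihood. Under a Markov model of order k-1, the log-likelihood of a sequence is a sum over its overlapping words of length k of the log transition probability of the last character given the preceding k-1. The log-likelihood ratio between two classes is therefore a weighted sum of word counts plus a constant. The sketch below illustrates this for known transition probabilities. The function names and the toy probabilities are assumptions, and the estimation of the order by cross-validation described in the abstract is not shown.

```python
import math
from collections import Counter


def word_counts(seq, k):
    """Count overlapping words of length k in a character sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov_log_odds_weights(p_class1, p_class2):
    """Weight of each k-word in the log-likelihood ratio of two Markov models.

    Each dict maps a word w to P(last character of w | first k-1 characters of w)
    under the corresponding class.
    """
    return {w: math.log(p_class1[w]) - math.log(p_class2[w]) for w in p_class1}


def linear_score(seq, weights, k, bias=0.0):
    """Linear function of word occurrences; a positive score favours class 1.

    `bias` stands in for the initial-state and class-prior terms.
    """
    counts = word_counts(seq, k)
    return bias + sum(weights.get(w, 0.0) * n for w, n in counts.items())


# Toy first-order models over the alphabet {a, b}, keyed by 2-letter words
p1 = {"aa": 0.9, "ab": 0.1, "ba": 0.5, "bb": 0.5}
p2 = {"aa": 0.5, "ab": 0.5, "ba": 0.5, "bb": 0.5}
weights = markov_log_odds_weights(p1, p2)
score = linear_score("aaab", weights, k=2)
```

    For the sequence "aaab" the score is twice the weight of "aa" plus the weight of "ab", which shows directly that only the word counts enter the decision.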

  4. Lytic enzyme discovery through multigenomic sequence analysis in Clostridium perfringens.

    Science.gov (United States)

    Schmitz, Jonathan E; Ossiprandi, Maria Cristina; Rumah, Kareem R; Fischetti, Vincent A

    2011-03-01

    With their ability to lyse Gram-positive bacteria, phage lytic enzymes (or lysins) have received a great deal of attention as novel anti-infective agents. The number of known genes encoding these peptidoglycan hydrolases has increased markedly in recent years, due in large part to advances in DNA sequencing technology. As the genomes of more and more bacterial species/strains are sequenced, lysin-encoding open reading frames (ORFs) can be readily identified in lysogenized prophage regions. In the current study, we sought to assess lysin diversity for the medically relevant pathogen Clostridium perfringens. The sequenced genomes of nine C. perfringens strains were computationally mined for prophage lysins and lysin-like ORFs, revealing several dozen proteins of various enzymatic classes. Of these lysins, a muramidase from strain ATCC 13124 (termed PlyCM) was chosen for recombinant analysis based on its dissimilarity to previously characterized C. perfringens lysins. Following expression and purification, various biochemical properties of PlyCM were determined in vitro, including pH/salt-dependence and temperature stability. The enzyme exhibited activity at low μg/ml concentrations, a typical value for phage lysins. It was active against 23 of 24 strains of C. perfringens tested, with virtually no activity against other clostridial or non-clostridial species. Overall, PlyCM shows potential for development as an enzybiotic agent, demonstrating how expanding genomic databases can serve as rich pools for biotechnologically relevant proteins.

  5. Pan-Cancer Analysis of Genomic Sequencing Among the Elderly.

    Science.gov (United States)

    Wahl, Daniel R; Nguyen, Paul L; Santiago, Maria; Yousefi, Kasra; Davicioni, Elai; Shumway, Dean A; Speers, Corey; Mehra, Rohit; Feng, Felix Y; Osborne, Joseph R; Spratt, Daniel E

    2017-07-15

    We hypothesized that elderly patients might have age-specific genetic abnormalities yet be underrepresented in currently available sequencing repositories, which could limit the effect of sequencing efforts for this population. Leveraging The Cancer Genome Atlas (TCGA) data portal, 9 tumor types were analyzed. The frequency distribution of cancer by age was determined and compared with Surveillance, Epidemiology, and End Results data. Using the estimated median somatic mutational frequency of each tumor type, the samples needed beyond TCGA to detect a 10% mutational frequency were calculated. Microarray data from a separate prospective cohort were obtained from primary prostatectomy samples to determine whether elderly-specific transcriptomic alterations could be identified. Of the 5236 TCGA samples, 73% were from patients aged elderly patients with cancer were likely to harbor age-specific molecular abnormalities, we accessed transcriptomic data from a separate, larger database of >2000 prostate cancer samples. That analysis revealed significant differences in the expression of 10 genes in patients aged ≥70 years compared with those Elderly patients have been underrepresented in genomic sequencing studies. Our data suggest the presence of elderly-specific molecular alterations. Further dedicated efforts to understand the biology of cancer among the elderly will be important moving forward. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Obstruent-sonorant consonant sequences-Analysis by synthesis

    Science.gov (United States)

    Mou, Xiaomin

    2004-05-01

    The goal of this work is to develop principles of overlapping gestures in obstruent-sonorant sequences in the word-initial position and sonorant-obstruent sequences in the word-final position. Consonant clusters such as sm in small are phonetically represented as a sequence of individual elements, but the exact perceptual representation is unclear. The modification during the production of these overlapping gestures may be driven partly by perceptual salience and partly by vocal tract aerodynamics. When two consonants occur next to each other, the same gestures may be made as for only one consonant. The aerodynamics of the vocal tract may account for the modification in the timing of the articulators during production and this modification can be incorporated as rules into HLsyn, a higher-level quasiarticulatory speech synthesizer that takes as inputs the pressures and the flows of the vocal tract. Acoustic information extracted from the speech waveform is mapped into inputs for HLsyn. This analysis by synthesis approach is a method to develop a more precise picture of the planning stage during speech production where the acoustic phonetics must be carefully planned and modified to achieve the correct target sounds. [Work funded by a grant provided by NIH.]

  7. Integrated visual analysis of protein structures, sequences, and feature data.

    Science.gov (United States)

    Stolte, Christian; Sabir, Kenneth S; Heinrich, Julian; Hammang, Christopher J; Schafferhans, Andrea; O'Donoghue, Seán I

    2015-01-01

    To understand the molecular mechanisms that give rise to a protein's function, biologists often need to (i) find and access all related atomic-resolution 3D structures, and (ii) map sequence-based features (e.g., domains, single-nucleotide polymorphisms, post-translational modifications) onto these structures. To streamline these processes we recently developed Aquaria, a resource offering unprecedented access to protein structure information based on an all-against-all comparison of SwissProt and PDB sequences. In this work, we provide a requirements analysis for several frequently occurring tasks in molecular biology and describe how design choices in Aquaria meet these requirements. Finally, we show how the interface can be used to explore features of a protein and gain biologically meaningful insights in two case studies conducted by domain experts. The user interface design of Aquaria enables biologists to gain unprecedented access to molecular structures and simplifies the generation of insight. The tasks involved in mapping sequence features onto structures can be conducted more easily and quickly using Aquaria.

  8. Sequence analysis of the Legionella micdadei groELS operon

    DEFF Research Database (Denmark)

    Hindersson, P; Høiby, N; Bangsborg, Jette Marie

    1991-01-01

    A 2.7 kb DNA fragment encoding the 60 kDa common antigen (CA) and a 13 kDa protein of Legionella micdadei was sequenced. Two open reading frames of 57,677 and 10,456 Da were identified, corresponding to the heat shock proteins GroEL and GroES, respectively. Typical -35, -10, and Shine-Dalgarno heat......, Western blot analysis with an L. micdadei specific anti-groEL antibody did not reveal a significant increase in the amount of the GroEL protein during heat shock in L. micdadei or in the recombinant E. coli expressing L. micdadei GroEL....

  9. Determining physical constraints in transcriptional initiation complexes using DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Shultzaberger, Ryan K.; Chiang, Derek Y.; Moses, Alan M.; Eisen, Michael B.

    2007-07-01

    Eukaryotic gene expression is often under the control of cooperatively acting transcription factors whose binding is limited by structural constraints. By determining these structural constraints, we can understand the "rules" that define functional cooperativity. Conversely, by understanding the rules of binding, we can infer structural characteristics. We have developed an information theory based method for approximating the physical limitations of cooperative interactions by comparing sequence analysis to microarray expression data. When applied to the coordinated binding of the sulfur amino acid regulatory protein Met4 by Cbf1 and Met31, we were able to create a combinatorial model that can correctly identify Met4 regulated genes.

  10. A Critical Analysis of Chromotherapy and Its Scientific Evolution

    OpenAIRE

    Azeemi, Samina T. Yousuf; Raza, Mohsin

    2005-01-01

    Chromotherapy is a method of treatment that uses the visible spectrum (colors) of electromagnetic radiation to cure diseases. It is a centuries-old concept used successfully over the years to cure various diseases. We have undertaken a critical analysis of chromotherapy and documented its scientific evolution to date. A few researchers have tried to discover the underlying scientific principles, but without quantitative study. Sufficient published material can be found about the subject t...

  11. Functional Analysis and Evolution Equations Dedicated to Gunter Lumer

    CERN Document Server

    Amann, Herbert; Hieber, Matthias

    2008-01-01

    Günter Lumer was an outstanding mathematician whose work has great influence on the research community in mathematical analysis and evolution equations. He was at the origin of the breath-taking development the theory of semigroups saw after the pioneering book of Hille and Phillips of 1957. This volume contains invited contributions presenting the state of the art of these topics and reflecting the broad interests of Günter Lumer.

  12. EVOLUTION AND PERFORMANCE ANALYSIS FOR WINE ENTITIES IN ROMANIA

    OpenAIRE

    Dan Topor; Sorinel Capusneanu; Alina Putan

    2012-01-01

    The article aims to highlight the evolution of wine entities in Romania and their performance. Given the state of research conducted in the literature on performance measurement and analysis of the entities from the wine sector and the achievements of specialists, the authors of this article demonstrate the importance of using the method of variable costs in terms of its specific indicators and making any decisions based on information provided by them. The article ends with the authors' conclusions ...

  13. Comprehensive analysis of expressed sequence tags from cultivated and wild radish (Raphanus spp.).

    Science.gov (United States)

    Shen, Di; Sun, Honghe; Huang, Mingyun; Zheng, Yi; Qiu, Yang; Li, Xixiang; Fei, Zhangjun

    2013-10-21

    Radish (Raphanus sativus L., 2n = 2× = 18) is an economically important vegetable crop worldwide. A large collection of radish expressed sequence tags (ESTs) has been generated but remains largely uncharacterized. In this study, approximately 315,000 ESTs derived from 22 Raphanus cDNA libraries from 18 different genotypes were analyzed, for the purpose of gene and marker discovery and to evaluate large-scale genome duplication and phylogenetic relationships among Raphanus spp. The ESTs were assembled into 85,083 unigenes, of which 90%, 65%, 89% and 89% had homologous sequences in the GenBank nr, SwissProt, TrEMBL and Arabidopsis protein databases, respectively. A total of 66,194 (78%) could be assigned at least one gene ontology (GO) term. Comparative analysis identified 5,595 gene families unique to radish that were significantly enriched with genes related to small molecule metabolism, as well as 12,899 specific to the Brassicaceae that were enriched with genes related to seed oil body biogenesis and responses to phytohormones. The analysis further indicated that the divergence of radish and Brassica rapa occurred approximately 8.9-14.9 million years ago (MYA), following a whole-genome duplication event (12.8-21.4 MYA) in their common ancestor. An additional whole-genome duplication event in radish occurred at 5.1-8.4 MYA, after its divergence from B. rapa. A total of 13,570 simple sequence repeats (SSRs) and 28,758 high-quality single nucleotide polymorphisms (SNPs) were also identified. Using a subset of SNPs, the phylogenetic relationships of eight different accessions of Raphanus were inferred. Comprehensive analysis of radish ESTs provided new insights into radish genome evolution and the phylogenetic relationships of different radish accessions. Moreover, the radish EST sequences and the associated SSR and SNP markers described in this study represent a valuable resource for radish functional genomics studies and breeding.

  14. Nucleotide sequences of immunoglobulin eta genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

    Energy Technology Data Exchange (ETDEWEB)

    Sakoyama, Y.; Hong, K.J.; Byun, S.M.; Hisajima, H.; Ueda, S.; Yaoita, Y.; Hayashida, H.; Miyata, T.; Honjo, T.

    1987-02-01

    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (C_eta1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of α1-antitrypsin and β- and δ-globulin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 × 10^-9 substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 ± 2.6 million years and 17.3 ± 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
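
The abstract above gives a calibrated silent substitution rate and the resulting divergence dates, but not the formula connecting them. A minimal sketch, assuming the standard textbook relation T = K / (2r) (K is the silent divergence per site between two lineages, r the rate per site per year, and the factor 2 accounts for both lineages accumulating changes), is shown below. The function names are hypothetical, and whether the authors applied exactly this relation or additional corrections is not stated in the record.

```python
# Sketch only: T = K / (2 r) is an assumed textbook relation, not a formula
# quoted from the abstract. Function names are hypothetical.

RATE = 1.56e-9  # silent substitutions per site per year (value from the abstract)


def divergence_time_years(k_silent: float, rate: float = RATE) -> float:
    """Time since two lineages split, given silent divergence per site.

    Both lineages accumulate substitutions independently, hence the 2.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return k_silent / (2.0 * rate)


def implied_silent_divergence(t_years: float, rate: float = RATE) -> float:
    """Inverse relation: divergence per site implied by a split time."""
    return 2.0 * rate * t_years


# Silent divergences implied by the dates reported in the abstract.
k_chimp = implied_silent_divergence(6.4e6)    # human vs chimpanzee
k_orang = implied_silent_divergence(17.3e6)   # human vs orangutan

print(f"implied K (chimpanzee): {k_chimp:.6f}")
print(f"implied K (orangutan):  {k_orang:.6f}")
print(f"round trip: {divergence_time_years(k_chimp) / 1e6:.1f} million years")
```

Under these assumptions, the reported dates correspond to silent divergences of roughly 2% and 5.4% per site; the uncertainties quoted in the abstract are not modelled here.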

  15. A Comprehensive Phylogenetic Analysis of the Scleractinia (Cnidaria, Anthozoa) Based on Mitochondrial CO1 Sequence Data

    Science.gov (United States)

    Kitahara, Marcelo V.; Cairns, Stephen D.; Stolarski, Jarosław; Blair, David; Miller, David J.

    2010-01-01

    Background: Classical morphological taxonomy places the approximately 1400 recognized species of Scleractinia (hard corals) into 27 families, but many aspects of coral evolution remain unclear despite the application of molecular phylogenetic methods. In part, this may be a consequence of such studies focusing on the reef-building (shallow water and zooxanthellate) Scleractinia, and largely ignoring the large number of deep-sea species. To better understand broad patterns of coral evolution, we generated molecular data for a broad and representative range of deep sea scleractinians collected off New Caledonia and Australia during the last decade, and conducted the most comprehensive molecular phylogenetic analysis to date of the order Scleractinia. Methodology: Partial (595 bp) sequences of the mitochondrial cytochrome oxidase subunit 1 (CO1) gene were determined for 65 deep-sea (azooxanthellate) scleractinians and 11 shallow-water species. These new data were aligned with 158 published sequences, generating a 234 taxon dataset representing 25 of the 27 currently recognized scleractinian families. Principal Findings/Conclusions: There was a striking discrepancy between the taxonomic validity of coral families consisting predominantly of deep-sea or shallow-water species. Most families composed predominantly of deep-sea azooxanthellate species were monophyletic in both maximum likelihood and Bayesian analyses but, by contrast (and consistent with previous studies), most families composed predominantly of shallow-water zooxanthellate taxa were polyphyletic, although Acroporidae, Poritidae, Pocilloporidae, and Fungiidae were exceptions to this general pattern. One factor contributing to this inconsistency may be the greater environmental stability of deep-sea environments, effectively removing taxonomic “noise” contributed by phenotypic plasticity. Our phylogenetic analyses imply that the most basal extant scleractinians are azooxanthellate solitary corals from deep

  16. A comprehensive phylogenetic analysis of the Scleractinia (Cnidaria, Anthozoa) based on mitochondrial CO1 sequence data.

    Directory of Open Access Journals (Sweden)

    Marcelo V Kitahara

    BACKGROUND: Classical morphological taxonomy places the approximately 1400 recognized species of Scleractinia (hard corals) into 27 families, but many aspects of coral evolution remain unclear despite the application of molecular phylogenetic methods. In part, this may be a consequence of such studies focusing on the reef-building (shallow water and zooxanthellate) Scleractinia, and largely ignoring the large number of deep-sea species. To better understand broad patterns of coral evolution, we generated molecular data for a broad and representative range of deep sea scleractinians collected off New Caledonia and Australia during the last decade, and conducted the most comprehensive molecular phylogenetic analysis to date of the order Scleractinia. METHODOLOGY: Partial (595 bp) sequences of the mitochondrial cytochrome oxidase subunit 1 (CO1) gene were determined for 65 deep-sea (azooxanthellate) scleractinians and 11 shallow-water species. These new data were aligned with 158 published sequences, generating a 234 taxon dataset representing 25 of the 27 currently recognized scleractinian families. PRINCIPAL FINDINGS/CONCLUSIONS: There was a striking discrepancy between the taxonomic validity of coral families consisting predominantly of deep-sea or shallow-water species. Most families composed predominantly of deep-sea azooxanthellate species were monophyletic in both maximum likelihood and Bayesian analyses but, by contrast (and consistent with previous studies), most families composed predominantly of shallow-water zooxanthellate taxa were polyphyletic, although Acroporidae, Poritidae, Pocilloporidae, and Fungiidae were exceptions to this general pattern. One factor contributing to this inconsistency may be the greater environmental stability of deep-sea environments, effectively removing taxonomic "noise" contributed by phenotypic plasticity. Our phylogenetic analyses imply that the most basal extant scleractinians are azooxanthellate solitary

  17. A comprehensive phylogenetic analysis of the Scleractinia (Cnidaria, Anthozoa) based on mitochondrial CO1 sequence data.

    Science.gov (United States)

    Kitahara, Marcelo V; Cairns, Stephen D; Stolarski, Jarosław; Blair, David; Miller, David J

    2010-07-08

    Classical morphological taxonomy places the approximately 1400 recognized species of Scleractinia (hard corals) into 27 families, but many aspects of coral evolution remain unclear despite the application of molecular phylogenetic methods. In part, this may be a consequence of such studies focusing on the reef-building (shallow water and zooxanthellate) Scleractinia, and largely ignoring the large number of deep-sea species. To better understand broad patterns of coral evolution, we generated molecular data for a broad and representative range of deep sea scleractinians collected off New Caledonia and Australia during the last decade, and conducted the most comprehensive molecular phylogenetic analysis to date of the order Scleractinia. Partial (595 bp) sequences of the mitochondrial cytochrome oxidase subunit 1 (CO1) gene were determined for 65 deep-sea (azooxanthellate) scleractinians and 11 shallow-water species. These new data were aligned with 158 published sequences, generating a 234 taxon dataset representing 25 of the 27 currently recognized scleractinian families. There was a striking discrepancy between the taxonomic validity of coral families consisting predominantly of deep-sea or shallow-water species. Most families composed predominantly of deep-sea azooxanthellate species were monophyletic in both maximum likelihood and Bayesian analyses but, by contrast (and consistent with previous studies), most families composed predominantly of shallow-water zooxanthellate taxa were polyphyletic, although Acroporidae, Poritidae, Pocilloporidae, and Fungiidae were exceptions to this general pattern. One factor contributing to this inconsistency may be the greater environmental stability of deep-sea environments, effectively removing taxonomic "noise" contributed by phenotypic plasticity. 
Our phylogenetic analyses imply that the most basal extant scleractinians are azooxanthellate solitary corals from deep-water, their divergence predating that of the robust and

  18. General continuous-time Markov model of sequence evolution via insertions/deletions: local alignment probability computation.

    Science.gov (United States)

    Ezawa, Kiyoshi

    2016-09-27

    Insertions and deletions (indels) account for more nucleotide differences between two related DNA sequences than substitutions do, and thus it is imperative to develop a method to reliably calculate the occurrence probabilities of sequence alignments via evolutionary processes on an entire sequence. Previously, we presented a perturbative formulation that facilitates the ab initio calculation of alignment probabilities under a continuous-time Markov model, which describes the stochastic evolution of an entire sequence via indels with quite general rate parameters. And we demonstrated that, under some conditions, the ab initio probability of an alignment can be factorized into the product of an overall factor and contributions from regions (or local alignments) delimited by gapless columns. Here, using our formulation, we attempt to approximately calculate the probabilities of local alignments under space-homogeneous cases. First, for each of all types of local pairwise alignments (PWAs) and some typical types of local multiple sequence alignments (MSAs), we numerically computed the total contribution from all parsimonious indel histories and that from all next-parsimonious histories, and compared them. Second, for some common types of local PWAs, we derived two integral equation systems that can be numerically solved to give practically exact solutions. We compared the total parsimonious contribution with the practically exact solution for each such local PWA. Third, we developed an algorithm that calculates the first-approximate MSA probability by multiplying total parsimonious contributions from all local MSAs. Then we compared the first-approximate probability of each local MSA with its absolute frequency in the MSAs created via a genuine sequence evolution simulator, Dawg. In all these analyses, the total parsimonious contributions approximated the multiplication factors fairly well, as long as gap sizes and branch lengths are at most moderate. 
Examination of
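
The abstract above describes a first-approximate alignment probability obtained by multiplying an overall factor by the contributions of local alignments delimited by gapless columns. A minimal sketch of that product, accumulated in log space to avoid floating-point underflow for long alignments, might look like the following. The function name and the numeric inputs are hypothetical placeholders; computing the individual contributions from indel histories is the substance of the paper and is not attempted here.

```python
import math
from typing import Sequence

# Sketch only: combines precomputed factors; it does not compute the
# per-region contributions themselves. All numbers are illustrative.


def log_alignment_probability(
    overall_factor: float, local_contributions: Sequence[float]
) -> float:
    """Log of overall_factor * product(local_contributions).

    Summing logarithms keeps the result representable even when many
    small factors are multiplied together.
    """
    if overall_factor <= 0 or any(c <= 0 for c in local_contributions):
        raise ValueError("all factors must be strictly positive")
    return math.log(overall_factor) + sum(math.log(c) for c in local_contributions)


# Hypothetical values: one overall factor and three local regions.
overall = 0.9
local = [0.05, 0.2, 0.01]

log_p = log_alignment_probability(overall, local)
print(f"log probability: {log_p:.4f}")
print(f"probability:     {math.exp(log_p):.2e}")
```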

  19. The map-based genome sequence of Spirodela polyrhiza aligned with its chromosomes, a reference for karyotype evolution.

    Science.gov (United States)

    Cao, Hieu Xuan; Vu, Giang Thi Ha; Wang, Wenqin; Appenroth, Klaus J; Messing, Joachim; Schubert, Ingo

    2016-01-01

    Duckweeds are aquatic monocotyledonous plants of potential economic interest with fast vegetative propagation, comprising 37 species with variable genome sizes (0.158-1.88 Gbp). The genomic sequence of Spirodela polyrhiza, the smallest and the most ancient duckweed genome, needs to be aligned to its chromosomes as a reference and prerequisite to study the genome and karyotype evolution of other duckweed species. We selected physically mapped bacterial artificial chromosomes (BACs) containing Spirodela DNA inserts with little or no repetitive elements as probes for multicolor fluorescence in situ hybridization (mcFISH), using an optimized BAC pooling strategy, to validate its physical map and correlate it with its chromosome complement. By consecutive mcFISH analyses, we assigned the originally assembled 32 pseudomolecules (supercontigs) of the genomic sequences to the 20 chromosomes of S. polyrhiza. A Spirodela cytogenetic map containing 96 BAC markers with an average distance of 0.89 Mbp was constructed. Using a cocktail of 41 BACs in three colors, all chromosome pairs could be individualized simultaneously. Seven ancestral blocks emerged from duplicated chromosome segments of 19 Spirodela chromosomes. The chromosomally integrated genome of S. polyrhiza and the established prerequisites for comparative chromosome painting enable future studies on the chromosome homoeology and karyotype evolution of duckweed species. © 2015 IPK Gatersleben. New Phytologist © 2015 New Phytologist Trust.

  20. Seismic and sequence stratigraphy of the central western continental margin of India: late-Quaternary evolution

    Digital Repository Service at National Institute of Oceanography (India)

    Karisiddaiah, S.M.; Veerayya, M.; Vora, K.H.

    been applied to the study of Quaternary deposits through the use of high-resolution seismic-reflection profiles and samples (Shideler et al., 1984; Morton and Price, 1987; Boyd et al., 1989; Hernandez-Molina et al., 1994, 1996, 2000; Somoza 0025...; late-Quaternary; seismic and sequence stratigraphy; carbonate to siliciclastic shelf 1. Introduction The conceptual model of sequence stratigraphy (Mitchum et al., 1977; Vail et al., 1977; van Wagoner et al., 1988; Schlager, 1991) has increasingly...

  1. The Cenozoic geological evolution of the Central and Northern North Sea based on seismic sequence stratigraphy

    Energy Technology Data Exchange (ETDEWEB)

    Jordt, Henrik

    1996-03-01

    This thesis represents scientific results from seismic sequence stratigraphic investigations. These investigations and results are integrated into an ongoing mineralogical study of the Cenozoic deposits. The main results from this mineralogical study are presented and discussed. The seismic investigations have provided boundary conditions for a forward modelling study of the Cenozoic depositional history. Results from the forward modelling are presented as they emphasise the influence of tectonics on sequence development. The tectonic motions described were important for the formation of the large oil and gas fields in the North Sea.

  2. Streaming Support for Data Intensive Cloud-Based Sequence Analysis

    Science.gov (United States)

    Issa, Shadi A.; Kienzler, Romeo; El-Kalioby, Mohamed; Tonellato, Peter J.; Wall, Dennis; Bruggmann, Rémy; Abouelhoda, Mohamed

    2013-01-01

    Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of “resources-on-demand” and “pay-as-you-go”, scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client's site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation. PMID:23710461
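
The scheme described above relies on two properties: data can be processed while it is still arriving, and each sequence can be handled independently. A minimal sketch of that idea, assuming simple four-line FASTQ records arriving as arbitrary byte chunks, is given below. This is not the elastream package or its API; `iter_fastq_records` and `gc_fraction` are hypothetical names, and the per-read GC computation is only a stand-in for a real analysis step.

```python
from typing import Iterable, Iterator, List, Tuple

# Sketch only: illustrates processing records as chunks arrive.
# Assumes 4-line FASTQ records (no multi-line sequences).


def iter_fastq_records(chunks: Iterable[bytes]) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) as soon as a record is complete."""
    buffer = b""
    pending: List[str] = []
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is a set of complete lines;
        # the remainder stays buffered until the next chunk arrives.
        *complete, buffer = buffer.split(b"\n")
        for raw in complete:
            pending.append(raw.decode("ascii"))
            if len(pending) == 4:
                header, seq, _plus, qual = pending
                yield header[1:], seq, qual
                pending = []
    if buffer:  # final line without a trailing newline
        pending.append(buffer.decode("ascii"))
        if len(pending) == 4:
            header, seq, _plus, qual = pending
            yield header[1:], seq, qual


def gc_fraction(seq: str) -> float:
    """Independent per-read computation used as a placeholder analysis."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Chunk boundaries deliberately fall in the middle of a record.
incoming = [b"@r1\nACGT\n+\nII", b"II\n@r2\nGGCC\n+\nIIII\n"]
results = [(name, gc_fraction(seq)) for name, seq, _ in iter_fastq_records(incoming)]
print(results)
```

Because the generator yields each record as soon as its fourth line is complete, downstream work can overlap with the transfer rather than waiting for the whole file.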

  3. Streaming Support for Data Intensive Cloud-Based Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Shadi A. Issa

    2013-01-01

    Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of “resources-on-demand” and “pay-as-you-go”, scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client’s site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation.

  4. Streaming support for data intensive cloud-based sequence analysis.

    Science.gov (United States)

    Issa, Shadi A; Kienzler, Romeo; El-Kalioby, Mohamed; Tonellato, Peter J; Wall, Dennis; Bruggmann, Rémy; Abouelhoda, Mohamed

    2013-01-01

    Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of "resources-on-demand" and "pay-as-you-go", scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client's site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation.

  5. Next-generation sequence analysis of cancer xenograft models.

    Directory of Open Access Journals (Sweden)

    Fernando J Rossello

    Next-generation sequencing (NGS) studies in cancer are limited by the amount, quality and purity of tissue samples. In this situation, primary xenografts have proven useful preclinical models. However, the presence of mouse-derived stromal cells represents a technical challenge to their use in NGS studies. We examined this problem in an established primary xenograft model of small cell lung cancer (SCLC), a malignancy often diagnosed from small biopsy or needle aspirate samples. Using an in silico strategy that assigns reads according to species-of-origin, we prospectively compared NGS data from primary xenograft models with matched cell lines and with published datasets. We show here that low-coverage whole-genome analysis demonstrated remarkable concordance between published genome data and internal controls, despite the presence of mouse genomic DNA. Exome capture sequencing revealed that this enrichment procedure was highly species-specific, with less than 4% of reads aligning to the mouse genome. Human-specific expression profiling with RNA-Seq replicated array-based gene expression experiments, whereas mouse-specific transcript profiles correlated with published datasets from human cancer stroma. We conclude that primary xenografts represent a useful platform for complex NGS analysis in cancer research for tumours with limited sample resources, or those with prominent stromal cell populations.
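
The abstract mentions an in silico strategy that assigns reads by species of origin but does not describe the rule used. One common way to approach this, shown here purely as an illustrative assumption, is to align each read to both the human and mouse references and keep the better-scoring assignment, setting ties aside as ambiguous. The function name, the `margin` parameter, and the scores below are hypothetical and are not taken from the study.

```python
from typing import Dict, Optional, Tuple

# Sketch only: a score-comparison rule assumed for illustration; the
# study's actual assignment procedure is not given in the abstract.


def assign_species(
    human_score: Optional[int], mouse_score: Optional[int], margin: int = 0
) -> str:
    """Classify a read from its best alignment score against each genome.

    None means the read did not align to that reference at all.
    """
    if human_score is None and mouse_score is None:
        return "unaligned"
    if mouse_score is None:
        return "human"
    if human_score is None:
        return "mouse"
    if human_score - mouse_score > margin:
        return "human"
    if mouse_score - human_score > margin:
        return "mouse"
    return "ambiguous"


# Hypothetical (human_score, mouse_score) pairs for four reads.
reads: Dict[str, Tuple[Optional[int], Optional[int]]] = {
    "read_a": (60, 12),
    "read_b": (8, 55),
    "read_c": (40, 40),
    "read_d": (None, None),
}

assignments = {name: assign_species(h, m) for name, (h, m) in reads.items()}
print(assignments)
```

A positive `margin` would make the rule more conservative by sending near-ties to the ambiguous bin, which may matter for genes that are highly conserved between the two species.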

  6. Cambro-ordovician sea-level fluctuations and sequence boundaries: The missing record and the evolution of new taxa

    Science.gov (United States)

    Lehnert, O.; Miller, J.F.; Leslie, Stephen A.; Repetski, J.E.; Ethington, Raymond L.

    2005-01-01

    The evolution of early Palaeozoic conodont faunas shows a clear connection to sea-level changes. One way that this connection manifests itself is that thick successions of carbonates are missing beneath major sequence boundaries due to karstification and erosion. From this observation arises the question of how many taxa have been lost from different conodont lineages in these incomplete successions. Although many taxa suffered extinction due to the environmental stresses associated with falling sea-levels, some must have survived in these extreme conditions. The number of taxa missing in the early Palaeozoic tropics always will be unclear, but it will be even more difficult to evaluate the missing record in detrital successions of higher latitudes. A common pattern in the evolution of Cambrian-Ordovician conodont lineages is appearances of new species at sea-level rises and disappearances at sea-level drops. This simple picture can be complicated by intervals that consistently have no representatives of a particular lineage, even after extensive sampling of the most complete sections. Presumably the lineages survived in undocumented refugia. In this paper, we give examples of evolution in Cambrian-Ordovician shallow-marine conodont faunas and highlight problems of undiscovered or truly missing segments of lineages. © The Palaeontological Association.

  7. MultiPipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences

    OpenAIRE

    Schwartz, Scott; Elnitski, Laura; Li, Mei; Weirauch, Matt; Riemer, Cathy; Smit, Arian; Green, Eric D.; Hardison, Ross C.; Miller, Webb

    2003-01-01

    Analysis of multiple sequence alignments can generate important, testable hypotheses about the phylogenetic history and cellular function of genomic sequences. We describe the MultiPipMaker server, which aligns multiple, long genomic DNA sequences quickly and with good sensitivity (available at http://bio.cse.psu.edu/ since May 2001). Alignments are computed between a contiguous reference sequence and one or more secondary sequences, which can be finished or draft sequence. The outputs includ...

  8. Analysis of the Science and Technology Preservice Teachers' Opinions on Teaching Evolution and Theory of Evolution

    Science.gov (United States)

    Töman, Ufuk; Karatas, Faik Özgür; Çimer, Sabiha Odabasi

    2014-01-01

    In this study, we investigate science and technology teachers' opinions about the theory of evolution and the teaching of evolution. This is a descriptive study. Open-ended questions were used to…

  9. Delayed Gratification Habitable Zones: When Deep Outer Solar System Regions Become Balmy During Post-Main Sequence Stellar Evolution

    Science.gov (United States)

    Stern, S. Alan

    2003-06-01

    Like all low- and moderate-mass stars, the Sun will burn as a red giant during its later evolution, generating of solar luminosities for some tens of millions of years. During this post-main sequence phase, the habitable (i.e., liquid water) thermal zone of our Solar System will lie in the region where Triton, Pluto-Charon, and Kuiper Belt objects orbit. Compared with the 1 AU habitable zone where Earth resides, this "delayed gratification habitable zone" (DGHZ) will enjoy a far less biologically hazardous environment - with lower harmful radiation levels from the Sun, and a far less destructive collisional environment. Objects like Triton, Pluto-Charon, and Kuiper Belt objects, which are known to be rich in both water and organics, will then become possible sites for biochemical and perhaps even biological evolution. The Kuiper Belt, with >10^5 objects ≥50 km in radius and more than three times the combined surface area of the four terrestrial planets, provides numerous sites for possible evolution once the Sun's DGHZ reaches it. The Sun's DGHZ might be thought to only be of academic interest owing to its great separation from us in time. However, ~10^9 Milky Way stars burn as luminous red giants today. Thus, if icy-organic objects are common in the 20-50 AU zones of these stars, as they are in our Solar System (and as inferred in numerous main sequence stellar disk systems), then DGHZs may form a niche type of habitable zone that is likely to be numerically common in the Galaxy.
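
The abstract places the red-giant habitable zone at roughly 20-50 AU without showing how that follows from the star's luminosity. A rough sketch, assuming only inverse-square dilution of starlight (so the distance receiving Earth-like flux scales as the square root of luminosity), is given below. This scaling is an assumption introduced for illustration, it ignores albedo, atmospheres, and spectral changes, and the function names are hypothetical.

```python
import math

# Sketch only: simple flux scaling d = d_ref * sqrt(L / L_sun).
# This relation is assumed here, not quoted from the abstract.


def habitable_zone_distance_au(luminosity_solar: float, reference_au: float = 1.0) -> float:
    """Distance receiving the same stellar flux as reference_au does today."""
    if luminosity_solar <= 0:
        raise ValueError("luminosity must be positive")
    return reference_au * math.sqrt(luminosity_solar)


def luminosity_for_distance(distance_au: float, reference_au: float = 1.0) -> float:
    """Luminosity (in solar units) that moves Earth-like flux out to distance_au."""
    return (distance_au / reference_au) ** 2


# Luminosities that would place Earth-like flux at the edges of the
# 20-50 AU zone mentioned in the abstract.
inner_l = luminosity_for_distance(20.0)
outer_l = luminosity_for_distance(50.0)
print(f"20 AU needs about {inner_l:.0f} L_sun; 50 AU needs about {outer_l:.0f} L_sun")
print(f"900 L_sun puts Earth-like flux at {habitable_zone_distance_au(900.0):.0f} AU")
```

Under this simple scaling, the 20-50 AU region corresponds to luminosities of a few hundred to a few thousand times the present Sun.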

  10. Interference with histidyl-tRNA synthetase by a CRISPR spacer sequence as a factor in the evolution of Pelobacter carbinolicus

    Science.gov (United States)

    2010-01-01

    Background Pelobacter carbinolicus, a bacterium of the family Geobacteraceae, cannot reduce Fe(III) directly or produce electricity like its relatives. How P. carbinolicus evolved is an intriguing problem. The genome of P. carbinolicus contains clustered regularly interspaced short palindromic repeats (CRISPR) separated by unique spacer sequences, which recent studies have shown to produce RNA molecules that interfere with genes containing identical sequences. Results CRISPR spacer #1, which matches a sequence within hisS, the histidyl-tRNA synthetase gene of P. carbinolicus, was shown to be expressed. Phylogenetic analysis and genetics demonstrated that a gene paralogous to hisS in the genomes of Geobacteraceae is unlikely to compensate for interference with hisS. Spacer #1 inhibited growth of a transgenic strain of Geobacter sulfurreducens in which the native hisS was replaced with that of P. carbinolicus. The prediction that interference with hisS would result in an attenuated histidyl-tRNA pool insufficient for translation of proteins with multiple closely spaced histidines, predisposing them to mutation and elimination during evolution, was investigated by comparative genomics of P. carbinolicus and related species. Several ancestral genes with high histidine demand have been lost or modified in the P. carbinolicus lineage, providing an explanation for its physiological differences from other Geobacteraceae. Conclusions The disappearance of multiheme c-type cytochromes and other genes typical of a metal-respiring ancestor from the P. carbinolicus lineage may be the consequence of spacer #1 interfering with hisS, a condition that can be reproduced in a heterologous host. This is the first successful co-introduction of an active CRISPR spacer and its target in the same cell, the first application of a chimeric CRISPR construct consisting of a spacer from one species in the context of repeats of another species, and the first report of a potential impact of

  11. Interference with histidyl-tRNA synthetase by a CRISPR spacer sequence as a factor in the evolution of Pelobacter carbinolicus

    Directory of Open Access Journals (Sweden)

    Lovley Derek R

    2010-07-01

    Full Text Available Abstract Background Pelobacter carbinolicus, a bacterium of the family Geobacteraceae, cannot reduce Fe(III) directly or produce electricity like its relatives. How P. carbinolicus evolved is an intriguing problem. The genome of P. carbinolicus contains clustered regularly interspaced short palindromic repeats (CRISPR) separated by unique spacer sequences, which recent studies have shown to produce RNA molecules that interfere with genes containing identical sequences. Results CRISPR spacer #1, which matches a sequence within hisS, the histidyl-tRNA synthetase gene of P. carbinolicus, was shown to be expressed. Phylogenetic analysis and genetics demonstrated that a gene paralogous to hisS in the genomes of Geobacteraceae is unlikely to compensate for interference with hisS. Spacer #1 inhibited growth of a transgenic strain of Geobacter sulfurreducens in which the native hisS was replaced with that of P. carbinolicus. The prediction that interference with hisS would result in an attenuated histidyl-tRNA pool insufficient for translation of proteins with multiple closely spaced histidines, predisposing them to mutation and elimination during evolution, was investigated by comparative genomics of P. carbinolicus and related species. Several ancestral genes with high histidine demand have been lost or modified in the P. carbinolicus lineage, providing an explanation for its physiological differences from other Geobacteraceae. Conclusions The disappearance of multiheme c-type cytochromes and other genes typical of a metal-respiring ancestor from the P. carbinolicus lineage may be the consequence of spacer #1 interfering with hisS, a condition that can be reproduced in a heterologous host. This is the first successful co-introduction of an active CRISPR spacer and its target in the same cell, the first application of a chimeric CRISPR construct consisting of a spacer from one species in the context of repeats of another species, and the first report of

  12. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    Nucleotides and deduced amino acid sequence comparison of six isolates was performed with each other and with two HCV genotype 3a type examples reported from Japan. Phylogenetic tree of HCV core sequences was constructed using CLC software. Nucleotides sequence comparison showed that our sequences ...

  13. Fault transmissibility in clastic-argillaceous sequences controlled by clay smear evolution

    NARCIS (Netherlands)

    Heege, J.H. ter; Giger, S.B.; Clennell, M.B.; Ciftci, N.B.

    2012-01-01

    The mechanical entrainment of clays and shales in fault zones of sedimentary sequences can exert fundamental control on fault-permeability. To estimate the influence of entrained clay bands on fluid flow, semi-empirical fault-seal algorithms are being used for hydrocarbon exploration (e.g. Yielding

  14. Analysis on evolution and research focus in psychiatry field.

    Science.gov (United States)

    Wu, Ying; Duan, Zhiguang

    2015-05-07

    With the dramatic rise in mental disorders and mental illnesses, psychiatry has become one of the fastest growing clinical medical disciplines. This has led to a rise in the number of scientific research papers being published in this field. We selected research papers in ten psychiatric journals that were published from 1983 to 2012. These ten journals were those with the top Impact Factor (IF) as indicated by the Science Citation Index Expanded (SCI-Expanded). We utilized information visualization software (CiteSpace) to conduct co-citation and hierarchical clustering analysis to map knowledge domains to determine the evolution and the foci of research in this field. In the evolution of the field of psychiatry, there were four stages identified. The results of hierarchical clustering analysis revealed that the research foci in the psychiatric field were primarily studies of child and adolescent psychiatry, diagnostic and classification criteria, brain imaging and molecular genetics. The results provide information about the evolution and the foci of the research in the field of psychiatry. This information can help researchers determine the direction of the research in the field of psychiatry. Moreover, this research provides reasonable suggestions to guide research in the field of psychiatry and provides scientific evidence to aid in the effective prevention and treatment of mental disorders.

  15. A novel method for comparative analysis of DNA sequences by Ramanujan-Fourier transform.

    Science.gov (United States)

    Yin, Changchuan; Yin, Xuemeng E; Wang, Jiasong

    2014-12-01

    Alignment-free sequence analysis approaches provide important alternatives to multiple sequence alignment (MSA) in biological sequence analysis because alignment-free approaches have low computational complexity and do not depend on a high level of sequence identity. However, most of the existing alignment-free methods do not employ the full information content of sequences and thus cannot accurately reveal similarities and differences among DNA sequences. We present a novel alignment-free computational method for sequence analysis based on the Ramanujan-Fourier transform (RFT), in which complete information of DNA sequences is retained. We represent DNA sequences as four binary indicator sequences and apply RFT to the indicator sequences to convert them into the frequency domain. The Euclidean distance between the complete RFT coefficients of DNA sequences is used as the similarity measure. To address the different lengths of RFT coefficients in Euclidean space, we pad zeros to short DNA binary sequences so that the binary sequences equal the longest length in the comparison sequence data. Thus, the DNA sequences are compared in the same dimensional frequency space without information loss. We demonstrate the usefulness of the proposed method by presenting experimental results on hierarchical clustering of genes and genomes. The proposed method opens a new channel to biological sequence analysis, classification, and structural module identification.
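
The pipeline described in this abstract (four binary indicator sequences, zero-padding to a common length, RFT, Euclidean distance) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes one common finite-length form of the RFT coefficient, X(q) = (1 / (phi(q) * N)) * sum over n of x(n) * c_q(n), where c_q(n) is the Ramanujan sum and phi is Euler's totient, and all function names are hypothetical.

```python
import math

BASES = "ACGT"


def ramanujan_sum(q, n):
    """c_q(n): sum of cos(2*pi*a*n/q) over 1 <= a <= q with gcd(a, q) == 1."""
    total = sum(math.cos(2 * math.pi * a * n / q)
                for a in range(1, q + 1) if math.gcd(a, q) == 1)
    return round(total)  # Ramanujan sums are always integers


def totient(q):
    """Euler's totient: how many 1 <= a <= q are coprime to q."""
    return sum(1 for a in range(1, q + 1) if math.gcd(a, q) == 1)


def indicator_sequences(seq, length):
    """Four 0/1 sequences (one per base), zero-padded to `length`."""
    seq = seq.upper()
    pad = [0] * (length - len(seq))
    return [[1 if ch == base else 0 for ch in seq] + pad for base in BASES]


def rft(x, q_max):
    """Finite-length RFT coefficients X(1)..X(q_max) of sequence x (assumed form)."""
    n_total = len(x)
    return [
        sum(x[n - 1] * ramanujan_sum(q, n) for n in range(1, n_total + 1))
        / (totient(q) * n_total)
        for q in range(1, q_max + 1)
    ]


def rft_distance(seq_a, seq_b):
    """Euclidean distance between the concatenated RFT coefficients of two sequences."""
    length = max(len(seq_a), len(seq_b))  # shorter sequence is zero-padded
    vec_a = [c for x in indicator_sequences(seq_a, length) for c in rft(x, length)]
    vec_b = [c for x in indicator_sequences(seq_b, length) for c in rft(x, length)]
    return math.dist(vec_a, vec_b)


if __name__ == "__main__":
    print(rft_distance("ACGTACGT", "ACGTTCGT"))
    print(rft_distance("ACGTACGT", "ACG"))
```

The direct cosine sum keeps the sketch short but scales poorly with sequence length; a practical implementation would precompute the Ramanujan sums once per q.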

  16. Tracking the evolution of sex chromosome systems in Melanoplinae grasshoppers through chromosomal mapping of repetitive DNA sequences.

    Science.gov (United States)

    Palacios-Gimenez, Octavio M; Castillo, Elio R; Martí, Dardo A; Cabral-de-Mello, Diogo C

    2013-08-09

    The accumulation of repetitive DNA during sex chromosome differentiation is a common feature of many eukaryotes and becomes more evident after recombination has been restricted or abolished. The accumulated repetitive sequences include multigene families, microsatellites, satellite DNAs and mobile elements, all of which are important for the structural remodeling of heterochromatin. In grasshoppers, derived sex chromosome systems, such as neo-XY♂/XX♀ and neo-X1X2Y♂/X1X1X2X2♀, are frequently observed in the Melanoplinae subfamily. However, no studies concerning the evolution of sex chromosomes in Melanoplinae have addressed the role of the repetitive DNA sequences. To further investigate the evolution of sex chromosomes in grasshoppers, we used classical cytogenetic and FISH analyses to examine the repetitive DNA sequences in six phylogenetically related Melanoplinae species with X0♂/XX♀, neo-XY♂/XX♀ and neo-X1X2Y♂/X1X1X2X2♀ sex chromosome systems. Our data indicate a non-spreading of heterochromatic blocks and pool of repetitive DNAs (C0t-1 DNA) in the sex chromosomes; however, the spreading of multigene families among the neo-sex chromosomes of Eurotettix and Dichromatos was remarkable, particularly for 5S rDNA. In autosomes, FISH mapping of multigene families revealed distinct patterns of chromosomal organization at the intra- and intergenomic levels. These results suggest a common origin and subsequent differential accumulation of repetitive DNAs in the sex chromosomes of Dichromatos and an independent origin of the sex chromosomes of the neo-XY and neo-X1X2Y systems. Our data indicate a possible role for repetitive DNAs in the diversification of sex chromosome systems in grasshoppers.

  17. Reverse transcriptase domain sequences from tree peony (Paeonia suffruticosa) long terminal repeat retrotransposons: sequence characterization and phylogenetic analysis.

    Science.gov (United States)

    Guo, Da-Long; Hou, Xiao-Gai; Jia, Tian

    2014-05-04

    Tree peony is an important horticultural plant worldwide of great ornamental and medicinal value. Long terminal repeat retrotransposons (LTR-retrotransposons) are the major components of most plant genomes and can substantially impact the genome in many ways. It is therefore crucial to understand their sequence characteristics, genetic distribution and transcriptional activity; however, no information about them is available in tree peony. In this study, Ty1-copia-like reverse transcriptase sequences were amplified from tree peony genomic DNA by polymerase chain reaction (PCR) with degenerate oligonucleotide primers corresponding to highly conserved domains of the Ty1-copia-like retrotransposons. PCR fragments of roughly 270 bp were isolated and cloned, and 33 sequences were obtained. According to alignment and phylogenetic analysis, all sequences were divided into six families. The observed difference in the degree of nucleotide sequence similarity indicates a high level of sequence heterogeneity among these clones. Most of these sequences have a frame shift, a stop codon, or both. Dot-blot analysis revealed distribution of these sequences in all the studied tree peony species. However, different hybridization signals were detected among them, which is in agreement with previous systematics studies. Reverse transcriptase PCR (RT-PCR) indicated that Ty1-copia retrotransposons in tree peony were transcriptionally inactive. The results provide basic genetic and evolutionary information on the tree peony genome, and will provide valuable information for the further utilization of retrotransposons in tree peony.

  18. Insights into a dinoflagellate genome through expressed sequence tag analysis

    Directory of Open Access Journals (Sweden)

    Bonaldo Maria F

    2005-05-01

    Full Text Available Abstract Background Dinoflagellates are important marine primary producers and grazers and cause toxic "red tides". These taxa are characterized by many unique features such as immense genomes, the absence of nucleosomes, and photosynthetic organelles (plastids) that have been gained and lost multiple times. We generated EST sequences from non-normalized and normalized cDNA libraries from a culture of the toxic species Alexandrium tamarense to elucidate dinoflagellate evolution. Previous analyses of these data have clarified plastid origin and here we study the gene content, annotate the ESTs, and analyze the genes that are putatively involved in DNA packaging. Results Approximately 20% of the 6,723 unique (11,171 total) 3'-read ESTs could be annotated using Blast searches against GenBank. Several putative dinoflagellate-specific mRNAs were identified, including one novel plastid protein. Dinoflagellate genes, similar to other eukaryotes, have a high GC-content that is reflected in the amino acid codon usage. Highly represented transcripts include histone-like (HLP) and luciferin binding proteins and several genes occur in families that encode nearly identical proteins. We also identified rare transcripts encoding a predicted protein highly similar to histone H2A.X. We speculate this histone may be retained for its role in DNA double-strand break repair. Conclusion This is the most extensive collection to date of ESTs from a toxic dinoflagellate. These data will be instrumental to future research to understand the unique and complex cell biology of these organisms and for potentially identifying the genes involved in toxin production.

  19. Radar image sequence analysis of inhomogeneous water surfaces

    Science.gov (United States)

    Seemann, Joerg; Senet, Christian M.; Dankert, Heiko; Hatten, Helge; Ziemer, Friedwart

    1999-10-01

    The radar backscatter from the ocean surface, called sea clutter, is modulated by the surface wave field. A method was developed to estimate the near-surface current, the water depth and calibrated surface wave spectra from nautical radar image sequences. The algorithm is based on the three-dimensional Fast Fourier Transformation (FFT) of the spatio-temporal sea clutter pattern in the wavenumber-frequency domain. The dispersion relation is used to define a filter to separate the spectral signal of the imaged waves from the background noise component caused by speckle noise. The signal-to-noise ratio (SNR) contains information about the significant wave height. The method has been proved to be reliable for the analysis of homogeneous water surfaces in offshore installations. Radar images are inhomogeneous because of the dependency of the image transfer function (ITF) on the azimuth angle between the wave propagation and the antenna viewing direction. The inhomogeneity of radar imaging is analyzed using image sequences of a homogeneous deep-water surface sampled by a ship-borne radar. Changing water depths in shallow-water regions induce horizontal gradients of the tidal current. Wave refraction occurs due to the spatial variability of the current and water depth. These areas cannot be investigated with the standard method. A new method, based on local wavenumber estimation with the multiple-signal classification (MUSIC) algorithm, is outlined. The MUSIC algorithm provides superior wavenumber resolution on local spatial scales. First results, retrieved from a radar image sequence taken from an installation at a coastal site, are presented.
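
The dispersion-relation filter mentioned in this abstract can be illustrated with a small sketch. It assumes the standard linear dispersion relation for surface gravity waves with a Doppler shift from a uniform near-surface current, omega = sqrt(g * k * tanh(k * d)) + k . U; the abstract does not state the exact form used, and the function names and the tolerance parameter are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def intrinsic_frequency(k, depth):
    """Angular frequency (rad/s) of a linear gravity wave with wavenumber k (rad/m)."""
    return math.sqrt(G * k * math.tanh(k * depth))


def doppler_shifted_frequency(kx, ky, depth, ux=0.0, uy=0.0):
    """Frequency seen by a fixed observer when a current (ux, uy) advects the waves."""
    k = math.hypot(kx, ky)
    return intrinsic_frequency(k, depth) + kx * ux + ky * uy


def on_dispersion_shell(kx, ky, omega, depth, ux=0.0, uy=0.0, tol=0.05):
    """True if a spectral bin (kx, ky, omega) lies within `tol` rad/s of the shell.

    Bins that pass are treated as wave signal; the rest as background noise.
    """
    return abs(omega - doppler_shifted_frequency(kx, ky, depth, ux, uy)) <= tol


if __name__ == "__main__":
    depth, current = 20.0, (0.5, 0.0)  # illustrative values only
    omega = doppler_shifted_frequency(0.1, 0.0, depth, *current)
    print(on_dispersion_shell(0.1, 0.0, omega, depth, *current))
    print(on_dispersion_shell(0.1, 0.0, omega + 0.5, depth, *current))
```

In a full analysis this mask would be applied to every bin of the three-dimensional FFT spectrum, with current and depth chosen to best fit the observed spectral energy.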

  20. First complete mitochondrial genome sequence from a box jellyfish reveals a highly fragmented linear architecture and insights into telomere evolution.

    Science.gov (United States)

    Smith, David Roy; Kayal, Ehsan; Yanagihara, Angel A; Collins, Allen G; Pirro, Stacy; Keeling, Patrick J

    2012-01-01

    Animal mitochondrial DNAs (mtDNAs) are typically single circular chromosomes, with the exception of those from medusozoan cnidarians (jellyfish and hydroids), which are linear and sometimes fragmented. Most medusozoans have linear monomeric or linear bipartite mitochondrial genomes, but preliminary data have suggested that box jellyfish (cubozoans) have mtDNAs that consist of many linear chromosomes. Here, we present the complete mtDNA sequence from the winged box jellyfish Alatina moseri (the first from a cubozoan). This genome contains unprecedented levels of fragmentation: 18 unique genes distributed over eight 2.9- to 4.6-kb linear chromosomes. The telomeres are identical within and between chromosomes, and recombination between subtelomeric sequences has led to many genes initiating or terminating with sequences from other genes (the most extreme case being 150 nt of a ribosomal RNA containing the 5' end of nad2), providing evidence for a gene conversion-based model of telomere evolution. The silent-site nucleotide variation within the A. moseri mtDNA is among the highest observed from a eukaryotic genome and may be associated with elevated rates of recombination.

  1. Sequencing of the sea lamprey (Petromyzon marinus) genome provides insights into vertebrate evolution

    Science.gov (United States)

    Smith, Jeramiah J; Kuraku, Shigehiro; Holt, Carson; Sauka-Spengler, Tatjana; Jiang, Ning; Campbell, Michael S; Yandell, Mark D; Manousaki, Tereza; Meyer, Axel; Bloom, Ona E; Morgan, Jennifer R; Buxbaum, Joseph D; Sachidanandam, Ravi; Sims, Carrie; Garruss, Alexander S; Cook, Malcolm; Krumlauf, Robb; Wiedemann, Leanne M; Sower, Stacia A; Decatur, Wayne A; Hall, Jeffrey A; Amemiya, Chris T; Saha, Nil R; Buckley, Katherine M; Rast, Jonathan P; Das, Sabyasachi; Hirano, Masayuki; McCurley, Nathanael; Guo, Peng; Rohner, Nicolas; Tabin, Clifford J; Piccinelli, Paul; Elgar, Greg; Ruffier, Magali; Aken, Bronwen L; Searle, Stephen MJ; Muffato, Matthieu; Pignatelli, Miguel; Herrero, Javier; Jones, Matthew; Brown, C Titus; Chung-Davidson, Yu-Wen; Nanlohy, Kaben G; Libants, Scot V; Yeh, Chu-Yin; McCauley, David W; Langeland, James A; Pancer, Zeev; Fritzsch, Bernd; de Jong, Pieter J; Zhu, Baoli; Fulton, Lucinda L; Theising, Brenda; Flicek, Paul; Bronner, Marianne E; Warren, Wesley C; Clifton, Sandra W; Wilson, Richard K; Li, Weiming

    2013-01-01

    Lampreys are representatives of an ancient vertebrate lineage that diverged from our own ~500 million years ago. By virtue of this deeply shared ancestry, the sea lamprey (P. marinus) genome is uniquely poised to provide insight into the ancestry of vertebrate genomes and the underlying principles of vertebrate biology. Here, we present the first lamprey whole-genome sequence and assembly. We note challenges faced owing to its high content of repetitive elements and GC bases, as well as the absence of broad-scale sequence information from closely related species. Analyses of the assembly indicate that two whole-genome duplications likely occurred before the divergence of ancestral lamprey and gnathostome lineages. Moreover, the results help define key evolutionary events within vertebrate lineages, including the origin of myelin-associated proteins and the development of appendages. The lamprey genome provides an important resource for reconstructing vertebrate origins and the evolutionary events that have shaped the genomes of extant organisms. PMID:23435085

  2. Sequence analysis of a canine parvovirus isolated from a red panda (Ailurus fulgens) in China.

    Science.gov (United States)

    Qin, Qin; Loeffler, I Kati; Li, Ming; Tian, Kegong; Wei, Fuwen

    2007-06-01

    Canine parvovirus (CPV) was first recognized in the late 1970s in dogs and has mutated and spread throughout the world in canid and felid species since then. In this study, a novel CPV was isolated from the endangered red panda (Ailurus fulgens) in China. Nucleotide and phylogenetic analysis of the capsid protein VP2 gene classified the red panda parvovirus (RPPV) as a CPV-2a type. Substitution of Val for Gly at the conserved 300 residue in RPPV presents an unusual variation in the CPV-2a amino acid sequence and is further evidence for the continuing evolution of the virus. The 300 residue is important in distinguishing the antigenicity and host range of CPVs. The clinical significance and population impact of RPPV infection in captive red pandas in China is unknown and is an important topic for future research.

  3. Early evolution of conserved regulatory sequences associated with development in vertebrates.

    Directory of Open Access Journals (Sweden)

    Gayle K McEwen

    2009-12-01

    Full Text Available Comparisons between diverse vertebrate genomes have uncovered thousands of highly conserved non-coding sequences, an increasing number of which have been shown to function as enhancers during early development. Despite their extreme conservation over 500 million years from humans to cartilaginous fish, these elements appear to be largely absent in invertebrates, and, to date, there has been little understanding of their mode of action or the evolutionary processes that have modelled them. We have now exploited emerging genomic sequence data for the sea lamprey, Petromyzon marinus, to explore the depth of conservation of this type of element in the earliest diverging extant vertebrate lineage, the jawless fish (agnathans). We searched for conserved non-coding elements (CNEs) at 13 human gene loci and identified lamprey elements associated with all but two of these gene regions. Although markedly shorter and less well conserved than within jawed vertebrates, identified lamprey CNEs are able to drive specific patterns of expression in zebrafish embryos, which are almost identical to those driven by the equivalent human elements. These CNEs are therefore a unique and defining characteristic of all vertebrates. Furthermore, alignment of lamprey and other vertebrate CNEs should permit the identification of persistent sequence signatures that are responsible for common patterns of expression and contribute to the elucidation of the regulatory language in CNEs. Identifying the core regulatory code for development, common to all vertebrates, provides a foundation upon which regulatory networks can be constructed and might also illuminate how large conserved regulatory sequence blocks evolve and become fixed in genomic DNA.

  4. Late Neogene Sequence Stratigraphic Evolution of the Foz do Amazonas Basin, Brazil

    Science.gov (United States)

    Gorini, Christian; Haq, Bilal U.; Tadeu dos Reis, Antonio; Guizan Silva, Cleverson; Cruz, Alberto; Soares, Emilson; Grangeon, Didier

    2014-05-01

    The margin of the Foz do Amazonas Basin saw a shift from predominantly carbonate to siliciclastic sedimentation in the early late Miocene. By this time the Amazon shelf had also been incised by a canyon that allowed direct influx of sediment to the basin floor, thus confirming that the paleo-Amazon fan had already initiated by that time (9.5-8.3 Ma). Above this interval, during a prolonged lowstand, Messinian third-order sequences are preserved only in the incised-valley fills of the canyon with no equivalent strata on the shelf. Third and fourth-order sequences younger than Messinian are preserved on the shelf after sea-level rise above the shelf by early Pliocene. Sequences younger than 3.8 Ma often show fourth-order cyclicity with average duration of 400 kyr (larger scale eccentricity cycles) often preserved in high sedimentation rate areas of river deltas. Mass wasting and transportation of slope sediments to the basin began to play an important role in sediment dispersal at least as far back as mid Pliocene, after rapid progradation had produced steeper slopes more prone to failure.

  5. Genomic Sequencing and Analysis of Sucra jujuba Nucleopolyhedrovirus

    Science.gov (United States)

    Liu, Xiaoping; Yin, Feifei; Zhu, Zheng; Hou, Dianhai; Wang, Jun; Zhang, Lei; Wang, Manli; Wang, Hualin; Hu, Zhihong; Deng, Fei

    2014-01-01

    The complete nucleotide sequence of Sucra jujuba nucleopolyhedrovirus (SujuNPV) was determined by 454 pyrosequencing. The SujuNPV genome was 135,952 bp in length with an A+T content of 61.34%. It contained 131 putative open reading frames (ORFs) covering 87.9% of the genome. Among these ORFs, 37 were conserved in all baculovirus genomes that have been completely sequenced, 24 were conserved in lepidopteran baculoviruses, 65 were found in other baculoviruses, and 5 were unique to the SujuNPV genome. Seven homologous regions (hrs) were identified in the SujuNPV genome. SujuNPV contained several genes that were duplicated or copied multiple times: two copies of helicase, DNA binding protein gene (dbp), p26 and cg30, three copies of the inhibitor of the apoptosis gene (iap), and four copies of the baculovirus repeated ORF (bro). Phylogenetic analysis suggested that SujuNPV belongs to a subclade of group II alphabaculovirus, which differs from other baculoviruses in that all nine members of this subclade contain a second copy of dbp. PMID:25329074

  6. Genomic sequencing and analysis of Sucra jujuba nucleopolyhedrovirus.

    Directory of Open Access Journals (Sweden)

    Xiaoping Liu

    Full Text Available The complete nucleotide sequence of Sucra jujuba nucleopolyhedrovirus (SujuNPV) was determined by 454 pyrosequencing. The SujuNPV genome was 135,952 bp in length with an A+T content of 61.34%. It contained 131 putative open reading frames (ORFs) covering 87.9% of the genome. Among these ORFs, 37 were conserved in all baculovirus genomes that have been completely sequenced, 24 were conserved in lepidopteran baculoviruses, 65 were found in other baculoviruses, and 5 were unique to the SujuNPV genome. Seven homologous regions (hrs) were identified in the SujuNPV genome. SujuNPV contained several genes that were duplicated or copied multiple times: two copies of helicase, DNA binding protein gene (dbp), p26 and cg30, three copies of the inhibitor of the apoptosis gene (iap), and four copies of the baculovirus repeated ORF (bro). Phylogenetic analysis suggested that SujuNPV belongs to a subclade of group II alphabaculovirus, which differs from other baculoviruses in that all nine members of this subclade contain a second copy of dbp.

  7. Complete Chloroplast Genome Sequence and Phylogenetic Analysis of Paeonia ostii

    Directory of Open Access Journals (Sweden)

    Shuai Guo

    2018-01-01

    Full Text Available Paeonia ostii, a common oil-tree peony, is important ornamentally and medicinally. However, there are few studies on the chloroplast genome of Paeonia ostii. We sequenced and analyzed the complete chloroplast genome of P. ostii. The size of the P. ostii chloroplast genome is 152,153 bp, including a large single-copy region (85,373 bp), a small single-copy region (17,054 bp), and a pair of inverted repeat regions (24,863 bp). The P. ostii chloroplast genome encodes 111 genes, including 77 protein-coding genes, four ribosomal RNA genes, and 30 transfer RNA genes. The genome contains forward repeats (22), palindromic repeats (28), and tandem repeats (24). The presence of rich simple-sequence repeat loci in the genome provides opportunities for future population genetics work for breeding new varieties. A phylogenetic analysis showed that P. ostii is more closely related to Paeonia delavayi and Paeonia ludlowii than to Paeonia obovata and Paeonia veitchii. The results of this study provide an assembly of the whole chloroplast genome of P. ostii, which may be useful for future breeding and further biological discoveries. It will provide a theoretical basis for the improvement of peony yield and the determination of phylogenetic status.
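
The region sizes quoted in this abstract can be checked against the reported genome length. The sketch below assumes the usual quadripartite chloroplast layout, with the 24,863 bp figure read as the length of each of the two inverted repeat copies; the function name is hypothetical.

```python
def quadripartite_length(lsc_bp, ssc_bp, ir_bp):
    """Total length of a genome made of one LSC, one SSC and two IR copies."""
    return lsc_bp + ssc_bp + 2 * ir_bp


if __name__ == "__main__":
    # Sizes as reported in the abstract (bp).
    total = quadripartite_length(lsc_bp=85_373, ssc_bp=17_054, ir_bp=24_863)
    print(total, total == 152_153)
```

The parts sum to the stated 152,153 bp, which supports reading the inverted repeat size as per copy.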

  8. Multilocus sequence analysis for Leishmania braziliensis outbreak investigation.

    Directory of Open Access Journals (Sweden)

    Mariel A Marlow

    2014-02-01

    Full Text Available With the emergence of leishmaniasis in new regions around the world, molecular epidemiological methods with adequate discriminatory power, reproducibility, high throughput and inter-laboratory comparability are needed for outbreak investigation of this complex parasitic disease. As multilocus sequence analysis (MLSA) has been projected as the future gold standard technique for Leishmania species characterization, we propose a MLSA panel of six housekeeping gene loci (6pgd, mpi, icd, hsp70, mdhmt, mdhnc) for investigating intraspecific genetic variation of L. (Viannia) braziliensis strains and compare the resulting genetic clusters with several epidemiological factors relevant to outbreak investigation. The recent outbreak of cutaneous leishmaniasis caused by L. (V.) braziliensis in the southern Brazilian state of Santa Catarina is used to demonstrate the applicability of this technique. Sequenced fragments from six genetic markers from 86 L. (V.) braziliensis strains from twelve Brazilian states, including 33 strains from Santa Catarina, were used to determine clonal complexes, genetic structure, and phylogenic networks. Associations between genetic clusters and networks with epidemiological characteristics of patients were investigated. MLSA revealed epidemiological patterns among L. (V.) braziliensis strains, even identifying strains from imported cases among the Santa Catarina strains that presented extensive homogeneity. Evidence presented here has demonstrated MLSA possesses adequate discriminatory power for outbreak investigation, as well as other potential uses in the molecular epidemiology of leishmaniasis.

  9. Analysis of chimpanzee history based on genome sequence alignments.

    Directory of Open Access Journals (Sweden)

    Jennifer L Caswell

    2008-04-01

    Full Text Available Population geneticists often study small numbers of carefully chosen loci, but it has become possible to obtain orders of magnitude more data from overlaps of genome sequences. Here, we generate tens of millions of base pairs of multiple sequence alignments from combinations of three western chimpanzees, three central chimpanzees, an eastern chimpanzee, a bonobo, a human, an orangutan, and a macaque. Analysis provides a more precise understanding of demographic history than was previously available. We show that bonobos and common chimpanzees were separated approximately 1,290,000 years ago, western and other common chimpanzees approximately 510,000 years ago, and eastern and central chimpanzees at least 50,000 years ago. We infer that the central chimpanzee population size increased by at least a factor of 4 since its separation from western chimpanzees, while the western chimpanzee effective population size decreased. Surprisingly, in about one percent of the genome, the genetic relationships between humans, chimpanzees, and bonobos appear to be different from the species relationships. We used PCR-based resequencing to confirm 11 regions where chimpanzees and bonobos are not most closely related. Study of such loci should provide information about the period of time 5-7 million years ago when the ancestors of humans separated from those of the chimpanzees.

  10. cDNA, genomic sequence cloning and analysis of the ribosomal ...

    African Journals Online (AJOL)

    Alignment analysis indicated that the nucleotide sequence of the coding sequence showed a high homology with previously reported L37A sequences for Homo sapiens, Pongo abelii, Mus musculus and Bos taurus. The amino acid sequence encoded by the RPL37A gene of giant panda shared a high homology with the ...

  11. Evolution of MHC Class I Complex Region with Special Reference to Fragmentary LINE Sequences

    Science.gov (United States)

    Tateno, Yoshio; Fukami-Kobayashi, Kaoru; Inoko, Hidetoshi

    2008-03-01

    We reviewed the origin and evolution of the two pairs of immune genes, (MHC-B and MHC-C) and (MICA and MICB) in man, chimpanzee and rhesus monkey based mainly on our previous work. Since those genes were well known to have been subject to strong natural selection in evolution, they themselves were not suitable for our study. We thus took another approach to use fragmented and nonfunctional LINEs that had coevolved with the two pairs in the same genomic fragments. Our results showed that MHC-B and MHC-C duplicated about 22 Myr (million years) ago, and MICA and MICB duplicated about 14 Myr ago. Interestingly, rhesus monkey was found not to have either pair but many repeats similar to MHC-B. Therefore, we estimated the divergence time of the monkey, and found that it diverged from a common ancestor of man and chimpanzee about 30 Myr ago. The divergence time was consistent with the duplication times of the two pairs of immune genes. Based on our results we would predict that orangutan and gorilla also have the two pairs, because both primate species are considered to have diverged less than 14 Myr ago.

  12. The DNA sequence and analysis of human chromosome 14.

    Science.gov (United States)

    Heilig, Roland; Eckenberg, Ralph; Petit, Jean-Louis; Fonknechten, Núria; Da Silva, Corinne; Cattolico, Laurence; Levy, Michaël; Barbe, Valérie; de Berardinis, Véronique; Ureta-Vidal, Abel; Pelletier, Eric; Vico, Virginie; Anthouard, Véronique; Rowen, Lee; Madan, Anup; Qin, Shizhen; Sun, Hui; Du, Hui; Pepin, Kymberlie; Artiguenave, François; Robert, Catherine; Cruaud, Corinne; Brüls, Thomas; Jaillon, Olivier; Friedlander, Lucie; Samson, Gaelle; Brottier, Philippe; Cure, Susan; Ségurens, Béatrice; Anière, Franck; Samain, Sylvie; Crespeau, Hervé; Abbasi, Nissa; Aiach, Nathalie; Boscus, Didier; Dickhoff, Rachel; Dors, Monica; Dubois, Ivan; Friedman, Cynthia; Gouyvenoux, Michel; James, Rose; Madan, Anuradha; Mairey-Estrada, Barbara; Mangenot, Sophie; Martins, Nathalie; Ménard, Manuela; Oztas, Sophie; Ratcliffe, Amber; Shaffer, Tristan; Trask, Barbara; Vacherie, Benoit; Bellemere, Chadia; Belser, Caroline; Besnard-Gonnet, Marielle; Bartol-Mavel, Delphine; Boutard, Magali; Briez-Silla, Stéphanie; Combette, Stephane; Dufossé-Laurent, Virginie; Ferron, Carolyne; Lechaplais, Christophe; Louesse, Claudine; Muselet, Delphine; Magdelenat, Ghislaine; Pateau, Emilie; Petit, Emmanuelle; Sirvain-Trukniewicz, Peggy; Trybou, Arnaud; Vega-Czarny, Nathalie; Bataille, Elodie; Bluet, Elodie; Bordelais, Isabelle; Dubois, Maria; Dumont, Corinne; Guérin, Thomas; Haffray, Sébastien; Hammadi, Rachid; Muanga, Jacqueline; Pellouin, Virginie; Robert, Dominique; Wunderle, Edith; Gauguet, Gilbert; Roy, Alice; Sainte-Marthe, Laurent; Verdier, Jean; Verdier-Discala, Claude; Hillier, LaDeana; Fulton, Lucinda; McPherson, John; Matsuda, Fumihiko; Wilson, Richard; Scarpelli, Claude; Gyapay, Gábor; Wincker, Patrick; Saurin, William; Quétier, Francis; Waterston, Robert; Hood, Leroy; Weissenbach, Jean

    2003-02-06

    Chromosome 14 is one of five acrocentric chromosomes in the human genome. These chromosomes are characterized by a heterochromatic short arm that contains essentially ribosomal RNA genes, and a euchromatic long arm in which most, if not all, of the protein-coding genes are located. The finished sequence of human chromosome 14 comprises 87,410,661 base pairs, representing 100% of its euchromatic portion, in a single continuous segment covering the entire long arm with no gaps. Two loci of crucial importance for the immune system, as well as more than 60 disease genes, have been localized so far on chromosome 14. We identified 1,050 genes and gene fragments, and 393 pseudogenes. On the basis of comparisons with other vertebrate genomes, we estimate that more than 96% of the chromosome 14 genes have been annotated. From an analysis of the CpG island occurrences, we estimate that 70% of these annotated genes are complete at their 5' end.

  13. BioMatriX: Sequence analysis, structure visualization, phylogenetics ...

    African Journals Online (AJOL)

    Goshi

    2012-04-26

    Apr 26, 2012 ... graphical representations, sequence editing, sequence alignment, restriction enzyme mapping, protein structure visualization, mutation and structure superimposition programs along with phylogenetics tree construction supporting ... converter and other file writing manipulation functions. This extensive tool ...

  14. Genomic fossils reveal adaptation of non-autonomous pararetroviruses driven by concerted evolution of noncoding regulatory sequences.

    Science.gov (United States)

    Chen, Sunlu; Zheng, Huizhen; Kishima, Yuji

    2017-06-01

    The interplay of different virus species in a host cell after infection can affect the adaptation of each virus. Endogenous viral elements, such as endogenous pararetroviruses (PRVs), have arisen from vertical inheritance of viral sequences integrated into host germline genomes. As viral genomic fossils, these sequences can thus serve as valuable paleogenomic data to study the long-term evolutionary dynamics of virus-virus interactions, but they have rarely been applied for this purpose. All extant PRVs have been considered autonomous species in their parasitic life cycle in host cells. Here, we provide evidence for multiple non-autonomous PRV species with structural defects in viral activity that have frequently infected ancient grass hosts and adapted through interplay between viruses. Our paleogenomic analyses using endogenous PRVs in grass genomes revealed that these non-autonomous PRV species have participated in interplay with autonomous PRVs in a possible commensal partnership, or, alternatively, with one another in a possible mutualistic partnership. These partnerships, which have been established by the sharing of noncoding regulatory sequences (NRSs) in intergenic regions between two partner viruses, have been further maintained and altered by the sequence homogenization of NRSs between partners. Strikingly, we found that frequent region-specific recombination, rather than mutation selection, is the main causative mechanism of NRS homogenization. Our results, obtained from ancient DNA records of viruses, suggest that adaptation of PRVs has occurred by concerted evolution of NRSs between different virus species in the same host. Our findings further imply that evaluation of within-host NRS interactions within and between populations of viral pathogens may be important.

  15. Genomic fossils reveal adaptation of non-autonomous pararetroviruses driven by concerted evolution of noncoding regulatory sequences.

    Directory of Open Access Journals (Sweden)

    Sunlu Chen

    2017-06-01

    Full Text Available The interplay of different virus species in a host cell after infection can affect the adaptation of each virus. Endogenous viral elements, such as endogenous pararetroviruses (PRVs), have arisen from vertical inheritance of viral sequences integrated into host germline genomes. As viral genomic fossils, these sequences can thus serve as valuable paleogenomic data to study the long-term evolutionary dynamics of virus-virus interactions, but they have rarely been applied for this purpose. All extant PRVs have been considered autonomous species in their parasitic life cycle in host cells. Here, we provide evidence for multiple non-autonomous PRV species with structural defects in viral activity that have frequently infected ancient grass hosts and adapted through interplay between viruses. Our paleogenomic analyses using endogenous PRVs in grass genomes revealed that these non-autonomous PRV species have participated in interplay with autonomous PRVs in a possible commensal partnership, or, alternatively, with one another in a possible mutualistic partnership. These partnerships, which have been established by the sharing of noncoding regulatory sequences (NRSs) in intergenic regions between two partner viruses, have been further maintained and altered by the sequence homogenization of NRSs between partners. Strikingly, we found that frequent region-specific recombination, rather than mutation selection, is the main causative mechanism of NRS homogenization. Our results, obtained from ancient DNA records of viruses, suggest that adaptation of PRVs has occurred by concerted evolution of NRSs between different virus species in the same host. Our findings further imply that evaluation of within-host NRS interactions within and between populations of viral pathogens may be important.

  16. Dynamic evolution of pathogenicity revealed by sequencing and comparative genomics of 19 Pseudomonas syringae isolates.

    Directory of Open Access Journals (Sweden)

    David A Baltrus

    2011-07-01

    Full Text Available Closely related pathogens may differ dramatically in host range, but the molecular, genetic, and evolutionary basis for these differences remains unclear. In many Gram-negative bacteria, including the phytopathogen Pseudomonas syringae, type III effectors (TTEs) are essential for pathogenicity, instrumental in structuring host range, and exhibit wide diversity between strains. To capture the dynamic nature of virulence gene repertoires across P. syringae, we screened 11 diverse strains for novel TTE families and coupled this nearly saturating screen with the sequencing and assembly of 14 phylogenetically diverse isolates from a broad collection of diseased host plants. TTE repertoires vary dramatically in size and content across all P. syringae clades; surprisingly few TTEs are conserved and present in all strains. Those that are likely provide basal requirements for pathogenicity. We demonstrate that functional divergence within one conserved locus, hopM1, leads to dramatic differences in pathogenicity, and we demonstrate that phylogenetics-informed mutagenesis can be used to identify functionally critical residues of TTEs. The dynamism of the TTE repertoire is mirrored by diversity in pathways affecting the synthesis of secreted phytotoxins, highlighting the likely role of both types of virulence factors in determination of host range. We used these 14 draft genome sequences, plus five additional genome sequences previously reported, to identify the core genome for P. syringae and we compared this core to that of two closely related non-pathogenic pseudomonad species. These data revealed the recent acquisition of a 1 Mb megaplasmid by a sub-clade of cucumber pathogens. This megaplasmid encodes a type IV secretion system and a diverse set of unknown proteins, which dramatically increases both the genomic content of these strains and the pan-genome of the species.

  17. Dynamic evolution of pathogenicity revealed by sequencing and comparative genomics of 19 Pseudomonas syringae isolates.

    Science.gov (United States)

    Baltrus, David A; Nishimura, Marc T; Romanchuk, Artur; Chang, Jeff H; Mukhtar, M Shahid; Cherkis, Karen; Roach, Jeff; Grant, Sarah R; Jones, Corbin D; Dangl, Jeffery L

    2011-07-01

    Closely related pathogens may differ dramatically in host range, but the molecular, genetic, and evolutionary basis for these differences remains unclear. In many Gram-negative bacteria, including the phytopathogen Pseudomonas syringae, type III effectors (TTEs) are essential for pathogenicity, instrumental in structuring host range, and exhibit wide diversity between strains. To capture the dynamic nature of virulence gene repertoires across P. syringae, we screened 11 diverse strains for novel TTE families and coupled this nearly saturating screen with the sequencing and assembly of 14 phylogenetically diverse isolates from a broad collection of diseased host plants. TTE repertoires vary dramatically in size and content across all P. syringae clades; surprisingly few TTEs are conserved and present in all strains. Those that are likely provide basal requirements for pathogenicity. We demonstrate that functional divergence within one conserved locus, hopM1, leads to dramatic differences in pathogenicity, and we demonstrate that phylogenetics-informed mutagenesis can be used to identify functionally critical residues of TTEs. The dynamism of the TTE repertoire is mirrored by diversity in pathways affecting the synthesis of secreted phytotoxins, highlighting the likely role of both types of virulence factors in determination of host range. We used these 14 draft genome sequences, plus five additional genome sequences previously reported, to identify the core genome for P. syringae and we compared this core to that of two closely related non-pathogenic pseudomonad species. These data revealed the recent acquisition of a 1 Mb megaplasmid by a sub-clade of cucumber pathogens. This megaplasmid encodes a type IV secretion system and a diverse set of unknown proteins, which dramatically increases both the genomic content of these strains and the pan-genome of the species. © 2011 Baltrus et al.
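
    As a rough illustration of the core-genome and pan-genome concepts mentioned in the abstract above, the following Python sketch treats each strain as a set of gene families and takes the intersection (core) and the union (pan). The function name, the strain names, and every gene label other than hopM1 are hypothetical placeholders rather than data from the study, and the sketch makes no claim about the pipeline the authors actually used.

```python
def core_and_pan_genome(gene_sets):
    """Return (core, pan) for a mapping of strain name -> set of gene families."""
    sets = list(gene_sets.values())
    core = set.intersection(*sets)  # families present in every strain
    pan = set.union(*sets)          # families present in at least one strain
    return core, pan


# Toy input: hypothetical strains and gene labels, for illustration only.
strains = {
    "strain_1": {"hopM1", "geneA", "geneB"},
    "strain_2": {"hopM1", "geneA", "geneC"},
    "strain_3": {"hopM1", "geneA", "geneB", "geneD"},
}

core, pan = core_and_pan_genome(strains)
print(sorted(core))  # families shared by all three toy strains
print(len(pan))      # number of distinct families across the toy strains
```

    Adding a strain that carries many strain-specific families (for example, a large plasmid) leaves the core unchanged or smaller while enlarging the pan-genome, which is the effect the abstract attributes to the megaplasmid.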

  18. Accident Sequence Evaluation Program: Human reliability analysis procedure

    Energy Technology Data Exchange (ETDEWEB)

    Swain, A.D.

    1987-02-01

    This document presents a shortened version of the procedure, models, and data for human reliability analysis (HRA) which are presented in the Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications (NUREG/CR-1278, August 1983). This shortened version was prepared and tried out as part of the Accident Sequence Evaluation Program (ASEP) funded by the US Nuclear Regulatory Commission and managed by Sandia National Laboratories. The intent of this new HRA procedure, called the "ASEP HRA Procedure," is to enable systems analysts, with minimal support from experts in human reliability analysis, to make estimates of human error probabilities and other human performance characteristics which are sufficiently accurate for many probabilistic risk assessments. The ASEP HRA Procedure consists of a Pre-Accident Screening HRA, a Pre-Accident Nominal HRA, a Post-Accident Screening HRA, and a Post-Accident Nominal HRA. The procedure in this document includes changes made after tryout and evaluation of the procedure in four nuclear power plants by four different systems analysts and related personnel, including human reliability specialists. The changes consist of some additional explanatory material (including examples), and more detailed definitions of some of the terms. 42 refs.

  19. PRE-MAIN SEQUENCE EVOLUTIONS OF SOLAR ABUNDANCE LOW MASS STARS

    Directory of Open Access Journals (Sweden)

    Youn Kil Jung

    2007-03-01

    Full Text Available We present the Pre-Main Sequence (PMS) evolutionary tracks of stars with masses of 0.065 to 5.0 M_⊙. The models were evolved from the PMS stellar birthline to the onset of hydrogen burning in the core. The convective turnover timescales, which enable an observational test of the theoretical models, particularly of stellar dynamic activity, are also calculated. All models have Sun-like metal abundance, typical of stars in the Galactic disk and in Population I star-forming regions. Convection is treated with the usual mixing-length approximation. All evolutionary tracks are available upon request.

  20. Sequence analysis of the medium RNA segment of three Simbu serogroup viruses, Akabane, Aino, and Peaton viruses.

    Science.gov (United States)

    Yanase, Tohru; Yoshida, Kazuo; Ohashi, Seiichi; Kato, Tomoko; Tsuda, Tomoyuki

    2003-05-01

    The sequence analysis was carried out for the medium (M) RNA segment of the Akabane virus (AKAV), Aino virus (AINV), and Peaton virus (PEAV) of the Simbu serogroup of the genus Orthobunyavirus of the family Bunyaviridae. The complementary sequences of the M RNA segments of AKAV, AINV, and PEAV contain a single large open reading frame (ORF), like other orthobunyaviruses. The ORFs potentially encode 1401 amino acids (aa), 1404 aa, and 1400 aa polypeptides, respectively. The identity of the M segment among these viruses is remarkably low, although previous researchers reported that the small RNA segments are highly conserved. Because the M segment codes for the viral surface glycoproteins G1 and G2, the variability of the M segment may affect the antigenicity of these viruses. Phylogenetic studies based on the M and S segment sequences suggested that genetic reassortment has been occurring among ancestral viruses of the three Simbu serogroup viruses throughout their evolution.

  1. STRUCTURE ANALYSIS OF THE EVOLUTION OF PRIVATE CONSUMPTION IN ROMANIA

    Directory of Open Access Journals (Sweden)

    Raluca M. BĂLĂ

    2014-06-01

    Full Text Available This paper analyzes the evolution of the structure of private consumption in Romania over the last twenty years, covering three main periods that influenced the composition of the economic welfare of Romanian citizens: the transition to a market economy after the fall of the communist regime, the period of economic stabilization and sustained growth, and the period of financial and economic crisis. The analysis reveals the changes in the structure of private consumption across these three phases of the Romanian economy and shows the influence of these changes on the economic welfare of the population.

  2. Secondary structure-based analysis of mouse brain small RNA sequences obtained by using next-generation sequencing.

    Science.gov (United States)

    Kiyosawa, Hidenori; Okumura, Akio; Okui, Saya; Ushida, Chisato; Kawai, Gota

    2015-08-01

    In order to find novel structured small RNAs, next-generation sequencing was applied to small RNA fractions with lengths ranging from 40 to 140 nt and secondary structure-based clustering was performed. Sequences of structured RNAs were effectively clustered and analyzed by secondary structure. Although more than 99% of the obtained sequences were known RNAs, 16 candidate mouse structured small non-coding RNAs (MsncRs) were isolated. Based on these results, the merits of secondary structure-based analysis are discussed. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Analysis of expressed sequence tags derived from inflorescence ...

    African Journals Online (AJOL)

    From all the sequences analysed, only 186 (32.8%) sequences were given the GO numbers and grouped into the three GO main categories namely biological process, cellular component and molecular function. Several important ESTs were highlighted based on their functional categories. There were five sequences ...

  4. Arguments for a Cluster Analysis of Nasal Consonant Sequences of ...

    African Journals Online (AJOL)

    Bantu language scholars, have among other things, debated over the issue of whether nasal and consonant sequences (NC sequences) in various Bantu languages should be considered as clusters or single segments (prenasalised stops). This paper examines these sequences as they occur in Sukwa nouns. Sukwa is a ...

  5. Evolution of Analysis of Polyphenols from Grapes, Wines, and Extracts

    Directory of Open Access Journals (Sweden)

    Pierre-Louis Teissedre

    2013-01-01

    Full Text Available Grape and wine phenolics are structurally diverse, from simple molecules to oligomers and polymers usually designated as tannins. They have an important impact on the organoleptic properties of wines, which is why their analysis and quantification are of primary importance. The extraction of phenolics from grapes and from wines is the first step in the analysis. Several analytical methods have been developed for determining total phenolic content, while chromatographic and spectrophotometric analyses are continuously improved in order to achieve adequate separation of phenolic molecules and their subsequent identification and quantification. This review provides a summary of the evolution of the analysis of polyphenols from grapes, wines and extracts.

  6. Cloning, sequencing, and sequence analysis of two novel plasmids from the thermophilic anaerobic bacterium Anaerocellum thermophilum

    DEFF Research Database (Denmark)

    Clausen, Anders; Mikkelsen, Marie Just; Schrøder, I.

    2004-01-01

    The nucleotide sequence of two novel plasmids isolated from the extreme thermophilic anaerobic bacterium Anaerocellum thermophilum DSM6725 (A. thermophilum), growing optimally at 70 °C, has been determined. pBAS2 was found to be a 3653 bp plasmid with a GC content of 43%, and the sequence...... revealed 10 open reading frames (ORFs). The two largest of these, namely Orf21 and Orf41, showed similarity to a Bacillus plasmid recombinase and a Pseudoalteromonas plasmid replication protein, respectively. A sequence with homology to double stranded replication origins from rolling circle plasmids...

  7. Whole genome sequencing of bacteria in cystic fibrosis as a model for bacterial genome adaptation and evolution.

    Science.gov (United States)

    Sharma, Poonam; Gupta, Sushim Kumar; Rolain, Jean-Marc

    2014-03-01

    Cystic fibrosis (CF) airways harbor a wide variety of new and/or emerging multidrug resistant bacteria which impose a heavy burden on patients. These bacteria live in close proximity with one another, which increases the frequency of lateral gene transfer. The exchange and movement of mobile genetic elements and genomic islands facilitate the spread of genes between genetically diverse bacteria, which seem to be advantageous to the bacterium as it allows adaptation to the new niches of the CF lungs. Niche adaptation is one of the major evolutionary forces shaping bacterial genome composition and in CF the chronic strains adapt and become less virulent. The purpose of this review is to shed light on CF bacterial genome alterations. Next-generation sequencing technology is an exciting tool that may help us to decipher the genome architecture and the evolution of bacteria colonizing CF lungs.

  8. Whole-genome sequencing of Oryza brachyantha reveals mechanisms underlying Oryza genome evolution.

    Science.gov (United States)

    Chen, Jinfeng; Huang, Quanfei; Gao, Dongying; Wang, Junyi; Lang, Yongshan; Liu, Tieyan; Li, Bo; Bai, Zetao; Luis Goicoechea, Jose; Liang, Chengzhi; Chen, Chengbin; Zhang, Wenli; Sun, Shouhong; Liao, Yi; Zhang, Xuemei; Yang, Lu; Song, Chengli; Wang, Meijiao; Shi, Jinfeng; Liu, Geng; Liu, Junjie; Zhou, Heling; Zhou, Weili; Yu, Qiulin; An, Na; Chen, Yan; Cai, Qingle; Wang, Bo; Liu, Binghang; Min, Jiumeng; Huang, Ying; Wu, Honglong; Li, Zhenyu; Zhang, Yong; Yin, Ye; Song, Wenqin; Jiang, Jiming; Jackson, Scott A; Wing, Rod A; Wang, Jun; Chen, Mingsheng

    2013-01-01

    The wild species of the genus Oryza contain a largely untapped reservoir of agronomically important genes for rice improvement. Here we report the 261-Mb de novo assembled genome sequence of Oryza brachyantha. Low activity of long-terminal repeat retrotransposons and massive internal deletions of ancient long-terminal repeat elements lead to the compact genome of Oryza brachyantha. We model 32,038 protein-coding genes in the Oryza brachyantha genome, of which only 70% are located in collinear positions in comparison with the rice genome. Analysing breakpoints of non-collinear genes suggests that double-strand break repair through non-homologous end joining has an important role in gene movement and erosion of collinearity in the Oryza genomes. Transition of euchromatin to heterochromatin in the rice genome is accompanied by segmental and tandem duplications, further expanded by transposable element insertions. The high-quality reference genome sequence of Oryza brachyantha provides an important resource for functional and evolutionary studies in the genus Oryza.

  9. Evolution of Mexican Bursera (Burseraceae) inferred from ITS, ETS, and 5S nuclear ribosomal DNA sequences.

    Science.gov (United States)

    Becerra, Judith X

    2003-02-01

    I reconstructed a phylogeny of 66 species and varieties of Bursera and 9 outgroup species using sequences of the internal transcribed spacer region (ITS), the 5S non-transcribed region (5S-NTS), and the external transcribed region (ETS) of nuclear ribosomal DNA. This study extends a previously proposed parsimony-based phylogenetic study that used the ITS sequences of 57 Bursera species and five outgroups. Parsimony and maximum likelihood methods were used to infer the phylogeny in this new study. Analyses of the combined data sets largely confirmed the phylogenetic relationships proposed by the previous molecular study but generated a considerably more robust topology. The new phylogenies corroborate the monophyly of the genus, and its division into the two monophyletic subgenera or sections, Bursera and Bullockia. The current analyses also identify four main groups of species in section Bursera, and two in section Bullockia, confirming some of the previously proposed groups based on fruit, flower, and leaf morphology. One previously problematic species B. sarcopoda, which has sometimes been placed in Commiphora, is shown to belong in Bursera. Another controversial species, Commiphora leptophloeos, which was thought to belong to Bursera, falls within Commiphora.

  10. Searching for convergent evolution in manganese superoxidase dismutase using hydrophobic cluster analysis

    Directory of Open Access Journals (Sweden)

    Heng Xiang

    2014-06-01

    Full Text Available There are numerous examples of convergent evolution in nature. Major ecological adaptations such as flight, loss of limbs in vertebrates, pesticide resistance, adaptation to a parasitic way of life, etc., have all evolved more than once, as seen by their analogous functions in separate taxa. But what about protein evolution? Does the environment have a strong enough influence on intracellular processes that enzymes and other functional proteins play, to evolve similar functional roles separately in different organisms? Manganese Superoxide Dismutase (MnSOD) is a manganese-dependent metallo-enzyme which plays a crucial role in protecting cells from oxidative stress by eliminating reactive (superoxide) oxygen species. It is a ubiquitous housekeeping enzyme found in nearly all organisms. In this study we compare phylogenies based on MnSOD protein sequences to those based on scores from Hydrophobic Cluster Analysis (HCA). We calculated HCA similarity values for each pair of taxa to obtain a pair-wise distance matrix. A UPGMA tree based on the HCA distance matrix and a common tree based on the primary protein sequence for MnSOD were constructed. Differences between these two trees within animals, Enterobacteriaceae, planctomycetes and cyanobacteria are presented and cited as possible examples of convergence. We note that several residue changes result in changes in hydrophobicity at positions which apparently are under the effect of positive selection.
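
    To make the tree-building step in the abstract above concrete, here is a minimal Python sketch of UPGMA clustering on a pairwise distance matrix. It is only an illustration under stated assumptions: the function name `upgma`, the taxon labels, and the distances are hypothetical toy values, not HCA scores from the study, and the tree is returned as nested tuples without branch lengths.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a UPGMA topology (nested tuples) from pairwise distances.

    names: list of unique taxon labels
    dist:  dict mapping frozenset({a, b}) -> distance for every pair of labels
    """
    sizes = {name: 1 for name in names}  # active clusters -> number of leaves
    d = dict(dist)                       # working copy of the distances
    while len(sizes) > 1:
        # Pick the closest pair of active clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        # Distance from the merged cluster to each other cluster is the
        # average of the old distances, weighted by cluster size.
        for c in sizes:
            if c in (a, b):
                continue
            d[frozenset((merged, c))] = (
                sizes[a] * d[frozenset((a, c))] + sizes[b] * d[frozenset((b, c))]
            ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


# Toy distances between four hypothetical taxa (illustration only).
toy = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}
tree = upgma(["A", "B", "C", "D"], toy)
print(tree)  # A and B join first, then C, then D
```

    Running the same function on a sequence-based distance matrix and on a hydrophobicity-based one, then comparing the two topologies, mirrors the kind of comparison the abstract describes.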

  11. Sequence analysis and over-expression of ribosomal protein S28 ...

    African Journals Online (AJOL)

    Alignment analysis indicated that the nucleotide sequence of the coding sequence shows a high homology to those of Homo sapiens, Bos taurus, Mus musculus, Rattus norvegicus and Sus scrofa (92.4, 92.4, 87.1, 86.7 and 89.5%, respectively) as determined by BLAST analysis. The amino acid sequence encoded by ...

  12. Whole Genome Sequencing of the Asian Arowana (Scleropages formosus) Provides Insights into the Evolution of Ray-Finned Fishes.

    Science.gov (United States)

    Austin, Christopher M; Tan, Mun Hua; Croft, Larry J; Hammer, Michael P; Gan, Han Ming

    2015-10-06

    The Asian arowana (Scleropages formosus) is of commercial importance, conservation concern, and is a representative of one of the oldest lineages of ray-finned fish, the Osteoglossomorpha. To add to genomic knowledge of this species and the evolution of teleosts, the genome of a Malaysian specimen of arowana was sequenced. A draft genome is presented consisting of 42,110 scaffolds with a total size of 708 Mb (2.85% gaps) representing 93.95% of core eukaryotic genes. Using a k-mer-based method, a genome size of 900 Mb was also estimated. We present an update on the phylogenomics of fishes based on a total of 27 species (23 fish species and 4 tetrapods) using 177 orthologous proteins (71,360 amino acid sites), which supports established relationships except that arowana is placed as the sister lineage to all teleost clades (Bayesian posterior probability 1.00, bootstrap replicate 93%), that evolved after the teleost genome duplication event rather than the eels (Elopomorpha). Evolutionary rates are highly heterogeneous across the tree with fishes represented by both slowly and rapidly evolving lineages. A total of 94 putative pigment genes were identified, providing the impetus for development of molecular markers associated with the spectacular colored phenotypes found within this species. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
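
    The abstract above does not say which k-mer-based method produced the 900 Mb estimate. A common textbook approach, sketched below in Python as an assumption rather than the authors' procedure, divides the total number of k-mer occurrences by the depth at the main peak of the k-mer frequency histogram. The function name, the `min_depth` cutoff for discarding likely sequencing-error k-mers, and the histogram values are all hypothetical.

```python
def estimate_genome_size(histogram, min_depth=3):
    """Estimate genome size from a k-mer frequency histogram.

    histogram: dict mapping depth -> number of distinct k-mers seen at that depth
    min_depth: depths below this are treated as sequencing errors and ignored
    """
    solid = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    peak_depth = max(solid, key=solid.get)  # depth with the most distinct k-mers
    total_kmers = sum(depth * n for depth, n in solid.items())
    return total_kmers / peak_depth


# Toy histogram (illustration only): an error spike at low depth and a
# single main peak centred on depth 20.
toy_histogram = {1: 5000, 2: 800, 18: 100, 19: 300, 20: 500, 21: 300, 22: 100}
size = estimate_genome_size(toy_histogram)
print(size)
```

    Real data usually need a more careful treatment of heterozygosity and repeat peaks than this single-peak sketch provides.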

  13. Molecular systematics and evolution of the "Apollo" butterflies of the genus Parnassius (Lepidoptera: Papilionidae) based on mitochondrial DNA sequence data.

    Science.gov (United States)

    Omoto, Keiichi; Katoh, Toru; Chichvarkhin, Anton; Yagi, Takashi

    2004-02-04

    Sequences of 777 bp of mtDNA-ND5 locus were determined in order to shed light on the molecular systematics and evolution of the "Apollo" butterflies. Examined were nearly all of about 50 species of the genus Parnassius, together with seven species of the allied genera in the subfamily Parnassiinae (Papilionidae). The NJ and the MP phylogenetic trees show that the "Apollos" constitute a monophyletic group, comprising a number of cluster groups probably reflecting a relatively rapid radiation in evolution. The clusters of species-groups denoted I-VIII correspond to those species-groups recognized on the basis of morphological characters. Our findings will also help understand the biological relationships among several species or subspecies on which the classical taxonomy is in dispute. The unexpected finding is that among the samples of allied genera compared, Hypermnestra helios appears to be the most closely related to the "Apollos", despite morphological and behavioral dissimilarity. Furthermore, in contrast to the previous higher taxonomy, Archon apollinus which is classified in the tribe Parnassiini was found genetically closer to the tribe Zerynthiini, raising a taxonomic controversy.

  14. Meta-analysis of small RNA-sequencing errors reveals ubiquitous post-transcriptional RNA modifications

    OpenAIRE

    Ebhardt, H. Alexander; Tsang, Herbert H.; Dai, Denny C.; Liu, Yifeng; Bostan, Babak; Fahlman, Richard P.

    2009-01-01

    Recent advances in DNA-sequencing technology have made it possible to obtain large datasets of small RNA sequences. Here we demonstrate that not all non-perfectly matched small RNA sequences are simple technological sequencing errors, but many hold valuable biological information. Analysis of three small RNA datasets originating from Oryza sativa and Arabidopsis thaliana small RNA-sequencing projects demonstrates that many single nucleotide substitution errors overlap when aligning homologous...

  15. Multi-gene analysis of Symbiodinium dinoflagellates: a perspective on rarity, symbiosis, and evolution.

    Science.gov (United States)

    Pochon, Xavier; Putnam, Hollie M; Gates, Ruth D

    2014-01-01

    Symbiodinium, a large group of dinoflagellates, live in symbiosis with marine protists and invertebrate metazoans, and are also found free-living in the environment. Symbiodinium are functionally variable and play critical energetic roles in symbiosis. Our knowledge of Symbiodinium has been historically constrained by the limited number of molecular markers available to study evolution in the genus. Here we compare six functional genes, representing three cellular compartments, in the nine known Symbiodinium lineages. Despite striking similarities among the single gene phylogenies from distinct organelles, none were evolutionarily identical. A fully concatenated reconstruction, however, yielded a well-resolved topology identical to the current benchmark nr28S gene. Evolutionary rates differed among cellular compartments and clades, a pattern largely driven by higher rates of evolution in the chloroplast genes of Symbiodinium clades D2 and I. The rapid rates of evolution observed amongst these relatively uncommon Symbiodinium lineages in the functionally critical chloroplast may translate into potential innovation for the symbiosis. The multi-gene analysis highlights the potential power of assessing genome-wide evolutionary patterns using recent advances in sequencing technology and emphasizes the importance of integrating ecological data with more comprehensive sampling of free-living and symbiotic Symbiodinium in assessing the evolutionary adaptation of this enigmatic dinoflagellate.

  16. Evolution

    Science.gov (United States)

    Peter, Ulmschneider

    When we are looking for intelligent life outside the Earth, there is a fundamental question: Assuming that life has formed on an extraterrestrial planet, will it also develop toward intelligence? As this is hotly debated, we will now describe the development of life on Earth in more detail in order to show that there are good reasons why evolution should culminate in intelligent beings.

  17. From sequence and forces to structure, function, and evolution of intrinsically disordered proteins.

    Science.gov (United States)

    Forman-Kay, Julie D; Mittag, Tanja

    2013-09-03

    Intrinsically disordered proteins (IDPs), which lack persistent structure, are a challenge to structural biology due to the inapplicability of standard methods for characterization of folded proteins as well as their deviation from the dominant structure/function paradigm. Their widespread presence and involvement in biological function, however, has spurred the growing acceptance of the importance of IDPs and the development of new tools for studying their structure, dynamics, and function. The interplay of folded and disordered domains or regions for function and the existence of a continuum of protein states with respect to conformational energetics, motional timescales, and compactness are shaping a unified understanding of structure-dynamics-disorder/function relationships. In the 20th anniversary of Structure, we provide a historical perspective on the investigation of IDPs and summarize the sequence features and physical forces that underlie their unique structural, functional, and evolutionary properties. Copyright © 2013 Elsevier Ltd. All rights reserved.

  18. Spatiotemporal evolution of the 2011 Prague, Oklahoma, aftershock sequence revealed using subspace detection and relocation

    Science.gov (United States)

    McMahon, Nicole D.; Aster, Richard C.; Yeck, William L.; McNamara, Daniel E.; Benz, Harley M.

    2017-07-01

    The 6 November 2011 Mw 5.7 earthquake near Prague, Oklahoma, is the second largest earthquake ever recorded in the state. A Mw 4.8 foreshock and the Mw 5.7 mainshock triggered a prolific aftershock sequence. Utilizing a subspace detection method, we increase by fivefold the number of precisely located events between 4 November and 5 December 2011. We find that while most aftershock energy is released in the crystalline basement, a significant number of the events occur in the overlying Arbuckle Group, indicating that active Meeker-Prague faulting extends into the sedimentary zone of wastewater disposal. Although the number of aftershocks in the Arbuckle Group is large, comprising 40% of the aftershock catalog, the moment contribution of Arbuckle Group earthquakes is much less than 1% of the total aftershock moment budget. Aftershock locations are sparse in patches that experienced large slip during the mainshock.

  19. Genome analysis of the platypus reveals unique signatures of evolution

    Science.gov (United States)

    Warren, Wesley C.; Hillier, LaDeana W.; Marshall Graves, Jennifer A.; Birney, Ewan; Ponting, Chris P.; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T.; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P.; Miethke, Pat; Waters, Paul D.; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S.; López-Otín, Carlos; Ordóñez, Gonzalo R.; Eichler, Evan E.; Chen, Lin; Cheng, Ze; Deakin, Janine E.; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T.; Wakefield, Matthew J.; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A.; Smit, Arian F. A.; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A.; Walker, Jerilyn A.; Konkel, Miriam K.; Harris, Robert S.; Whittington, Camilla M.; Wong, Emily S. W.; Gemmell, Neil J.; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M.; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P.; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J.; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M.; Sharp, Julie A.; Nicholas, Kevin R.; Ray, David A.; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H.; Taylor, James; Jones, Russell C.; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N.; Pohl, Craig S.; Smith, Scott M.; Hou, Shunfeng; Renfree, Marilyn B.; Mardis, Elaine R.; Wilson, Richard K.

    2009-01-01

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation. PMID:18464734

  20. Longitudinal sequencing of HIV-1 infected patients with low-level viremia for years while on ART shows no indications for genetic evolution of the virus.

    Science.gov (United States)

    Vancoillie, Leen; Hebberecht, Laura; Dauwe, Kenny; Demecheleer, Els; Dinakis, Sylvie; Vaneechoutte, Dries; Mortier, Virginie; Verhofstede, Chris

    2017-10-01

    HIV-infected patients on antiretroviral therapy (ART) may present low-level viremia (LLV) above the detection level of current viral load assays. In many cases LLV is persistent but does not result in overt treatment failure or selection of drug resistant viral variants. To elucidate whether LLV reflects active virus replication, we extensively sequenced pol and env genes of the viral populations present before and during LLV in 18 patients and searched for indications of genetic evolution. Maximum likelihood phylogenetic trees were inspected for temporal structure both visually and by linear regression analysis of root-to-tip and pairwise distances. Viral coreceptor tropism was assessed at different time points before and during LLV. In none of the patients were consistent indications of genetic evolution found over a median period of 4.8 years of LLV. As such, these findings could not provide evidence that active virus replication is the main driver of LLV. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
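
    The root-to-tip regression mentioned in this abstract can be illustrated with a minimal sketch. It assumes the root-to-tip distances have already been read off a rooted tree; the function name and the example values are hypothetical and are not taken from the study. A slope near zero with a low R² would indicate no measurable temporal signal.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (slope, intercept, r_squared). The slope is a rough estimate of
    the evolutionary rate in substitutions/site per unit of time.
    """
    n = len(dates)
    if n != len(distances) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With no variation in distance there is nothing to explain, so R^2 is 0.
    r_squared = 0.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data: sampling years and root-to-tip distances (subs/site).
years = [2008.0, 2009.5, 2011.0, 2012.5]
clocklike = [0.010, 0.013, 0.016, 0.019]
slope, intercept, r_squared = root_to_tip_regression(years, clocklike)
```

    In practice the regression is usually repeated for several candidate root positions, which this sketch does not attempt.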

  1. Evolution and comparative genomics of subcellular specializations: EST sequencing of Torpedo electric organ.

    Science.gov (United States)

    Nazarian, Javad; Berry, Deborah L; Sanjari, Salar; Razvi, Mohammed; Brown, Kristy; Hathout, Yetrib; Vertes, Akos; Dadgar, Sherry; Hoffman, Eric P

    2011-03-01

    Uncharacterized open reading frames (ORFs) in human genomic sequence often show a high degree of evolutionary conservation, yet have little or no tissue EST or protein data suggestive of protein product function. The encoded proteins may have highly restricted expression in specialized cells, subcellular specializations, and/or narrow windows during development. One such highly specialized and minute subcellular compartment is the neuromuscular junction (NMJ), where motorneurons contact muscle fibers. The electric Torpedo ray has evolved to expand the NMJ structure to the size of a large organ (electroplax organ), and we hypothesized that Torpedo electroplax proteins would be candidates for human ESTs expressed at the human NMJ. A total of 9719 primary electroplax cDNA clones were sequenced. We identified 44 human ORFs showing high (>63%) amino acid identity to Torpedo electroplax transcripts with enrichment for mRNA splicing motifs (SH2 and pre-mRNA splicing domains), an observation potentially important for the strict nuclear domains maintained by myonuclei underlying the NMJ. We generated antibodies against two uncharacterized human genes (C19orf29 [Drosophila cactin] and C15orf24) and showed that these were indeed expressed at the murine NMJ. Cactin, a member of the Rel transcription factor family in Drosophila, localized to the postsynaptic cytosol of the NMJ and nuclear membrane. C15orf24 protein localized to the murine postsynaptic sarcolemma. We show a novel approach towards identifying proteins expressed at a subcellular specialization using evolutionary diversity of organ function and cross-species mapping. Copyright © 2010 Elsevier B.V. All rights reserved.

  2. Frame sequences analysis technique of linear objects movement

    Science.gov (United States)

    Oshchepkova, V. Y.; Berg, I. A.; Shchepkin, D. V.; Kopylova, G. V.

    2017-12-01

    Data obtained by noninvasive methods are often needed in many fields of science and engineering. Such data are acquired through video recording at various frame rates and light spectra, so quantitative analysis of the movement of the objects being studied becomes an important component of the research. This work discusses the analysis of the motion of linear objects on a two-dimensional plane. The complexity of this problem increases when the frame contains numerous objects whose images may overlap. This study uses a sequence of 30 frames at a resolution of 62 × 62 pixels and a frame rate of 2 Hz. The task was to determine the average velocity of the objects' motion. This velocity was found as an average over 8-12 objects, with an error of 15%. After processing, the dependencies of the average velocity on the control parameters were found. The processing was performed in the GMimPro software environment, with subsequent approximation of the data using the Hill equation.
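
    For readers unfamiliar with the Hill equation named at the end of this abstract, a minimal sketch of its standard form follows. The parameter names (v_max, k_half, n) are illustrative assumptions; the abstract does not state which control parameter was varied or which values were fitted.

```python
def hill(x, v_max, k_half, n):
    """Standard Hill equation: v = v_max * x**n / (k_half**n + x**n).

    x       value of the control parameter (hypothetical, non-negative)
    v_max   plateau velocity reached at saturating x
    k_half  value of x at which v is half of v_max
    n       Hill coefficient (steepness of the curve)
    """
    if x < 0 or k_half <= 0:
        raise ValueError("x must be non-negative and k_half positive")
    xn = x ** n
    return v_max * xn / (k_half ** n + xn)


# At x == k_half the response is exactly half of the plateau value.
half_response = hill(2.0, 10.0, 2.0, 3)
```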

  3. Whole genome sequencing of the fish pathogen Francisella noatunensis subsp. orientalis Toba04 gives novel insights into Francisella evolution and pathogenecity.

    Science.gov (United States)

    Sridhar, Settu; Sharma, Animesh; Kongshaug, Heidi; Nilsen, Frank; Jonassen, Inge

    2012-11-06

    Francisella is a genus of gram-negative bacteria highly virulent in fish and humans; F. tularensis causes the serious disease tularaemia in humans. Recently, Francisella species have been reported to cause mortality in aquaculture species such as Atlantic cod and tilapia. We have completed the sequencing and draft assembly of the Francisella noatunensis subsp. orientalis Toba04 strain isolated from farmed tilapia. Compared to other available Francisella genomes, it is most similar to the genome of Francisella philomiragia subsp. philomiragia, a free-living bacterium not virulent to humans. The genome is rearranged compared to the available Francisella genomes even though we found no IS-elements in the genome. Nearly 16% of the predicted ORFs are pseudogenes. Computational pathway analysis indicates that a number of the metabolic pathways are disrupted due to pseudogenes. Comparing the novel genome with other available Francisella genomes, we found around 2.5% of unique genes present in Francisella noatunensis subsp. orientalis Toba04 and a list of genes uniquely present in the human-pathogenic Francisella subspecies. Most of these genes might have been transferred from bacterial species through horizontal gene transfer. Comparative analysis between human and fish pathogens also provides insights into genes responsible for pathogenicity. Our analysis of pseudogenes indicates that pseudogene evolution in the Francisella subspecies from tilapia is old, with a large number of pseudogenes having more than one inactivating mutation. The fish pathogen lost non-essential genes some time ago. Evolutionary analysis of the Francisella genomes strongly suggests that human and fish pathogenic Francisella species have evolved independently from free-living, metabolically competent Francisella species. These findings will contribute to understanding the evolution of Francisella species and pathogenesis.

  4. Whole genome sequencing of the fish pathogen Francisella noatunensis subsp. orientalis Toba04 gives novel insights into Francisella evolution and pathogenecity

    Directory of Open Access Journals (Sweden)

    Sridhar Settu

    2012-11-01

    Full Text Available Abstract Background Francisella is a genus of gram-negative bacteria highly virulent in fish and humans; F. tularensis causes the serious disease tularaemia in humans. Recently, Francisella species have been reported to cause mortality in aquaculture species such as Atlantic cod and tilapia. We have completed the sequencing and draft assembly of the Francisella noatunensis subsp. orientalis Toba04 strain isolated from farmed tilapia. Compared to other available Francisella genomes, it is most similar to the genome of Francisella philomiragia subsp. philomiragia, a free-living bacterium not virulent to humans. Results The genome is rearranged compared to the available Francisella genomes even though we found no IS-elements in the genome. Nearly 16% of the predicted ORFs are pseudogenes. Computational pathway analysis indicates that a number of the metabolic pathways are disrupted due to pseudogenes. Comparing the novel genome with other available Francisella genomes, we found around 2.5% of unique genes present in Francisella noatunensis subsp. orientalis Toba04 and a list of genes uniquely present in the human-pathogenic Francisella subspecies. Most of these genes might have been transferred from bacterial species through horizontal gene transfer. Comparative analysis between human and fish pathogens also provides insights into genes responsible for pathogenicity. Our analysis of pseudogenes indicates that pseudogene evolution in the Francisella subspecies from tilapia is old, with a large number of pseudogenes having more than one inactivating mutation. Conclusions The fish pathogen lost non-essential genes some time ago. Evolutionary analysis of the Francisella genomes strongly suggests that human and fish pathogenic Francisella species have evolved independently from free-living, metabolically competent Francisella species. These findings will contribute to understanding the evolution of Francisella species and pathogenesis.

  5. Microeconomic co-evolution model for financial technical analysis signals

    Science.gov (United States)

    Rotundo, G.; Ausloos, M.

    2007-01-01

    Technical analysis (TA) has been used for a long time before the availability of more sophisticated instruments for financial forecasting in order to suggest decisions on the basis of the occurrence of data patterns. Many mathematical and statistical tools for quantitative analysis of financial markets have experienced a fast and wide growth and have the power for overcoming classical TA methods. This paper aims to give a measure of the reliability of some information used in TA by exploring the probability of their occurrence within a particular microeconomic agent-based model of markets, i.e., the co-evolution Bak-Sneppen model originally invented for describing species population evolutions. After having proved the practical interest of such a model in describing financial index so-called avalanches, in the prebursting bubble time rise, the attention focuses on the occurrence of trend line detection crossing of meaningful barriers, those that give rise to some usual TA strategies. The case of the NASDAQ crash of April 2000 serves as an illustration.

  6. A Novel Approach to Detect Malware Based on API Call Sequence Analysis

    National Research Council Canada - National Science Library

    Ki, Youngjoon; Kim, Eunjin; Kim, Huy Kang

    2015-01-01

    .... In this paper, we propose a novel approach for dynamic analysis of malware. We adopt DNA sequence alignment algorithms and extract common API call sequence patterns of malicious function from malware in different categories...
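
    The idea of borrowing sequence alignment from DNA analysis to find API call patterns shared between samples can be sketched as follows. This is only an illustration that uses a plain longest-common-subsequence computation as a stand-in; it is not the authors' pipeline, and the two call traces are hypothetical.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of two API-call lists."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    # Walk back through the table to recover the shared calls in order.
    shared = []
    i, j = rows, cols
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            shared.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return shared[::-1]


# Hypothetical call traces from two samples.
trace_a = ["OpenProcess", "VirtualAllocEx", "Sleep",
           "WriteProcessMemory", "CreateRemoteThread"]
trace_b = ["RegOpenKeyExA", "OpenProcess", "VirtualAllocEx",
           "WriteProcessMemory", "CreateRemoteThread"]
common_pattern = longest_common_subsequence(trace_a, trace_b)
```

    A pattern that recurs across many samples of one category could then serve as a candidate signature; scoring schemes with gap penalties, as used in true alignment algorithms, are omitted here.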

  7. Reconstructing the evolution of agarics from nuclear gene sequences and basidiospore ultrastructure.

    Science.gov (United States)

    Garnica, Sigisfredo; Weiss, Michael; Walther, Grit; Oberwinkler, Franz

    2007-09-01

    Traditional classifications of agaric fungi involve gross morphology of their fruit bodies and meiospore print-colour. However, the phylogeny of these fungi and the evolution of their morphological and ecological traits are poorly understood. Phylogenetic analyses have already demonstrated that characters used in traditional classifications of basidiomycetes may be heavily affected by homoplasy, and that non-gilled taxa evolved within the agarics several times. By integrating molecular phylogenetic analyses including domains D1-D3 and D7-D8 of nucLSU rDNA and domains A-C of the RPB1 gene with morphological and chemical data from representative species of 88 genera, we were able to resolve higher groups of agarics. We found that the species with thick-walled and pigmented basidiospores constitute a derived group, and hypothesize that this specific combination of characters represents an evolutionary advantage by increasing the tolerance of the basidiospores to dehydration and solar radiation and so opened up new ecological niches, e.g. the colonization of dung substrates by enabling basidiospores to survive gut passages through herbivores. Our results confirm the validity of basidiospore morphology as a phylogenetic marker in the agarics.

  8. Sequence and expression variations suggest an adaptive role for the DA1-like gene family in the evolution of soybeans.

    Science.gov (United States)

    Zhao, Man; Gu, Yongzhe; He, Lingli; Chen, Qingshan; He, Chaoying

    2015-05-15

    The DA1 gene family is plant-specific and Arabidopsis DA1 regulates seed and organ size, but the functions in soybeans are unknown. The cultivated soybean (Glycine max) is believed to be domesticated from the annual wild soybeans (Glycine soja). To evaluate whether DA1-like genes were involved in the evolution of soybeans, we compared variation at both sequence and expression levels of DA1-like genes from G. max (GmaDA1) and G. soja (GsoDA1). Sequence identities were extremely high between the orthologous pairs between soybeans, while the paralogous copies in a soybean species showed a relatively high divergence. Moreover, the expression variation of DA1-like paralogous genes in soybean was much greater than the orthologous gene pairs between the wild and cultivated soybeans during development and challenging abiotic stresses such as salinity. We further found that overexpressing GsoDA1 genes did not affect seed size. Nevertheless, overexpressing them reduced transgenic Arabidopsis seed germination sensitivity to salt stress. Moreover, most of these genes could improve salt tolerance of the transgenic Arabidopsis plants, corroborated by a detection of expression variation of several key genes in the salt-tolerance pathways. Our work suggested that expression diversification of DA1-like genes is functionally associated with adaptive radiation of soybeans, reinforcing that the plant-specific DA1 gene family might have contributed to the successful adaption to complex environments and radiation of the plants.

  9. Sequencing and Analysis of Globally Obtained Human Respiratory Syncytial Virus A and B Genomes

    Science.gov (United States)

    Bose, Michael E.; He, Jie; Shrivastava, Susmita; Nelson, Martha I.; Bera, Jayati; Halpin, Rebecca A.; Town, Christopher D.; Lorenzi, Hernan A.; Noyola, Daniel E.; Falcone, Valeria; Gerna, Giuseppe; De Beenhouwer, Hans; Videla, Cristina; Kok, Tuckweng; Venter, Marietjie; Williams, John V.; Henrickson, Kelly J.

    2015-01-01

    Background Human respiratory syncytial virus (RSV) is the leading cause of respiratory tract infections in children globally, with nearly all children experiencing at least one infection by the age of two. Partial sequencing of the attachment glycoprotein gene is conducted routinely for genotyping, but relatively few whole genome sequences are available for RSV. The goal of our study was to sequence the genomes of RSV strains collected from multiple countries to further understand the global diversity of RSV at a whole-genome level. Methods We collected RSV samples and isolates from Mexico, Argentina, Belgium, Italy, Germany, Australia, South Africa, and the USA from the years 1998-2010. Both Sanger and next-generation sequencing with the Illumina and 454 platforms were used to sequence the whole genomes of RSV A and B. Phylogenetic analyses were performed using the Bayesian and maximum likelihood methods of phylogenetic inference. Results We sequenced the genomes of 34 RSVA and 23 RSVB viruses. Phylogenetic analysis showed that the RSVA genome evolves at an estimated rate of 6.72 × 10^-4 substitutions/site/year (95% HPD 5.61 × 10^-4 to 7.6 × 10^-4) and for RSVB the evolutionary rate was 7.69 × 10^-4 substitutions/site/year (95% HPD 6.81 × 10^-4 to 8.62 × 10^-4). We found multiple clades co-circulating globally for both RSV A and B. The predominant clades were GA2 and GA5 for RSVA and BA for RSVB. Conclusions Our analyses showed that RSV circulates on a global scale with the same predominant clades of viruses being found in countries around the world. However, the distribution of clades can change rapidly as new strains emerge. We did not observe a strong spatial structure in our trees, with the same three main clades of RSV co-circulating globally, suggesting that the evolution of RSV is not strongly regionalized. PMID:25793751

  10. Patterns of sequence divergence and evolution of the S orthologous regions between Asian and African cultivated rice species.

    Directory of Open Access Journals (Sweden)

    Romain Guyot

    Full Text Available A strong postzygotic reproductive barrier separates the recently diverged Asian and African cultivated rice species, Oryza sativa and O. glaberrima. Recently a model of genetic incompatibilities between three adjacent loci: S(1)A, S(1) and S(1)B (called together the S(1) regions) interacting epistatically, was postulated to cause the allelic elimination of female gametes in interspecific hybrids. Two candidate factors for the S(1) locus (including a putative F-box gene) were proposed, but candidates for S(1)A and S(1)B remained undetermined. Here, to better understand the basis of the evolution of regions involved in reproductive isolation, we studied the genic and structural changes accumulated in the S(1) regions between orthologous sequences. First, we established an 813 kb genomic sequence in O. glaberrima, covering completely the S(1)A, S(1) and the majority of the S(1)B regions, and compared it with the orthologous regions of O. sativa. An overall strong structural conservation was observed, with the exception of three isolated regions of disturbed collinearity: (1) a local invasion of transposable elements around a putative F-box gene within S(1), (2) the multiple duplication and subsequent divergence of the same F-box gene within S(1)A, (3) an interspecific chromosomal inversion in S(1)B, which restricts recombination in our O. sativa × O. glaberrima crosses. Beside these few structural variations, a uniform conservative pattern of coding sequence divergence was found all along the S(1) regions. Hence, the S(1) regions have undergone no drastic variation in their recent divergence and evolution between O. sativa and O. glaberrima, suggesting that a small accumulation of genic changes, following a Bateson-Dobzhansky-Muller (BDM) model, might be involved in the establishment of the sterility barrier. 
In this context, genetic incompatibilities involving the duplicated F-box genes as putative candidates, and a possible strengthening step involving the chromosomal

  11. Sequencing and annotated analysis of the Holstein cow genome.

    Science.gov (United States)

    Kõks, Sulev; Lilleoja, Rutt; Reimann, Ene; Salumets, Andres; Reemann, Paula; Jaakma, Ülle

    2013-08-01

    The aim of our study was to create a high-quality Holstein cow genome reference sequence and describe the different types of variations in this genome compared to the reference Hereford breed. We generated one fragment and three mate-paired libraries from genomic DNA. Raw files were mapped and paired to the reference cow (Bos taurus) genome assemblies bosTau6/UMD_3.1. BioScope (v1.3) software was used for mapping and variant analysis. Initial sequencing resulted in 2,842,744,008 50-bp reads. Average mapping efficiency was 78.4% and altogether 2,168,425,497 reads and 98,022,357,422 bp were successfully mapped, resulting in 36.7X coverage. Tertiary analysis found 5,923,230 SNPs in the bovine genome, of which 3,833,249 were heterozygous and 2,089,981 were homozygous variants. Annotation revealed that 4,241,000 of all discovered SNPs were annotated in the dbSNP database and 1,682,230 SNPs were considered novel. There were 138,504 large indel variations, accounting for 48,537,190 bp of the entire genome. The largest deletion was 18,594 bp and the largest insertion was 13,498 bp. Another group of variants, small indels (n = 458,061), accounted for a total variation of 1,839,872 nucleotides in the genome. Only 92,115 small indels were listed in dbSNP and therefore 365,946 small indels were novel. Finally, we identified 1,876 inversions in the bovine genome. In conclusion, this is another description of the Holstein cow genome and, similar to previous studies, we found a large amount of novel variation. Better knowledge of these variations could explain significant phenotypic differences (e.g., health, production, reproduction) between different breeds.
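
    The coverage and variant counts quoted in this abstract can be cross-checked with simple arithmetic. In the sketch below the counts come from the abstract, while the genome length of about 2.67 Gb is an assumption made for illustration (the abstract does not state the value it used); variable names are likewise illustrative.

```python
MAPPED_BASES = 98_022_357_422      # mapped bp, from the abstract
ASSUMED_GENOME_LENGTH = 2.67e9     # ASSUMPTION: approximate assembly size in bp


def mean_coverage(mapped_bases, genome_length):
    """Mean sequencing depth = total mapped bases / genome length."""
    return mapped_bases / genome_length


coverage = mean_coverage(MAPPED_BASES, ASSUMED_GENOME_LENGTH)  # roughly 36.7

# Internal consistency of the SNP and small-indel counts in the abstract.
total_snps = 5_923_230
heterozygous, homozygous = 3_833_249, 2_089_981
known_snps, novel_snps = 4_241_000, 1_682_230
small_indels, known_indels, novel_indels = 458_061, 92_115, 365_946

snp_zygosity_ok = heterozygous + homozygous == total_snps
snp_novelty_ok = known_snps + novel_snps == total_snps
indel_novelty_ok = known_indels + novel_indels == small_indels
```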

  12. Automatic analysis of the 2015 Gorkha earthquake aftershock sequence.

    Science.gov (United States)

    Baillard, C.; Lyon-Caen, H.; Bollinger, L.; Rietbrock, A.; Letort, J.; Adhikari, L. B.

    2016-12-01

    The Mw 7.8 Gorkha earthquake, which partially ruptured the Main Himalayan Thrust north of Kathmandu on 25 April 2015, was the largest and most catastrophic earthquake to strike Nepal since the great M8.4 1934 earthquake. This mainshock was followed by multiple aftershocks, among them two notable events that occurred on 12 May with magnitudes of 7.3 Mw and 6.3 Mw. Due to these recent events it became essential for the authorities and for the scientific community to better evaluate the seismic risk in the region through a detailed analysis of the earthquake catalog, including the spatio-temporal distribution of the Gorkha aftershock sequence. Here we complement this first study with a microseismic study using seismic data from the eastern part of the Nepalese Seismological Center network together with one broadband station at Everest. Our primary goal is to deliver an accurate catalog of the aftershock sequence. Due to the exceptional number of events detected we performed an automatic picking/locating procedure which can be split into 4 steps: 1) coarse picking of the onsets using a classical STA/LTA picker, 2) phase association of picked onsets to detect and declare seismic events, 3) kurtosis pick refinement around theoretical arrival times to increase picking and location accuracy, and 4) local magnitude calculation based on the amplitude of waveforms. This procedure is time efficient ( 1 sec/event), reduces considerably the location uncertainties ( 2 to 5 km errors) and increases the number of events detected compared to manual processing. Indeed, the automatic detection rate is 10 times higher than the manual detection rate. By comparing to the USGS catalog we were able to give a new attenuation law to compute local magnitudes in the region. A detailed analysis of the seismicity shows a clear migration toward the east of the region and a sudden decrease of seismicity 100 km east of Kathmandu which may reveal the presence of a tectonic
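
    The first step of the procedure described in this abstract, coarse picking with a classical STA/LTA picker, can be sketched as below. This is a textbook short-term/long-term average ratio on signal energy, not the authors' code; the window lengths, thresholds and synthetic trace are hypothetical.

```python
def sta_lta(samples, n_sta, n_lta):
    """Ratio of short-term to long-term average energy, both ending at sample i.

    The ratio is left at 0.0 until the long window is completely filled.
    """
    if not 0 < n_sta < n_lta:
        raise ValueError("require 0 < n_sta < n_lta")
    prefix = [0.0]                      # running sum of squared amplitudes
    for s in samples:
        prefix.append(prefix[-1] + s * s)
    ratio = [0.0] * len(samples)
    for i in range(n_lta - 1, len(samples)):
        sta = (prefix[i + 1] - prefix[i + 1 - n_sta]) / n_sta
        lta = (prefix[i + 1] - prefix[i + 1 - n_lta]) / n_lta
        if lta > 0:
            ratio[i] = sta / lta
    return ratio


def pick_onsets(ratio, on_threshold, off_threshold):
    """Indices where the ratio first reaches on_threshold; the trigger is
    re-armed only after the ratio has dropped below off_threshold."""
    picks, triggered = [], False
    for i, r in enumerate(ratio):
        if not triggered and r >= on_threshold:
            picks.append(i)
            triggered = True
        elif triggered and r < off_threshold:
            triggered = False
    return picks


# Synthetic trace: unit-amplitude noise with a 20-sample burst at index 200.
noise = [1.0 if k % 2 == 0 else -1.0 for k in range(200)]
burst = [10.0 if k % 2 == 0 else -10.0 for k in range(20)]
trace = noise + burst + noise
onsets = pick_onsets(sta_lta(trace, n_sta=5, n_lta=50), 4.0, 1.5)
```

    The later steps (phase association, kurtosis refinement, magnitude calculation) would take these coarse picks as input and are not covered here.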

  13. Sequence Analysis and Comparative Study of the Protein Subunits of Archaeal RNase P

    Directory of Open Access Journals (Sweden)

    Manoj P. Samanta

    2016-04-01

    Full Text Available RNase P, a ribozyme-based ribonucleoprotein (RNP) complex that catalyzes tRNA 5′-maturation, is ubiquitous in all domains of life, but the evolution of its protein components (RNase P proteins, RPPs) is not well understood. Archaeal RPPs may provide clues on how the complex evolved from an ancient ribozyme to an RNP with multiple archaeal and eukaryotic (homologous) RPPs, which are unrelated to the single bacterial RPP. Here, we analyzed the sequence and structure of archaeal RPPs from over 600 available genomes. All five RPPs are found in eight archaeal phyla, suggesting that these RPPs arose early in archaeal evolutionary history. The putative ancestral genomic loci of archaeal RPPs include genes encoding several members of ribosome, exosome, and proteasome complexes, which may indicate coevolution/coordinate regulation of RNase P with other core cellular machineries. Despite being ancient, RPPs generally lack sequence conservation compared to other universal proteins. By analyzing the relative frequency of residues at every position in the context of the high-resolution structures of each of the RPPs (either alone or as functional binary complexes), we suggest residues for mutational analysis that may help uncover structure-function relationships in RPPs.
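
    The per-position residue frequency analysis mentioned in this abstract can be illustrated with a small sketch. It assumes a pre-computed multiple sequence alignment given as equal-length strings with '-' for gaps; the function names, the conservation threshold and the toy alignment are hypothetical.

```python
from collections import Counter


def column_frequencies(aligned_seqs):
    """Relative residue frequency at each alignment column, ignoring gaps."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs if seq[col] != "-")
        total = sum(counts.values())
        # An all-gap column yields an empty dictionary.
        profile.append({res: c / total for res, c in counts.items()} if total else {})
    return profile


def conserved_positions(profile, threshold=0.9):
    """Columns whose most common residue reaches the given frequency."""
    return [i for i, freqs in enumerate(profile)
            if freqs and max(freqs.values()) >= threshold]


# Toy alignment of four hypothetical protein fragments.
alignment = ["MKRL", "MKQL", "MR-L", "MKRL"]
profile = column_frequencies(alignment)
candidates = conserved_positions(profile)
```

    Highly conserved columns found this way would then be inspected against the available structures before choosing residues for mutagenesis.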

  14. Sequence analysis of β-esterase isoenzymes related to fertility ...

    African Journals Online (AJOL)

    Two polypeptides, whose molecular weights were 57.1 and 62 kD, were analyzed with Q-TOF mass spectrometry. We obtained three sequences of short peptides from the 57.1 kD polypeptide and two sequences of short peptides from the 62 kD polypeptide. The two short peptides sequences of 62 kD polypeptide were the ...

  15. Complete sequence determination of a novel reptile iridovirus isolated from soft-shelled turtle and evolutionary analysis of Iridoviridae

    Directory of Open Access Journals (Sweden)

    Wang Xiujie

    2009-05-01

    Full Text Available Abstract Background Soft-shelled turtle iridovirus (STIV) is the causative agent of severe systemic diseases in cultured soft-shelled turtles (Trionyx sinensis). To our knowledge, the only molecular information available on STIV mainly concerns the highly conserved STIV major capsid protein. The complete sequence of the STIV genome is not yet available. Therefore, determining the genome sequence of STIV and providing a detailed bioinformatic analysis of its genome content and evolution status will facilitate further understanding of the taxonomic elements of STIV and the molecular mechanisms of reptile iridovirus pathogenesis. Results We determined the complete nucleotide sequence of the STIV genome using 454 Life Science sequencing technology. The STIV genome is 105 890 bp in length with a base composition of 55.1% G+C. Computer assisted analysis revealed that the STIV genome contains 105 potential open reading frames (ORFs), which encode polypeptides ranging from 40 to 1,294 amino acids and 20 microRNA candidates. Among the putative proteins, 20 share homology with the ancestral proteins of the nuclear and cytoplasmic large DNA viruses (NCLDVs). Comparative genomic analysis showed that STIV has the highest degree of sequence conservation and a colinear arrangement of genes with frog virus 3 (FV3), followed by Tiger frog virus (TFV), Ambystoma tigrinum virus (ATV), Singapore grouper iridovirus (SGIV), Grouper iridovirus (GIV) and other iridovirus isolates. Phylogenetic analysis based on conserved core genes and complete genome sequence of STIV with other virus genomes was performed. Moreover, analysis of the gene gain-and-loss events in the family Iridoviridae suggested that the genes encoded by iridoviruses have evolved for favoring adaptation to different natural host species. Conclusion This study has provided the complete genome sequence of STIV. 
Phylogenetic analysis suggested that STIV and FV3 are strains of the same viral species belonging to the

  16. Functional annotation of proteomic sequences based on consensus of sequence and structural analysis.

    Science.gov (United States)

    Kitson, David H; Badretdinov, Azat; Zhu, Zhan-yang; Velikanov, Mikhail; Edwards, David J; Olszewski, Krzysztof; Szalma, Sándor; Yan, Lisa

    2002-03-01

    To maximise the assignment of function of the proteins encoded by a genome and to aid the search for novel drug targets, there is an emerging need for sensitive methods of predicting protein function on a genome-wide basis. GeneAtlas is an automated, high-throughput pipeline for the prediction of protein structure and function using sequence similarity detection, homology modelling and fold recognition methods. GeneAtlas is described in detail here. To test GeneAtlas, a 'virtual' genome was used, a subset of PDB structures from the SCOP database, in which the functional relationships are known. GeneAtlas detects additional relationships by building 3D models in comparison with the sequence searching method PSI-BLAST. Functionally related proteins with sequence identity below the twilight zone can be recognised correctly.

  17. Multilocus sequence analysis of Pasteurella multocida demonstrates a type species under development.

    Science.gov (United States)

    Bisgaard, Magne; Petersen, Andreas; Christensen, Henrik

    2013-03-01

    The aim of the present study was to use multilocus sequence typing (MLST) of a diverse collection of Pasteurella multocida with regard to animal source, place and date of collection, including all available serovars of Carter, Heddleston, Little & Lyon, Namioka, Cornelius and Roberts, to further investigate the evolution of this species with a focus on two lineages, A (P. multocida subsp. multocida and P. multocida subsp. gallicida) and B (P. multocida subsp. septica), previously reported. Isolates of P. multocida (n = 116) including reference strains of major serotyping systems were investigated by MLST based on partial sequences of the genes adk, est, gdh, mdh, pgi, pmi and zwf, and 67 sequence types (STs) were observed. Phylogenetic analysis of these concatenated sequences confirmed the separation of groups A (41 STs, 71 isolates) and B (22 STs, 38 isolates) out of the 67 STs. All Carter serovars, 12 Heddleston serovars, all three Little-Lyon types, six out of seven Namioka serovars, all five Roberts types and all four Cornelius serovars were allocated to the A group, while group B included the remaining four Heddleston serovars, 6, 7, 8 and 13, in addition to Namioka type 8 : A. The overrepresentation of reference strains of serotyping systems in the A group contrasts with the high number of isolates obtained from diseased birds in the B group, the effect of which should be addressed in future vaccine development. Isolates from birds (25) dominated the B group, which also included four isolates from Felidae, whereas group A included isolates from all types of hosts. The evolutionary implications of the lack of capsular type D, pig and bovine isolates in group B, as well as its association with Aves and Felidae that also applied to the whole Rural Industries Research and Development Corporation (RIRDC) MLST database, need further investigation. The combination of rpoB and 16S rRNA gene sequence comparison as well as the developed PCR test assigned
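    The grouping of isolates into sequence types can be illustrated with a minimal sketch. It assumes each isolate has already been reduced to one allele number per locus for the seven genes named above; the isolate names and allele numbers are invented, and ST numbers are simply assigned in order of first appearance rather than taken from the RIRDC MLST database.

```python
from typing import Dict

# The seven MLST loci listed in the abstract.
LOCI = ("adk", "est", "gdh", "mdh", "pgi", "pmi", "zwf")


def assign_sequence_types(profiles: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Map each isolate to a sequence type (ST).

    Isolates with identical allele numbers at all loci share one ST.
    ST numbers here are arbitrary (order of first appearance), which is
    an assumption of this sketch, not the scheme used by any database.
    """
    st_of_profile: Dict[tuple, int] = {}
    st_of_isolate: Dict[str, int] = {}
    for isolate, alleles in profiles.items():
        key = tuple(alleles[locus] for locus in LOCI)
        if key not in st_of_profile:
            st_of_profile[key] = len(st_of_profile) + 1
        st_of_isolate[isolate] = st_of_profile[key]
    return st_of_isolate


def example_profiles() -> Dict[str, Dict[str, int]]:
    """Hypothetical allele profiles for three made-up isolates."""
    base = dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 1)))
    variant = dict(base, zwf=2)  # differs from base at a single locus
    return {"isolate_a": base, "isolate_b": dict(base), "isolate_c": variant}
```

    With the hypothetical profiles above, `assign_sequence_types(example_profiles())` puts `isolate_a` and `isolate_b` in the same ST and `isolate_c` in a second one, because a single differing allele is enough to define a new ST.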

  18. Sequence and expression analysis of gaps in human chromosome 20

    DEFF Research Database (Denmark)

    Minocherhomji, Sheroy; Seemann, Stefan; Mang, Yuan

    2012-01-01

    The finished human genome-assemblies comprise several hundred un-sequenced euchromatic gaps, which may be rich in long polypurine/polypyrimidine stretches. Human chromosome 20 (chr 20) currently has three unfinished gaps remaining on its q-arm. All three gaps are within gene-dense regions and/or overlap disease-associated loci, including the DLGAP4 locus. In this study, we sequenced ~99% of all three unfinished gaps on human chr 20, determined their complete genomic sizes and assessed epigenetic profiles using a combination of Sanger sequencing, mate pair paired-end high-throughput sequencing...

  19. The Dendrobium catenatum Lindl. genome sequence provides insights into polysaccharide synthase, floral development and adaptive evolution.

    Science.gov (United States)

    Zhang, Guo-Qiang; Xu, Qing; Bian, Chao; Tsai, Wen-Chieh; Yeh, Chuan-Ming; Liu, Ke-Wei; Yoshida, Kouki; Zhang, Liang-Sheng; Chang, Song-Bin; Chen, Fei; Shi, Yu; Su, Yong-Yu; Zhang, Yong-Qiang; Chen, Li-Jun; Yin, Yayi; Lin, Min; Huang, Huixia; Deng, Hua; Wang, Zhi-Wen; Zhu, Shi-Lin; Zhao, Xiang; Deng, Cao; Niu, Shan-Ce; Huang, Jie; Wang, Meina; Liu, Guo-Hui; Yang, Hai-Jun; Xiao, Xin-Ju; Hsiao, Yu-Yun; Wu, Wan-Lin; Chen, You-Yi; Mitsuda, Nobutaka; Ohme-Takagi, Masaru; Luo, Yi-Bo; Van de Peer, Yves; Liu, Zhong-Jian

    2016-01-12

    Orchids make up about 10% of all seed plant species, have great economic value, and are of specific scientific interest because of their renowned flowers and ecological adaptations. Here, we report the first draft genome sequence of a lithophytic orchid, Dendrobium catenatum. We predict 28,910 protein-coding genes, and find evidence of a whole genome duplication shared with Phalaenopsis. We observed the expansion of many resistance-related genes, suggesting a powerful immune system responsible for adaptation to a wide range of ecological niches. We also discovered extensive duplication of genes involved in glucomannan synthase activities, likely related to the synthesis of medicinal polysaccharides. Expansion of MADS-box gene clades ANR1, StMADS11, and MIKC(*), involved in the regulation of development and growth, suggests that these expansions are associated with the astonishing diversity of plant architecture in the genus Dendrobium. In contrast, members of the type I MADS-box gene family are missing, which might explain the loss of the endospermous seed. The findings reported here will be important for future studies into polysaccharide synthesis, adaptations to diverse environments and flower architecture of Orchidaceae.

  20. Phylogenetic reconstruction of Bantu kinship challenges Main Sequence Theory of human social evolution.

    Science.gov (United States)

    Opie, Christopher; Shultz, Susanne; Atkinson, Quentin D; Currie, Thomas; Mace, Ruth

    2014-12-09

    Kinship provides the fundamental structure of human society: descent determines the inheritance pattern between generations, whereas residence rules govern the location a couple moves to after they marry. In turn, descent and residence patterns determine other key relationships such as alliance, trade, and marriage partners. Hunter-gatherer kinship patterns are viewed as flexible, whereas agricultural societies are thought to have developed much more stable kinship patterns as they expanded during the Holocene. Among the Bantu farmers of sub-Saharan Africa, the ancestral kinship patterns present at the beginning of the expansion are hotly contested, with some arguing for matrilineal and matrilocal patterns, whereas others maintain that any kind of lineality or sex-biased dispersal only emerged much later. Here, we use Bayesian phylogenetic methods to uncover the history of Bantu kinship patterns and trace the interplay between descent and residence systems. The results suggest a number of switches in both descent and residence patterns as Bantu farming spread, but that the first Bantu populations were patrilocal with patrilineal descent. Across the phylogeny, a change in descent triggered a switch away from patrifocal kinship, whereas a change in residence triggered a switch back from matrifocal kinship. These results challenge "Main Sequence Theory," which maintains that changes in residence rules precede change in other social structures. We also indicate the trajectory of kinship change, shedding new light on how this fundamental structure of society developed as farming spread across the globe during the Neolithic.

  1. Whole genome sequence analysis of unidentified genetically modified papaya for development of a specific detection method.

    Science.gov (United States)

    Nakamura, Kosuke; Kondo, Kazunari; Akiyama, Hiroshi; Ishigaki, Takumi; Noguchi, Akio; Katsumata, Hiroshi; Takasaki, Kazuto; Futo, Satoshi; Sakata, Kozue; Fukuda, Nozomi; Mano, Junichi; Kitta, Kazumi; Tanaka, Hidenori; Akashi, Ryo; Nishimaki-Mogami, Tomoko

    2016-08-15

    Identification of transgenic sequences in an unknown genetically modified (GM) papaya (Carica papaya L.) by whole genome sequence analysis was demonstrated. Whole genome sequence data were generated for a GM-positive fresh papaya fruit commodity detected in monitoring using real-time polymerase chain reaction (PCR). The sequences obtained were mapped against an open database for papaya genome sequence. Transgenic construct- and event-specific sequences were identified as a GM papaya developed to resist infection from a Papaya ringspot virus. Based on the transgenic sequences, a specific real-time PCR detection method for GM papaya applicable to various food commodities was developed. Whole genome sequence analysis enabled identifying unknown transgenic construct- and event-specific sequences in GM papaya and development of a reliable method for detecting them in papaya food commodities. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. A global analysis of adaptive evolution of operons in cyanobacteria.

    Science.gov (United States)

    Memon, Danish; Singh, Abhay K; Pakrasi, Himadri B; Wangikar, Pramod P

    2013-02-01

    Operons are an important feature of prokaryotic genomes. Evolution of operons is hypothesized to be adaptive and has contributed significantly towards coordinated optimization of functions. Two conflicting theories, based on (i) in situ formation to achieve co-regulation and (ii) horizontal gene transfer of functionally linked gene clusters, are generally considered to explain why and how operons have evolved. Furthermore, effects of operon evolution on genomic traits such as intergenic spacing, operon size and co-regulation are relatively less explored. Based on the conservation level in a set of diverse prokaryotes, we categorize the operonic gene pair associations and in turn the operons as ancient and recently formed. This allowed us to perform a detailed analysis of operonic structure in cyanobacteria, a morphologically and physiologically diverse group of photoautotrophs. Clustering based on operon conservation showed significant similarity with the 16S rRNA-based phylogeny, which groups the cyanobacterial strains into three clades. Clade C, dominated by strains that are believed to have undergone genome reduction, shows a larger fraction of operonic genes that are tightly packed in larger sized operons. Ancient operons are in general larger, more tightly packed, better optimized for co-regulation and part of key cellular processes. A sub-clade within Clade B, which includes Synechocystis sp. PCC 6803, shows a reverse trend in intergenic spacing. Our results suggest that while in situ formation and vertical descent may be a dominant mechanism of operon evolution in cyanobacteria, optimization of intergenic spacing and co-regulation are part of an ongoing process in the life-cycle of operons.
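    Intergenic spacing, one of the genomic traits discussed above, can be made concrete with a small sketch. Note the assumptions: the study classifies operons by conservation across genomes, whereas the grouping below is a generic distance-based heuristic used only for illustration. The `Gene` type, the `max_gap` threshold of 50 bp and the coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def intergenic_spacing(upstream: Gene, downstream: Gene) -> int:
    """Bases between two genes; negative when they overlap."""
    return downstream.start - upstream.end - 1


def group_by_spacing(genes: List[Gene], max_gap: int = 50) -> List[List[Gene]]:
    """Group adjacent same-strand genes whose spacing is <= max_gap.

    A simple distance heuristic for putative operons (an assumption of
    this sketch, not the conservation-based method of the study).
    """
    groups: List[List[Gene]] = []
    for gene in sorted(genes, key=lambda g: g.start):
        if groups:
            prev = groups[-1][-1]
            same_strand = prev.strand == gene.strand
            if same_strand and intergenic_spacing(prev, gene) <= max_gap:
                groups[-1].append(gene)
                continue
        groups.append([gene])
    return groups
```

    Tightly packed operons in the sense of the abstract would show small values of `intergenic_spacing` between consecutive members; the threshold only controls how this toy grouping splits a gene list.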

  3. [Analysis on property of meridian supramolecules by biological evolution path].

    Science.gov (United States)

    Deng, Kaiwen; Tao, Yeqin; Tang, Wenhan; He, Fuyuan; Liu, Wenlong; Shi, Jilian; Yang, Yantao; Zhou, Yiqun; Chang, Xiaorong

    2017-03-12

    Placing humans within nature as a whole and following the path of biological evolution, the properties of the channel structure of the "imprinting template" in meridians and zang-fu were explored using supramolecular chemistry. Over the history of biological evolution, each molecule in the "molecule society" gradually developed into various highly ordered supramolecular bodies through self-identification, self-assembly, self-organization and self-replication of the "imprinting template"; the original biochemical system was thereby established and finally evolved into humans. During the formation of supramolecular bodies, the channel structure of the "imprinting template" in guest supramolecular bodies was retained by host supramolecular bodies and communicated with the outside to exchange materials, energy and information; otherwise life could not continue. This is the chemical basis on which biological supramolecular bodies allowed the body to develop meridians. Humans are therefore gigantic and complicated supramolecular bodies in the biological world, possessing the supramolecular "imprinting template" of each stage of evolution, from which the meridians were formed. When meridians converged, acupoints appeared; when acupoints converged, zang-fu appeared. Driven by the blood from the heart and according to the "imprinting template", the guest supramolecular bodies and the host meridian produced qi-analysis, which is the qi-phenomenon of the guest in the meridian. It presents as the zang-fu image of physiology and pathology, as well as the action regularities of medication and acupuncture tolerance, by which the various current viewpoints on meridians can be explained and the hypothesis of meridian supramolecular bodies proposed. The meridian and its phenomena are determined by the "imprinting template" of supramolecular bodies and their self-reaction regularities, which persist throughout living nature. This is the substance of meridian biology.

  4. Evolution of a Pathogen: A Comparative Genomics Analysis Identifies a Genetic Pathway to Pathogenesis in Acinetobacter

    Science.gov (United States)

    Sahl, Jason W.; Gillece, John D.; Schupp, James M.; Waddell, Victor G.; Driebe, Elizabeth M.; Engelthaler, David M.; Keim, Paul

    2013-01-01

    Acinetobacter baumannii is an emergent and global nosocomial pathogen. In addition to A. baumannii, other Acinetobacter species, especially those in the Acinetobacter calcoaceticus-baumannii (Acb) complex, have also been associated with serious human infection. Although mechanisms of attachment, persistence on abiotic surfaces, and pathogenesis in A. baumannii have been identified, the genetic mechanisms that explain the emergence of A. baumannii as the most widespread and virulent Acinetobacter species are not fully understood. Recent whole genome sequencing has provided insight into the phylogenetic structure of the genus Acinetobacter. However, a global comparison of genomic features between Acinetobacter spp. has not been described in the literature. In this study, 136 Acinetobacter genomes, including 67 sequenced in this study, were compared to identify the acquisition and loss of genes in the expansion of the Acinetobacter genus. A whole genome phylogeny confirmed that A. baumannii is a monophyletic clade and that the larger Acb complex is also a well-supported monophyletic group. The whole genome phylogeny provided the framework for a global genomic comparison based on a blast score ratio (BSR) analysis. The BSR analysis demonstrated that specific genes have been both lost and acquired in the evolution of A. baumannii. In addition, several genes associated with A. baumannii pathogenesis were found to be more conserved in the Acb complex, and especially in A. baumannii, than in other Acinetobacter genomes; until recently, a global analysis of the distribution and conservation of virulence factors across the genus was not possible. The results demonstrate that the acquisition of specific virulence factors has likely contributed to the widespread persistence and virulence of A. baumannii. The identification of novel features associated with transcriptional regulation and acquired by clades in the Acb complex presents targets for better understanding the
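    As a rough illustration of the blast score ratio (BSR) idea mentioned above: for each gene, the bit score of its best alignment in a target genome is divided by the bit score of the gene aligned against itself, giving a value between 0 (absent) and 1 (identical). The sketch below assumes the bit scores have already been obtained from a BLAST run; the function names, gene names, genome names and scores are hypothetical and are not taken from the study.

```python
from typing import Dict


def blast_score_ratio(query_bits: float, self_bits: float) -> float:
    """BSR = bit score in the target genome / self-alignment bit score."""
    if self_bits <= 0:
        raise ValueError("self-alignment bit score must be positive")
    return query_bits / self_bits


def bsr_table(
    self_scores: Dict[str, float],
    genome_scores: Dict[str, Dict[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Build a gene x genome table of BSR values.

    self_scores:   gene -> bit score of the gene aligned to itself
    genome_scores: genome -> {gene -> best bit score in that genome}
    A gene with no hit in a genome gets a BSR of 0.0.
    """
    table: Dict[str, Dict[str, float]] = {}
    for gene, self_bits in self_scores.items():
        table[gene] = {
            genome: blast_score_ratio(hits.get(gene, 0.0), self_bits)
            for genome, hits in genome_scores.items()
        }
    return table
```

    Genes whose BSR is close to 1 across one clade and close to 0 elsewhere are the kind of candidates a gain-and-loss comparison would flag; any cut-off values for "conserved" or "absent" would be a further analysis choice and are deliberately left out of this sketch.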

  5. Evolution of a pathogen: a comparative genomics analysis identifies a genetic pathway to pathogenesis in Acinetobacter.

    Directory of Open Access Journals (Sweden)

    Jason W Sahl

    Acinetobacter baumannii is an emergent and global nosocomial pathogen. In addition to A. baumannii, other Acinetobacter species, especially those in the Acinetobacter calcoaceticus-baumannii (Acb) complex, have also been associated with serious human infection. Although mechanisms of attachment, persistence on abiotic surfaces, and pathogenesis in A. baumannii have been identified, the genetic mechanisms that explain the emergence of A. baumannii as the most widespread and virulent Acinetobacter species are not fully understood. Recent whole genome sequencing has provided insight into the phylogenetic structure of the genus Acinetobacter. However, a global comparison of genomic features between Acinetobacter spp. has not been described in the literature. In this study, 136 Acinetobacter genomes, including 67 sequenced in this study, were compared to identify the acquisition and loss of genes in the expansion of the Acinetobacter genus. A whole genome phylogeny confirmed that A. baumannii is a monophyletic clade and that the larger Acb complex is also a well-supported monophyletic group. The whole genome phylogeny provided the framework for a global genomic comparison based on a blast score ratio (BSR) analysis. The BSR analysis demonstrated that specific genes have been both lost and acquired in the evolution of A. baumannii. In addition, several genes associated with A. baumannii pathogenesis were found to be more conserved in the Acb complex, and especially in A. baumannii, than in other Acinetobacter genomes; until recently, a global analysis of the distribution and conservation of virulence factors across the genus was not possible. The results demonstrate that the acquisition of specific virulence factors has likely contributed to the widespread persistence and virulence of A. baumannii. The identification of novel features associated with transcriptional regulation and acquired by clades in the Acb complex presents targets for better

  6. Structural analysis of DNA sequence: evidence for lateral gene transfer in Thermotoga maritima

    DEFF Research Database (Denmark)

    Worning, Peder; Jensen, Lars Juhl; Nelson, K. E.

    2000-01-01

    The recently published complete DNA sequence of the bacterium Thermotoga maritima provides evidence, based on protein sequence conservation, for lateral gene transfer between Archaea and Bacteria. We introduce a new method of periodicity analysis of DNA sequences, based on structural parameters, which brings independent evidence for the lateral gene transfer in the genome of T. maritima. The structural analysis relates the Archaea-like DNA sequences to the genome of Pyrococcus horikoshii. Analysis of 24 complete genomic DNA sequences shows different periodicity patterns for organisms...

  7. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT), the bionic wavelet transform (BWT), for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA