WorldWideScience

Sample records for fragment sequence assembly

  1. Sequence assembly

    DEFF Research Database (Denmark)

    Scheibye-Alsing, Karsten; Hoffmann, S.; Frankel, Annett Maria

    2009-01-01

    Despite the rapidly increasing number of sequenced and re-sequenced genomes, many issues regarding the computational assembly of large-scale sequencing data remain unresolved. Computational assembly is crucial in large genome projects as well as for the evolving high-throughput technologies and...... in genomic DNA, highly expressed genes and alternative transcripts in EST sequences. We summarize existing comparisons of different assemblers and provide detailed descriptions and directions for download of assembly programs at: http://genome.ku.dk/resources/assembly/methods.html....

  2. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
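
    The two error rates quoted in this abstract (about 1 error per 2,000 bp for a draft and about 1 per 10,000 bp for a finished genome) can be turned into expected error counts with simple arithmetic. The sketch below is illustrative only: the 5 Mbp genome length is a hypothetical value, not taken from the record, and the Phred-style conversion Q = -10 log10(p) is a standard convention added here for comparison.

```python
import math

def expected_errors(genome_length_bp, bp_per_error):
    """Expected number of consensus errors for a given error rate (1 error per bp_per_error bases)."""
    return genome_length_bp / bp_per_error

def phred_quality(bp_per_error):
    """Phred-scaled quality for an error probability of 1 / bp_per_error."""
    return 10.0 * math.log10(bp_per_error)

# Rates quoted in the abstract.
DRAFT_BP_PER_ERROR = 2_000
FINISHED_BP_PER_ERROR = 10_000

# Hypothetical genome size, chosen only for illustration.
GENOME_LENGTH_BP = 5_000_000

draft_errors = expected_errors(GENOME_LENGTH_BP, DRAFT_BP_PER_ERROR)
finished_errors = expected_errors(GENOME_LENGTH_BP, FINISHED_BP_PER_ERROR)

print(f"draft:    {draft_errors:.0f} expected errors, Q{phred_quality(DRAFT_BP_PER_ERROR):.1f}")
print(f"finished: {finished_errors:.0f} expected errors, Q{phred_quality(FINISHED_BP_PER_ERROR):.1f}")
```

    Under these assumptions, finishing reduces the expected number of consensus errors five-fold for any genome length, since only the ratio of the two rates matters.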

  3. The dual role of fragments in fragment-assembly methods for de novo protein structure prediction

    Science.gov (United States)

    Handl, Julia; Knowles, Joshua; Vernon, Robert; Baker, David; Lovell, Simon C.

    2013-01-01

    In fragment-assembly techniques for protein structure prediction, models of protein structure are assembled from fragments of known protein structures. This process is typically guided by a knowledge-based energy function and uses a heuristic optimization method. The fragments play two important roles in this process: they define the set of structural parameters available, and they also assume the role of the main variation operators that are used by the optimiser. Previous analysis has typically focused on the first of these roles. In particular, the relationship between local amino acid sequence and local protein structure has been studied by a range of authors. The correlation between the two has been shown to vary with the window length considered, and the results of these analyses have informed directly the choice of fragment length in state-of-the-art prediction techniques. Here, we focus on the second role of fragments and aim to determine the effect of fragment length from an optimization perspective. We use theoretical analyses to reveal how the size and structure of the search space changes as a function of insertion length. Furthermore, empirical analyses are used to explore additional ways in which the size of the fragment insertion influences the search both in a simulation model and for the fragment-assembly technique, Rosetta. PMID:22095594

  4. An improved algorithm for MFR fragment assembly

    International Nuclear Information System (INIS)

    Kontaxis, Georg

    2012-01-01

    A method for generating protein backbone models from backbone-only NMR data is presented, which is based on molecular fragment replacement (MFR). In a first step, the PDB database is mined for homologous peptide fragments using experimental backbone-only data, i.e. backbone chemical shifts (CS) and residual dipolar couplings (RDC). Second, this fragment library is refined against the experimental restraints. Finally, the fragments are assembled into a protein backbone fold with a rigid-body docking algorithm that uses the RDCs as restraints. For improved performance, backbone nuclear Overhauser effects (NOEs) may be included at that stage. Compared to previous implementations of MFR-derived structure determination protocols, this model-building algorithm offers improved stability and reliability. Furthermore, relative to CS-ROSETTA based methods, it provides faster performance and straightforward implementation with the option to easily include further types of restraints and additional energy terms.

  5. DNA fragments assembly based on nicking enzyme system.

    Directory of Open Access Journals (Sweden)

    Rui-Yan Wang

    A couple of DNA ligation-independent cloning (LIC) methods have been reported to meet various requirements in metabolic engineering and synthetic biology. The principle of LIC is the assembly of multiple overlapping DNA fragments by annealing of single-stranded (ss) DNA overlaps. Here we present a method, termed NE-LIC, that generates single-stranded DNA overlaps for LIC using nicking endonucleases (NEases). Factors related to cloning efficiency were optimized in this study. NE-LIC allows generating 3'-end or 5'-end ss DNA overlaps of various lengths for fragment assembly. We demonstrated that the 10 bp/15 bp overlaps had the highest DNA fragment assembly efficiency, while 5 bp/10 bp overlaps showed the highest efficiency when T4 DNA ligase was added. Its advantage over Sequence and Ligation Independent Cloning (SLIC) and Uracil-Specific Excision Reagent (USER) was obvious. The mechanism can be applied to many other LIC strategies. Finally, NE-LIC was successfully applied to assemble a pathway of six gene fragments responsible for synthesizing microbial poly-3-hydroxybutyrate (PHB).

  6. Mind the gap; seven reasons to close fragmented genome assemblies.

    Science.gov (United States)

    Thomma, Bart P H J; Seidl, Michael F; Shi-Kunne, Xiaoqian; Cook, David E; Bolton, Melvin D; van Kan, Jan A L; Faino, Luigi

    2016-05-01

    Like other domains of life, research into the biology of filamentous microbes has greatly benefited from the advent of whole-genome sequencing. Next-generation sequencing (NGS) technologies have revolutionized sequencing, making genomic sciences accessible to many academic laboratories including those that study non-model organisms. Thus, hundreds of fungal genomes have been sequenced and are publicly available today, although these initiatives have typically yielded considerably fragmented genome assemblies that often lack large contiguous genomic regions. Many important genomic features are contained in intergenic DNA that is often missing in current genome assemblies, and recent studies underscore the significance of non-coding regions and repetitive elements for the life style, adaptability and evolution of many organisms. The study of particular types of genetic elements, such as telomeres, centromeres, repetitive elements, effectors, and clusters of co-regulated genes, but also of phenomena such as structural rearrangements, genome compartmentalization and epigenetics, greatly benefits from having a contiguous and high-quality, preferably even complete and gapless, genome assembly. Here we discuss a number of important reasons to produce gapless, finished, genome assemblies to help answer important biological questions. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Theseus Assembly Sequence #2

    Science.gov (United States)

    1996-01-01

    Crew members are seen here assembling the tail of the Theseus prototype research aircraft at NASA's Dryden Flight Research Center, Edwards, California, in May of 1996. The Theseus aircraft, built and operated by Aurora Flight Sciences Corporation, Manassas, Virginia, was a unique aircraft flown at NASA's Dryden Flight Research Center, Edwards, California, under a cooperative agreement between NASA and Aurora. Dryden hosted the Theseus program, providing hangar space and range safety for flight testing. Aurora Flight Sciences was responsible for the actual flight testing, vehicle flight safety, and operation of the aircraft. The Theseus remotely piloted aircraft flew its maiden flight on May 24, 1996, at Dryden. During its sixth flight on November 12, 1996, Theseus experienced an in-flight structural failure that resulted in the loss of the aircraft. As of the beginning of the year 2000, Aurora had not rebuilt the aircraft. Theseus was built for NASA under an innovative, $4.9 million fixed-price contract by Aurora Flight Sciences Corporation and its partners, West Virginia University, Morgantown, West Virginia, and Fairmont State College, Fairmont, West Virginia. The twin-engine, unpiloted vehicle had a 140-foot wingspan, and was constructed largely of composite materials. Powered by two 80-horsepower, turbocharged piston engines that drove twin 9-foot-diameter propellers, Theseus was designed to fly autonomously at high altitudes, with takeoff and landing under the active control of a ground-based pilot in a ground control station 'cockpit.' With the potential ability to carry 700 pounds of science instruments to altitudes above 60,000 feet for durations of greater than 24 hours, Theseus was intended to support research in areas such as stratospheric ozone depletion and the atmospheric effects of future high-speed civil transport aircraft engines. Instruments carried aboard Theseus also would be able to validate satellite-based global environmental change

  8. Theseus Assembly Sequence #3

    Science.gov (United States)

    1996-01-01

    The Theseus prototype research aircraft being assembled at NASA's Dryden Flight Research Center, Edwards, California, in May of 1996. The Theseus aircraft, built and operated by Aurora Flight Sciences Corporation, Manassas, Virginia, was a unique aircraft flown at NASA's Dryden Flight Research Center, Edwards, California, under a cooperative agreement between NASA and Aurora. Dryden hosted the Theseus program, providing hangar space and range safety for flight testing. Aurora Flight Sciences was responsible for the actual flight testing, vehicle flight safety, and operation of the aircraft. The Theseus remotely piloted aircraft flew its maiden flight on May 24, 1996, at Dryden. During its sixth flight on November 12, 1996, Theseus experienced an in-flight structural failure that resulted in the loss of the aircraft. As of the beginning of the year 2000, Aurora had not rebuilt the aircraft. Theseus was built for NASA under an innovative, $4.9 million fixed-price contract by Aurora Flight Sciences Corporation and its partners, West Virginia University, Morgantown, West Virginia, and Fairmont State College, Fairmont, West Virginia. The twin-engine, unpiloted vehicle had a 140-foot wingspan, and was constructed largely of composite materials. Powered by two 80-horsepower, turbocharged piston engines that drove twin 9-foot-diameter propellers, Theseus was designed to fly autonomously at high altitudes, with takeoff and landing under the active control of a ground-based pilot in a ground control station 'cockpit.' With the potential ability to carry 700 pounds of science instruments to altitudes above 60,000 feet for durations of greater than 24 hours, Theseus was intended to support research in areas such as stratospheric ozone depletion and the atmospheric effects of future high-speed civil transport aircraft engines. Instruments carried aboard Theseus also would be able to validate satellite-based global environmental change measurements. 
Dryden's Project Manager was

  9. Theseus Assembly Sequence #1

    Science.gov (United States)

    1996-01-01

    The Theseus prototype research aircraft being assembled at NASA's Dryden Flight Research Center, Edwards, California, in May of 1996. The Theseus aircraft, built and operated by Aurora Flight Sciences Corporation, Manassas, Virginia, was a unique aircraft flown at NASA's Dryden Flight Research Center, Edwards, California, under a cooperative agreement between NASA and Aurora. Dryden hosted the Theseus program, providing hangar space and range safety for flight testing. Aurora Flight Sciences was responsible for the actual flight testing, vehicle flight safety, and operation of the aircraft. The Theseus remotely piloted aircraft flew its maiden flight on May 24, 1996, at Dryden. During its sixth flight on November 12, 1996, Theseus experienced an in-flight structural failure that resulted in the loss of the aircraft. As of the beginning of the year 2000, Aurora had not rebuilt the aircraft. Theseus was built for NASA under an innovative, $4.9 million fixed-price contract by Aurora Flight Sciences Corporation and its partners, West Virginia University, Morgantown, West Virginia, and Fairmont State College, Fairmont, West Virginia. The twin-engine, unpiloted vehicle had a 140-foot wingspan, and was constructed largely of composite materials. Powered by two 80-horsepower, turbocharged piston engines that drove twin 9-foot-diameter propellers, Theseus was designed to fly autonomously at high altitudes, with takeoff and landing under the active control of a ground-based pilot in a ground control station 'cockpit.' With the potential ability to carry 700 pounds of science instruments to altitudes above 60,000 feet for durations of greater than 24 hours, Theseus was intended to support research in areas such as stratospheric ozone depletion and the atmospheric effects of future high-speed civil transport aircraft engines. Instruments carried aboard Theseus also would be able to validate satellite-based global environmental change measurements. 
Dryden's Project Manager was

  10. An Efficient Genome Fragment Assembling Using GA with Neighborhood Aware Fitness Function

    Directory of Open Access Journals (Sweden)

    Satoko Kikuchi

    2012-01-01

    To decode a long genome sequence, shotgun sequencing is the state-of-the-art technique. It needs to properly sequence a very large number, sometimes as large as millions, of short partially readable strings (fragments). Arranging those fragments in the correct order is known as fragment assembly, which is an NP-hard problem. Presently used methods require enormous computational cost. In this work, we show how our modified genetic algorithm (GA) can solve this problem efficiently. In the proposed GA, the length of the chromosome, which represents the volume of the search space, is reduced with advancing generations, thereby improving search efficiency. We also introduce a greedy mutation, which swaps nearby fragments using some heuristics, to improve the fitness of chromosomes. We compare our results with Parsons' algorithm, which is also based on a GA. We used fragments with partial reads on both sides, mimicking fragments in the real genome assembly process, whereas in Parsons' work the base-pair array of the whole fragment is known. Even so, we obtained much better results, and we succeeded in restructuring contigs covering 100% of the genome sequences.
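
    To make the idea of a fitness function over fragment orderings concrete, the sketch below scores an ordering by the summed suffix-prefix overlap of adjacent fragments and applies a simple greedy mutation that swaps neighbouring fragments when the swap improves the score. This is a generic illustration, not the authors' neighborhood-aware fitness function or their mutation heuristic; all function names, the `min_overlap` parameter and the toy fragments are hypothetical.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right` (0 if shorter than min_overlap)."""
    for length in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:length]):
            return length
    return 0

def ordering_fitness(order, fragments):
    """Sum of overlaps between fragments that are adjacent in `order` (higher is better)."""
    return sum(
        overlap_length(fragments[a], fragments[b])
        for a, b in zip(order, order[1:])
    )

def greedy_swap_mutation(order, fragments):
    """Try swapping each pair of neighbours in turn; keep a swap only if it raises the fitness."""
    best = list(order)
    best_score = ordering_fitness(best, fragments)
    for i in range(len(best) - 1):
        candidate = list(best)
        candidate[i], candidate[i + 1] = candidate[i + 1], candidate[i]
        score = ordering_fitness(candidate, fragments)
        if score > best_score:
            best, best_score = candidate, score
    return best

# Toy fragments cut from the sequence ACGTTGCAAGGCTA (illustrative data only).
FRAGMENTS = ["ACGTTGC", "TGCAAGG", "AGGCTA"]

shuffled = [1, 0, 2]
repaired = greedy_swap_mutation(shuffled, FRAGMENTS)
print(repaired, ordering_fitness(repaired, FRAGMENTS))
```

    In a full GA this mutation would be one operator among several (selection, crossover, random mutation); it is shown in isolation here because it is the step the abstract describes most specifically.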

  11. Habitat Fragmentation Drives Plant Community Assembly Processes across Life Stages

    Science.gov (United States)

    Hu, Guang; Feeley, Kenneth J.; Yu, Mingjian

    2016-01-01

    Habitat fragmentation is one of the principal causes of biodiversity loss and hence understanding its impacts on community assembly and disassembly is an important topic in ecology. We studied the relationships between fragmentation and community assembly processes in the land-bridge island system of Thousand Island Lake in East China. We focused on the changes in species diversity and phylogenetic diversity that occurred between life stages of woody plants growing on these islands. The observed diversities were compared with the expected diversities from random null models to characterize assembly processes. Regression tree analysis was used to illustrate the relationships between island attributes and community assembly processes. We found that different assembly processes predominate in the seedlings-to-saplings life-stage transition (SS) vs. the saplings-to-trees transition (ST). Island area was the main attribute driving the assembly process in SS. In ST, island isolation was more important. Within a fragmented landscape, the factors driving community assembly processes were found to differ between life stage transitions. Environmental filtering had a strong effect on the seedlings-to-saplings life-stage transition. Habitat isolation and dispersal limitation influenced all plant life stages, but had a weaker effect on communities than area. These findings add to our understanding of the processes driving community assembly and species coexistence in the context of pervasive and widespread habitat loss and fragmentation. PMID:27427960

  12. Targeted assembly of short sequence reads.

    Directory of Open Access Journals (Sweden)

    René L Warren

    As next-generation sequence (NGS) production continues to increase, analysis is becoming a significant bottleneck. However, in situations where information is required only for specific sequence variants, it is not necessary to assemble or align whole genome data sets in their entirety. Rather, NGS data sets can be mined for the presence of sequence variants of interest by localized assembly, which is a faster, easier, and more accurate approach. We present TASR, a streamlined assembler that interrogates very large NGS data sets for the presence of specific variants by only considering reads within the sequence space of input target sequences provided by the user. The NGS data set is searched for reads with an exact match to all possible short words within the target sequence, and these reads are then assembled stringently to generate a consensus of the target and flanking sequence. Typically, variants of a particular locus are provided as different target sequences, and the presence of the variant in the data set being interrogated is revealed by a successful assembly outcome. However, TASR can also be used to find unknown sequences that flank a given target. We demonstrate that TASR has utility in finding or confirming genomic mutations, polymorphisms, fusions and integration events. Targeted assembly is a powerful method for interrogating large data sets for the presence of sequence variants of interest. TASR is a fast, flexible and easy to use tool for targeted assembly.
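
    The read-recruitment step described above (keeping only reads that share an exact short word with the target) can be sketched as follows. This is not TASR's code or command-line interface: the function names, the word length `k`, the choice to also check the reverse complement of each read, and the toy sequences are all assumptions made for illustration.

```python
def kmers(sequence, k):
    """Set of all words of length k in `sequence`."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return sequence.translate(_COMPLEMENT)[::-1]

def recruit_reads(reads, target, k):
    """Keep reads that share at least one exact k-mer with the target, on either strand."""
    target_words = kmers(target, k)
    recruited = []
    for read in reads:
        if kmers(read, k) & target_words or kmers(reverse_complement(read), k) & target_words:
            recruited.append(read)
    return recruited

# Toy data (hypothetical): one forward-strand hit, one reverse-strand hit, one unrelated read.
TARGET = "GATTACAGGCTT"
READS = ["CCGATTACAG", "TTAAGCCTGT", "CCCCCTTTTT"]

print(recruit_reads(READS, TARGET, k=5))
```

    Only the recruited reads would then be passed to a stringent assembly step, which is what keeps the approach fast on very large data sets: the bulk of the reads are discarded after a set lookup.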

  13. Tablet--next generation sequence assembly visualization

    Science.gov (United States)

    Milne, Iain; Bayer, Micha; Cardle, Linda; Shaw, Paul; Stephen, Gordon; Wright, Frank; Marshall, David

    2010-01-01

    Summary: Tablet is a lightweight, high-performance graphical viewer for next-generation sequence assemblies and alignments. Supporting a range of input assembly formats, Tablet provides high-quality visualizations showing data in packed or stacked views, allowing instant access and navigation to any region of interest, and whole contig overviews and data summaries. Tablet is both multi-core aware and memory efficient, allowing it to handle assemblies containing millions of reads, even on a 32-bit desktop machine. Availability: Tablet is freely available for Microsoft Windows, Apple Mac OS X, Linux and Solaris. Fully bundled installers can be downloaded from http://bioinf.scri.ac.uk/tablet in 32- and 64-bit versions. Contact: tablet@scri.ac.uk PMID:19965881

  14. Syntactic sequencing in Hebbian cell assemblies.

    Science.gov (United States)

    Wennekers, Thomas; Palm, Günther

    2009-12-01

    Hebbian cell assemblies provide a theoretical framework for the modeling of cognitive processes that grounds them in the underlying physiological neural circuits. Recently we have presented an extension of cell assemblies by operational components that allows the modeling of aspects of language, rules, and complex behaviour. In the present work we study the generation of syntactic sequences using operational cell assemblies timed by unspecific trigger signals. Syntactic patterns are implemented in terms of hetero-associative transition graphs in attractor networks which cause a directed flow of activity through the neural state space. We provide regimes for parameters that enable an unspecific excitatory control signal to switch reliably between attractors in accordance with the implemented syntactic rules. If several target attractors are possible in a given state, noise in the system in conjunction with a winner-takes-all mechanism can randomly choose a target. Disambiguation can also be guided by context signals or specific additional external signals. Given a permanently elevated level of external excitation the model can enter an autonomous mode, where it generates temporal grammatical patterns continuously.

  15. [Reconstruction of long polynucleotide sequences from fragments using the Iskra-226 personal computer].

    Science.gov (United States)

    Kostetskiĭ, P V; Dobrova, I E

    1988-04-01

    An algorithm for reconstructing long DNA sequences, i.e. arranging all overlapping gel readings into contigs, and the corresponding BASIC programme for the personal computer "Iskra-226" (USSR) are described. Contig construction begins with a search for all fragments overlapping the basic (longest) one, followed by determination of the coordinates of the 5' ends of the overlapping fragments. Then the gel reading with the minimal 5' end coordinate and the gel reading with the maximal 3' end coordinate are selected and used as the basic ones in the next assembly steps. The procedure finishes when no gel reading overlapping the basic one can be found. All gel readings that have entered the contig are ignored in subsequent assembly steps. Finally, one or several contigs consisting of DNA fragments are obtained. The effectiveness of the algorithm was tested on a model based on the multiple assembly of the nucleotide sequence encoding the Na,K-ATPase alpha-subunit of pig kidney. The programme requires no user participation and can assemble contigs up to 10,000 nucleotides long.
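
    The seed-and-extend procedure outlined in this abstract can be approximated in a few lines. The sketch below is a simplified stand-in, not a port of the original BASIC programme: it seeds each contig with the longest unused reading, extends it to the left or right with any reading that has an exact suffix-prefix overlap, and starts a new contig when no overlapping reading remains. It does not track 5'/3' coordinates explicitly, handles one strand only, and tolerates no sequencing errors; the function names, `min_overlap` and the toy readings are hypothetical.

```python
def suffix_prefix_overlap(left, right, min_overlap):
    """Length of the longest suffix of `left` equal to a prefix of `right` (0 if below min_overlap)."""
    for length in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:length]):
            return length
    return 0

def build_contig(reads, min_overlap):
    """Grow one contig from the longest read; return the contig and the reads left unused."""
    remaining = sorted(reads, key=len, reverse=True)
    contig = remaining.pop(0)  # the "basic" (longest) reading seeds the contig
    extended = True
    while extended:
        extended = False
        for read in list(remaining):
            if read in contig:  # reading fully contained in the contig
                remaining.remove(read)
                extended = True
                continue
            right = suffix_prefix_overlap(contig, read, min_overlap)
            if right:  # reading extends the contig towards the 3' end
                contig += read[right:]
                remaining.remove(read)
                extended = True
                continue
            left = suffix_prefix_overlap(read, contig, min_overlap)
            if left:  # reading extends the contig towards the 5' end
                contig = read[:len(read) - left] + contig
                remaining.remove(read)
                extended = True
    return contig, remaining

def assemble_contigs(reads, min_overlap=4):
    """Repeat contig construction until every reading has been placed in some contig."""
    contigs = []
    remaining = list(reads)
    while remaining:
        contig, remaining = build_contig(remaining, min_overlap)
        contigs.append(contig)
    return contigs

# Toy gel readings: three overlap to rebuild ATGGCGTACGTTAGCCTAGGA, one is unrelated.
READS = ["ATGGCGTAC", "TTAGCCTAGGA", "GCGTACGTTAGC", "CCCCGGGGT"]

print(assemble_contigs(READS))
```

    As in the abstract, readings that have entered a contig are removed from further consideration, and the result may be one contig or several.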

  16. BESST--efficient scaffolding of large fragmented assemblies.

    Science.gov (United States)

    Sahlin, Kristoffer; Vezzi, Francesco; Nystedt, Björn; Lundeberg, Joakim; Arvestad, Lars

    2014-08-15

    The use of short reads from High Throughput Sequencing (HTS) techniques is now commonplace in de novo assembly. Yet, obtaining contiguous assemblies from short reads is challenging, thus making scaffolding an important step in the assembly pipeline. Different algorithms have been proposed but many of them use the number of read pairs supporting a linking of two contigs as an indicator of reliability. This reasoning is intuitive, but fails to account for variation in link count due to contig features. We have also noted that published scaffolders are only evaluated on small datasets using output from only one assembler. Two issues arise from this. Firstly, some of the available tools are not well suited for complex genomes. Secondly, these evaluations provide little support for inferring a software's general performance. We propose a new algorithm, implemented in a tool called BESST, which can scaffold genomes of all sizes and complexities and was used to scaffold the genome of P. abies (20 Gbp). We performed a comprehensive comparison of BESST against the most popular stand-alone scaffolders on a large variety of datasets. Our results confirm that some of the popular scaffolders are not practical to run on complex datasets. Furthermore, no single stand-alone scaffolder outperforms the others on all datasets. However, BESST compares favorably with the other tested scaffolders on GAGE datasets and, moreover, outperforms the other methods when library insert size distribution is wide. We conclude from our results that information sources other than the quantity of links, as is commonly used, can provide useful information about genome structure when scaffolding.

  17. Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences.

    Science.gov (United States)

    Zhang, Jianwei; Kudrna, Dave; Mu, Ting; Li, Weiming; Copetti, Dario; Yu, Yeisoo; Goicoechea, Jose Luis; Lei, Yang; Wing, Rod A

    2016-10-15

    Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool, Genome Puzzle Master (GPM), that enables the integration of additional genomic signposts to edit and build 'new-gen-assemblies' that result in high-quality 'annotation-ready' pseudomolecules. With GPM, loaded datasets can be connected to each other via their logical relationships which accomplishes tasks to 'group,' 'merge,' 'order and orient' sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user's total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory. The GPM (with LIMS) package is available at https://github.com/Jianwei-Zhang/LIMS. Contacts: jzhang@mail.hzau.edu.cn or rwing@mail.arizona.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  18. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field.

    Science.gov (United States)

    Xu, Dong; Zhang, Yang

    2012-07-01

    Ab initio protein folding is one of the major unsolved problems in computational biology owing to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1-20 residues where multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced and the particular contributions to enhancing the efficiency of both force field and search engine are analyzed in detail. The QUARK prediction procedure is depicted and tested on the structure modeling of 145 nonhomologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in one-third of the cases for short proteins up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction experiment, the QUARK server outperformed the second and third best servers by 18 and 47% based on the cumulative Z-score of global distance test-total scores in the FM category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress toward the solution of the most important problem in the field. Copyright © 2012 Wiley Periodicals, Inc.

  19. A practical comparison of de novo genome assembly software tools for next-generation sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Wenyu Zhang

    The advent of next-generation sequencing technologies has been accompanied by the development of many whole-genome sequence assembly methods and software, especially for de novo fragment assembly. Because the applicability and performance of these software tools are poorly understood, choosing a suitable assembler is a difficult task. Here, we provide information on the adaptivity of each program and, above all, compare the performance of eight distinct tools against eight groups of simulated datasets from the Solexa sequencing platform. Considering computational time, maximum random access memory (RAM) occupancy, and assembly accuracy and integrity, our study indicates that string-based assemblers and overlap-layout-consensus (OLC) assemblers are well suited for very short reads and for longer reads of small genomes, respectively. For large datasets of more than a hundred million short reads, De Bruijn graph-based assemblers would be more appropriate. In terms of software implementation, string-based assemblers are superior to graph-based ones, of which SOAPdenovo requires a complex configuration file. Our comparison study will assist researchers in selecting a well-suited assembler and offer essential information for the improvement of existing assemblers or the development of novel ones.

  20. AFEAP cloning: a precise and efficient method for large DNA sequence assembly.

    Science.gov (United States)

    Zeng, Fanli; Zang, Jinping; Zhang, Suhua; Hao, Zhimin; Dong, Jingao; Lin, Yibin

    2017-11-14

    Recent development of DNA assembly technologies has spurred myriad advances in synthetic biology, but new tools are always required for complicated scenarios. Here, we have developed an alternative DNA assembly method named AFEAP cloning (Assembly of Fragment Ends After PCR), which allows scarless, modular, and reliable construction of biological pathways and circuits from basic genetic parts. The AFEAP method requires two rounds of PCR followed by ligation of the sticky ends of DNA fragments. The first PCR yields linear DNA fragments and is followed by a second asymmetric (one primer) PCR and subsequent annealing that inserts overlapping overhangs at both sides of each DNA fragment. The overlapping overhangs of the neighboring DNA fragments are annealed and the nicks are sealed by T4 DNA ligase, followed by bacterial transformation to yield the desired plasmids. We characterized the capability and limitations of the newly developed AFEAP cloning and demonstrated its application to assemble DNA in varying scenarios. Under the optimized conditions, AFEAP cloning allows assembly of an 8 kb plasmid from 1-13 fragments with high accuracy (between 80 and 100%), and 8.0, 11.6, 19.6, 28, and 35.6 kb plasmids from five fragments at 91.67, 91.67, 88.33, 86.33, and 81.67% fidelity, respectively. AFEAP cloning is also capable of constructing a bacterial artificial chromosome (BAC, 200 kb) with a fidelity of 46.7%. AFEAP cloning provides a powerful, efficient, seamless, and sequence-independent DNA assembly tool for multiple fragments up to 13 and large DNA up to 200 kb that expands the synthetic biologist's toolbox.

  1. Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools.

    Science.gov (United States)

    Kisand, Veljo; Lettieri, Teresa

    2013-04-01

    De novo genome sequencing of previously uncharacterized microorganisms has the potential to open up new frontiers in microbial genomics by providing insight into both functional capabilities and biodiversity. Until recently, Roche 454 pyrosequencing was the NGS method of choice for de novo assembly because it generates hundreds of thousands of long reads (tools for processing NGS data are increasingly free and open source and are often adopted for both their high quality and role in promoting academic freedom. The error rate of pyrosequencing the Alcanivorax borkumensis genome was such that thousands of insertions and deletions were artificially introduced into the finished genome. Despite a high coverage (~30 fold), it did not allow the reference genome to be fully mapped. Reads from regions with errors had low quality, low coverage, or were missing. The main defect of the reference mapping was the introduction of artificial indels into contigs through lower than 100% consensus and distracting gene calling due to artificial stop codons. No assembler was able to perform de novo assembly comparable to reference mapping. Automated annotation tools performed similarly on reference mapped and de novo draft genomes, and annotated most CDSs in the de novo assembled draft genomes. Free and open source software (FOSS) tools for assembly and annotation of NGS data are being developed rapidly to provide accurate results with less computational effort. Usability is not high priority and these tools currently do not allow the data to be processed without manual intervention. Despite this, genome assemblers now readily assemble medium short reads into long contigs (>97-98% genome coverage). A notable gap in pyrosequencing technology is the quality of base pair calling and conflicting base pairs between single reads at the same nucleotide position. Regardless, using draft whole genomes that are not finished and remain fragmented into tens of contigs allows one to characterize

  2. An assembly sequence planning method based on composite algorithm

    Directory of Open Access Journals (Sweden)

    Enfu LIU

    2016-02-01

    To solve the combinatorial explosion problem and the blind searching problem in assembly sequence planning of complex products, an assembly sequence planning method based on a composite algorithm is proposed. In the composite algorithm, a sufficient number of feasible assembly sequences are generated using a formalization reasoning algorithm as the initial population of the genetic algorithm. Then fuzzy knowledge of assembly is integrated into the planning process of the genetic algorithm and the ant algorithm to obtain an accurate solution. Finally, an example is presented to verify the feasibility of the composite algorithm.

  3. Cloning Should Be Simple: Escherichia coli DH5α-Mediated Assembly of Multiple DNA Fragments with Short End Homologies

    Science.gov (United States)

    Richardson, Ruth E.; Suzuki, Yo

    2015-01-01

    Numerous DNA assembly technologies exist for generating plasmids for biological studies. Many procedures require complex in vitro or in vivo assembly reactions followed by plasmid propagation in recombination-impaired Escherichia coli strains such as DH5α, which are optimal for stable amplification of the DNA materials. Here we show that despite its utility as a cloning strain, DH5α retains sufficient recombinase activity to assemble up to six double-stranded DNA fragments ranging in size from 150 bp to at least 7 kb into plasmids in vivo. This process also requires surprisingly small amounts of DNA, potentially obviating the need for upstream assembly processes associated with most common applications of DNA assembly. We demonstrate the application of this process in cloning of various DNA fragments including synthetic genes, preparation of knockout constructs, and incorporation of guide RNA sequences in constructs for clustered regularly interspaced short palindromic repeats (CRISPR) genome editing. This consolidated process for assembly and amplification in a widely available strain of E. coli may enable productivity gain across disciplines involving recombinant DNA work. PMID:26348330

  4. An investigation of Hebbian phase sequences as assembly graphs

    Directory of Open Access Journals (Sweden)

    Daniel Gomes Almeida Filho

    2014-04-01

    Hebb proposed that synapses between neurons that fire synchronously are strengthened, forming cell assemblies and phase sequences. The former, on a shorter scale, are ensembles of synchronized cells that function transiently as a closed processing system; the latter, on a larger scale, correspond to the sequential activation of cell assemblies able to represent percepts and behaviors. Nowadays, the recording of large neuronal populations allows for the detection of multiple cell assemblies. Within Hebb’s theory, the next logical step is the analysis of phase sequences. Here we detected phase sequences as consecutive assembly activation patterns, and then analyzed their graph attributes in relation to behavior. We investigated action potentials recorded from the adult rat hippocampus and neocortex before, during and after novel object exploration (experimental periods). Within assembly graphs, each assembly corresponded to a node, and each edge corresponded to the temporal sequence of consecutive node activations. The sum of all assembly activations was proportional to firing rates, but the activity of individual assemblies was not. Assembly repertoire was stable across experimental periods, suggesting that novel experience does not create new assemblies in the adult rat. Assembly graph attributes, on the other hand, varied significantly across behavioral states and experimental periods, and were separable enough to correctly classify experimental periods (Naïve Bayes classifier; maximum AUROCs ranging from 0.55 to 0.99) and behavioral states (waking, slow wave sleep, and rapid eye movement sleep; maximum AUROCs ranging from 0.64 to 0.98). Our findings agree with Hebb’s view that assemblies correspond to primitive building blocks of representation, nearly unchanged in the adult, while phase sequences are labile across behavioral states and change after novel experience. The results are compatible with a role for phase sequences in behavior
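
The graph construction described in this abstract (one node per assembly, one edge per pair of consecutive activations) can be sketched as follows. This is only a minimal illustration under assumed inputs: the function name and the activation labels are hypothetical, and the authors' actual detection and analysis pipeline is not reproduced here.

```python
from collections import Counter


def assembly_graph(activations):
    """Count directed edges between consecutively activated assemblies."""
    # Pair each activation with the one that immediately follows it in time.
    return Counter(zip(activations, activations[1:]))


# Hypothetical time-ordered sequence of assembly labels.
edges = assembly_graph(["A", "B", "A", "B", "C"])
```

Each key of `edges` is an ordered pair of nodes and each value is how often that transition occurred, which is one simple starting point for computing graph attributes.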

  5. Solving Assembly Sequence Planning using Angle Modulated Simulated Kalman Filter

    Science.gov (United States)

    Mustapa, Ainizar; Yusof, Zulkifli Md.; Adam, Asrul; Muhammad, Badaruddin; Ibrahim, Zuwairie

    2018-03-01

    This paper presents an implementation of the Simulated Kalman Filter (SKF) algorithm for optimizing an Assembly Sequence Planning (ASP) problem. The SKF search strategy contains three simple steps: predict, measure, and estimate. The main objective of ASP is to determine the sequence of component installation that shortens assembly time or saves assembly costs. Initially, a permutation sequence is generated to represent each agent. Each agent is then subjected to a precedence matrix constraint to produce a feasible assembly sequence. Next, the Angle Modulated SKF (AMSKF) is proposed for solving the ASP problem. The main idea of the angle modulated approach in solving combinatorial optimization problems is to use a function, g(x), to create a continuous signal. The performance of the proposed AMSKF is compared against previous works that solved ASP by applying BGSA, BPSO, and MSPSO. Using a case study of ASP, the results show that AMSKF outperformed all the algorithms in obtaining the best solution.
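
The precedence-matrix check mentioned in this abstract can be illustrated with a small sketch. The matrix convention, the function name, and the three-component example are assumptions for illustration; the paper's own encoding and its AMSKF optimizer are not shown.

```python
def is_feasible(sequence, precedence):
    """Return True if every component is installed after all of its prerequisites.

    Assumed convention for this sketch: precedence[i][j] == 1 means that
    component j must be installed before component i.
    """
    installed = set()
    for part in sequence:
        for other, required in enumerate(precedence[part]):
            if required and other not in installed:
                return False
        installed.add(part)
    return True


# Hypothetical 3-component product: part 1 needs part 0, and part 2 needs part 1.
P = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]
```

With this matrix, `is_feasible([0, 1, 2], P)` holds while `is_feasible([1, 0, 2], P)` does not, so an optimizer would discard or repair the second permutation.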

  6. Cyprinus carpio Genome sequencing and assembly

    NARCIS (Netherlands)

    Kolder, I.C.R.M.; Plas-Duivesteijn, van der Suzanne J.; Tan, G.; Wiegertjes, G.; Forlenza, M.; Guler, A.T.; Travin, D.Y.; Nakao, M.; Moritomo, T.; Irnazarow, I.; Jansen, H.J.

    2013-01-01

    Sequencing of the common carp (Cyprinus carpio carpio Linnaeus, 1758) genome, with the objective of establishing carp as a model organism to supplement the closely related zebrafish (Danio rerio). The sequenced individual is a homozygous female (by gynogenesis) of R3 x R8 carp, the heterozygous

  7. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Directory of Open Access Journals (Sweden)

    Roger Barthelson

    BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly

  8. Optimizing and benchmarking de novo transcriptome sequencing: from library preparation to assembly evaluation.

    Science.gov (United States)

    Hara, Yuichiro; Tatsumi, Kaori; Yoshida, Michio; Kajikawa, Eriko; Kiyonari, Hiroshi; Kuraku, Shigehiro

    2015-11-18

    RNA-seq enables gene expression profiling in selected spatiotemporal windows and yields massive sequence information with relatively low cost and time investment, even for non-model species. However, there remains much room for optimizing its workflow in order to take full advantage of continuously developing sequencing capacity. Transcriptome sequencing for three embryonic stages of Madagascar ground gecko (Paroedura picta) was performed with the Illumina platform. The output reads were assembled de novo for reconstructing transcript sequences. In order to evaluate the completeness of transcriptome assemblies, we prepared a reference gene set consisting of vertebrate one-to-one orthologs. To take advantage of increased read length of >150 nt, we demonstrated shortened RNA fragmentation time, which resulted in a dramatic shift of insert size distribution. To evaluate products of multiple de novo assembly runs incorporating reads with different RNA sources, read lengths, and insert sizes, we introduce a new reference gene set, core vertebrate genes (CVG), consisting of 233 genes that are shared as one-to-one orthologs by all vertebrate genomes examined (29 species). The completeness assessment performed by the computational pipelines CEGMA and BUSCO referring to CVG demonstrated higher accuracy and resolution than with the gene set previously established for this purpose. As a result of the assessment with CVG, we have derived the most comprehensive transcript sequence set of the Madagascar ground gecko by means of assembling individual libraries followed by clustering the assembled sequences based on their overall similarities. Our results provide several insights into optimizing de novo RNA-seq workflow, including the coordination between library insert size and read length, which manifested in improved connectivity of assemblies. The approach and assembly assessment with CVG demonstrated here would be applicable to transcriptome analysis of other species as

  9. Physical mapping of 20 unmapped fragments of the btau_4.0 genome assembly in cattle, sheep and river buffalo.

    Science.gov (United States)

    De Lorenzi, L; Genualdo, V; Perucatti, A; Iannuzzi, A; Iannuzzi, L; Parma, P

    2013-01-01

    The recent advances in sequencing technology and bioinformatics have revolutionized genomic research, making the decoding of the genome an easier task. Genome sequences are currently available for many species, including cattle, sheep and river buffalo. The available reference genomes are very accurate, and they represent the best possible order of loci at this time. In cattle, despite the great accuracy achieved, a part of the genome has been sequenced but not yet assembled: these genome fragments are called unmapped fragments. In the present study, 20 unmapped fragments belonging to the Btau_4.0 reference genome have been mapped by FISH in cattle (Bos taurus, 2n = 60), sheep (Ovis aries, 2n = 54) and river buffalo (Bubalus bubalis, 2n = 50). Our results confirm the accuracy of the available reference genome, though there are some discrepancies between the expected localization and the observed localization. Moreover, the available data in the literature regarding genomic homologies between cattle, sheep and river buffalo are confirmed. Finally, the results presented here suggest that FISH was, and still is, a useful technology to validate the data produced by genome sequencing programs. Copyright © 2013 S. Karger AG, Basel.

  10. Oxford Nanopore MinION Sequencing and Genome Assembly

    Directory of Open Access Journals (Sweden)

    Hengyun Lu

    2016-10-01

    The revolution of genome sequencing is continuing after the successful second-generation sequencing (SGS) technology. The third-generation sequencing (TGS) technology, led by Pacific Biosciences (PacBio), is progressing rapidly, moving from a technology once only capable of providing data for small genome analysis, or for performing targeted screening, to one that promises high quality de novo assembly and structural variation detection for human-sized genomes. In 2014, the MinION, the first commercial sequencer using nanopore technology, was released by Oxford Nanopore Technologies (ONT). MinION identifies DNA bases by measuring the changes in electrical conductivity generated as DNA strands pass through a biological pore. Its portability, affordability, and speed of data production make it suitable for real-time applications; the release of the long-read sequencer MinION has thus generated much excitement and interest in the genomics community. While de novo genome assemblies can be cheaply produced from SGS data, assembly continuity is often relatively poor, due to the limited ability of short reads to handle long repeats. Assembly quality can be greatly improved by using TGS long reads, since longer reads can span repetitive regions more easily, despite having higher error rates at the base level. The potential of nanopore sequencing has been demonstrated by various studies in genome surveillance at locations where rapid and reliable sequencing is needed, but where resources are limited.

  11. Assembly of the Complete Sitka Spruce Chloroplast Genome Using 10X Genomics' GemCode Sequencing Data.

    Directory of Open Access Journals (Sweden)

    Lauren Coombe

    The linked read sequencing library preparation platform by 10X Genomics produces barcoded sequencing libraries, which are subsequently sequenced using the Illumina short read sequencing technology. In this new approach, long fragments of DNA are partitioned into separate micro-reactions, where the same index sequence is incorporated into each of the sequencing fragment inserts derived from a given long fragment. In this study, we exploited this property by using reads from index sequences associated with a large number of reads to assemble the chloroplast genome of the Sitka spruce tree (Picea sitchensis). Here we report on the first Sitka spruce chloroplast genome assembled exclusively from P. sitchensis genomic libraries prepared using the 10X Genomics protocol. We show that the resulting 124,049 base pair long genome shares high sequence similarity with the related white spruce and Norway spruce chloroplast genomes, but diverges substantially from a previously published P. sitchensis-P. thunbergii chimeric genome. The use of reads from high-frequency indices enabled separation of the nuclear genome reads from those of the chloroplast, which resulted in the simplification of the de Bruijn graphs used at the various stages of assembly.
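
The idea of keeping only reads whose index (barcode) is associated with many reads can be sketched as below. This is a hedged illustration, not the authors' pipeline: the function name, the `(barcode, sequence)` tuple layout, and the threshold value are all assumptions, and the abstract does not state the cutoff actually used.

```python
from collections import Counter


def reads_from_frequent_barcodes(reads, min_count):
    """Keep sequences whose barcode occurs at least min_count times.

    `reads` is an iterable of (barcode, sequence) pairs (assumed layout).
    """
    reads = list(reads)
    counts = Counter(barcode for barcode, _ in reads)
    return [seq for barcode, seq in reads if counts[barcode] >= min_count]


# Made-up reads: barcode "bc1" is frequent, "bc2" is not.
toy_reads = [("bc1", "ACGT"), ("bc1", "TTGA"), ("bc2", "GGCC"), ("bc1", "CATG")]
kept = reads_from_frequent_barcodes(toy_reads, min_count=3)
```

The retained sequences would then be passed to an assembler; the enrichment works because high-copy organellar DNA tends to dominate the most frequently observed indices.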

  12. INTEGRATION OF SHIP HULL ASSEMBLY SEQUENCE PLANNING, SCHEDULING AND BUDGETING

    Directory of Open Access Journals (Sweden)

    Remigiusz Romuald Iwańkowicz

    2015-02-01

    The specific nature of shipyard work requires particularly careful treatment of scheduling and budgeting in the production planning processes. The article presents a method for analysing the assembly sequence that takes into account the duration of individual activities and the demand for resources. The critical path method and resource budgeting were used. The assembly was modelled using acyclic graphs. It is shown that assembly sequences can have very different feasible budget regions. The proposed model is applied to the assembly processes of large-scale welded structures, including the hulls of ships. The presented computational examples have a simulation character. They show the usefulness of the model and the possibility of using it in a variety of analyses.
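
The critical path method on an acyclic graph, as referred to in this abstract, amounts to finding the longest duration-weighted path. The sketch below shows that computation only; the task names, durations, and function name are hypothetical, and the article's resource budgeting model is not reproduced.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path_length(durations, predecessors):
    """Earliest finish time of the whole project (longest path through the DAG)."""
    finish = {}
    # static_order() yields each task only after all of its predecessors.
    for task in TopologicalSorter(predecessors).static_order():
        start = max((finish[p] for p in predecessors.get(task, ())), default=0)
        finish[task] = start + durations[task]
    return max(finish.values(), default=0)


# Hypothetical hull-assembly activities with durations in days.
durations = {"keel": 3, "frames": 2, "plating": 4, "deck": 1}
predecessors = {"frames": {"keel"}, "plating": {"frames"}, "deck": {"frames"}}
total_days = critical_path_length(durations, predecessors)
```

Here the longest chain is keel, frames, plating, so the project cannot finish in fewer than 3 + 2 + 4 days regardless of how the deck work is scheduled.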

  13. Next-generation sequencing of multiple individuals per barcoded library by deconvolution of sequenced amplicons using endonuclease fragment analysis

    DEFF Research Database (Denmark)

    Andersen, Jeppe D; Pereira, Vania; Pietroni, Carlotta

    2014-01-01

    The simultaneous sequencing of samples from multiple individuals increases the efficiency of next-generation sequencing (NGS) while also reducing costs. Here we describe a novel and simple approach for sequencing DNA from multiple individuals per barcode. Our strategy relies on the endonuclease...... digestion of PCR amplicons prior to library preparation, creating a specific fragment pattern for each individual that can be resolved after sequencing. By using both barcodes and restriction fragment patterns, we demonstrate the ability to sequence the human melanocortin 1 receptor (MC1R) genes from 72...... individuals using only 24 barcoded libraries....

  14. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    DEFF Research Database (Denmark)

    Nielsen, Peter E.

    2008-01-01

    sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technology of protein dsDNA structures. (c) 2008 American Institute of Physics.

  15. Sequencing and De Novo Transcriptome Assembly of Brachypodium sylvaticum (Poaceae)

    Directory of Open Access Journals (Sweden)

    Samuel E. Fox

    2013-03-01

    Premise of the study: We report the de novo assembly and characterization of the transcriptomes of Brachypodium sylvaticum (slender false-brome) accessions from native populations of Spain and Greece, and an invasive population west of Corvallis, Oregon, USA. Methods and Results: More than 350 million sequence reads from the mRNA libraries prepared from three B. sylvaticum genotypes were assembled into 120,091 (Corvallis), 104,950 (Spain), and 177,682 (Greece) transcript contigs. In comparison with the B. distachyon Bd21 reference genome and GenBank protein sequences, we estimate >90% exome coverage for B. sylvaticum. The transcripts were assigned Gene Ontology and InterPro annotations. Brachypodium sylvaticum sequence reads aligned against the Bd21 genome revealed 394,654 single-nucleotide polymorphisms (SNPs) and >20,000 simple sequence repeat (SSR) DNA sites. Conclusions: To our knowledge, this is the first report of transcriptome sequencing of an invasive plant species with a closely related sequenced reference genome. The sequences and identified SNP variant and SSR sites will provide tools for developing novel genetic markers for use in genotyping and characterization of invasive behavior of B. sylvaticum.

  16. Performances of Different Fragment Sizes for Reduced Representation Bisulfite Sequencing in Pigs

    DEFF Research Database (Denmark)

    Yuan, Xiao Long; Zhang, Zhe; Pan, Rong Yang

    2017-01-01

    sizes might decrease when the dataset size was more than 70, 50 and 110 million reads for these three fragment sizes, respectively. Given a 50-million dataset size, the average sequencing depth of the detected CpG sites in the 110-220 bp fragment size appeared to be deeper than in the 40-110 bp and 40...

  17. An Enumerative Combinatorics Model for Fragmentation Patterns in RNA Sequencing Provides Insights into Nonuniformity of the Expected Fragment Starting-Point and Coverage Profile.

    Science.gov (United States)

    Prakash, Celine; Haeseler, Arndt Von

    2017-03-01

    RNA sequencing (RNA-seq) has emerged as the method of choice for measuring the expression of RNAs in a given cell population. In most RNA-seq technologies, sequencing the full length of RNA molecules requires fragmentation into smaller pieces. Unfortunately, the issue of nonuniform sequencing coverage across a genomic feature has been a concern in RNA-seq and is attributed to biases for certain fragments in RNA-seq library preparation and sequencing. To investigate the expected coverage obtained from fragmentation, we develop a simple fragmentation model that is independent of bias from the experimental method and is not specific to the transcript sequence. Essentially, we enumerate all configurations for maximal placement of a given fragment length, F, on transcript length, T, to represent every possible fragmentation pattern, from which we compute the expected coverage profile across a transcript. We extend this model to incorporate general empirical attributes such as read length, fragment length distribution, and number of molecules of the transcript. We further introduce the fragment starting-point, fragment coverage, and read coverage profiles. We find that the expected profiles are not uniform and that factors such as fragment length to transcript length ratio, read length to fragment length ratio, fragment length distribution, and number of molecules influence the variability of coverage across a transcript. Finally, we explore a potential application of the model where, with simulations, we show that it is possible to correctly estimate the transcript copy number for any transcript in the RNA-seq experiment.
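
To give a feel for why coverage from fragmentation is not uniform, the sketch below uses a deliberately simplified setting: a single fragment of fixed length F is slid across every possible start position of a transcript of length T, and each position counts how many placements cover it. This is an assumption-laden toy, much simpler than the enumerative model of maximal fragment placements described in the abstract, and the function name is hypothetical.

```python
def coverage_profile(T, F):
    """Number of length-F fragment placements covering each position of a length-T transcript."""
    profile = [0] * T
    for start in range(T - F + 1):           # every possible fragment start
        for pos in range(start, start + F):  # positions covered by that fragment
            profile[pos] += 1
    return profile


profile = coverage_profile(6, 3)  # → [1, 2, 3, 3, 2, 1]
```

Even in this toy case the transcript ends are covered less often than the middle, which is consistent with the abstract's finding that the fragment-length to transcript-length ratio shapes the expected profile.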

  18. Assessment of metagenomic assembly using simulated next generation sequencing data

    DEFF Research Database (Denmark)

    Mende, Daniel R; Waller, Alison S; Sunagawa, Shinichi

    2012-01-01

    with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved...... the accuracy and contig lengths of resulting assemblies. We then compared the quality-trimmed Illumina assemblies to those from Sanger and pyrosequencing. For the simple community (10 genomes) all sequencing technologies assembled a similar amount and accurately represented the expected functional composition...... the Sanger reads still represented the overall functional composition reasonably well. We further examined the effect of scaffolding of contigs using paired-end Illumina reads. It dramatically increased contig lengths of the simple community and yielded minor improvements to the more complex communities...

  19. The de novo assembly of mitochondrial genomes of the extinct passenger pigeon (Ectopistes migratorius) with next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Chih-Ming Hung

    The information from ancient DNA (aDNA) provides an unparalleled opportunity to infer phylogenetic relationships and population history of extinct species and to investigate genetic evolution directly. However, the degraded and fragmented nature of aDNA has posed technical challenges for studies based on conventional PCR amplification. In this study, we present an approach based on next generation sequencing to efficiently sequence the complete mitochondrial genome (mitogenome) of two extinct passenger pigeons (Ectopistes migratorius) using de novo assembly of massive short (90 bp), paired-end or single-end reads. Although varying levels of human contamination and low levels of postmortem nucleotide lesion were observed, they did not impact sequencing accuracy. Our results demonstrated that the de novo assembly of shotgun sequence reads could be a potent approach to sequence mitogenomes, and offered an efficient way to infer evolutionary history of extinct species.

  20. The De Novo Assembly of Mitochondrial Genomes of the Extinct Passenger Pigeon (Ectopistes migratorius) with Next Generation Sequencing

    Science.gov (United States)

    Hung, Chih-Ming; Lin, Rong-Chien; Chu, Jui-Hua; Yeh, Chia-Fen; Yao, Chiou-Ju; Li, Shou-Hsien

    2013-01-01

    The information from ancient DNA (aDNA) provides an unparalleled opportunity to infer phylogenetic relationships and population history of extinct species and to investigate genetic evolution directly. However, the degraded and fragmented nature of aDNA has posed technical challenges for studies based on conventional PCR amplification. In this study, we present an approach based on next generation sequencing to efficiently sequence the complete mitochondrial genome (mitogenome) of two extinct passenger pigeons (Ectopistes migratorius) using de novo assembly of massive short (90 bp), paired-end or single-end reads. Although varying levels of human contamination and low levels of postmortem nucleotide lesion were observed, they did not impact sequencing accuracy. Our results demonstrated that the de novo assembly of shotgun sequence reads could be a potent approach to sequence mitogenomes, and offered an efficient way to infer evolutionary history of extinct species. PMID:23437111

  1. SeqLib: a C++ API for rapid BAM manipulation, sequence alignment and sequence assembly.

    Science.gov (United States)

    Wala, Jeremiah; Beroukhim, Rameen

    2017-03-01

    We present SeqLib, a C++ API and command line tool that provides a rapid and user-friendly interface to BAM/SAM/CRAM files, global sequence alignment operations and sequence assembly. Four C libraries perform core operations in SeqLib: HTSlib for BAM access, BWA-MEM and BLAT for sequence alignment and Fermi for error correction and sequence assembly. Benchmarking indicates that SeqLib has lower CPU and memory requirements than leading C++ sequence analysis APIs. We demonstrate an example of how minimal SeqLib code can extract, error-correct and assemble reads from a CRAM file and then align with BWA-MEM. SeqLib also provides additional capabilities, including chromosome-aware interval queries and read plotting. Command line tools are available for performing integrated error correction, micro-assemblies and alignment. SeqLib is available on Linux and OSX for the C++98 standard and later at github.com/walaj/SeqLib. SeqLib is released under the Apache2 license. Additional capabilities for BLAT alignment are available under the BLAT license. jwala@broadinstitue.org ; rameen@broadinstitute.org. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  2. DNA Sequences of RAPD Fragments in the Egyptian cotton ...

    African Journals Online (AJOL)

    Random Amplified Polymorphic DNAs (RAPDs) is a DNA polymorphism assay based on the amplification of random DNA segments with single primers of arbitrary nucleotide sequence. Despite the fact that the RAPD technique has become a very powerful tool and has found use in numerous applications, yet, the nature of ...

  3. De novo protein structure prediction by dynamic fragment assembly and conformational space annealing.

    Science.gov (United States)

    Lee, Juyong; Lee, Jinhyuk; Sasaki, Takeshi N; Sasai, Masaki; Seok, Chaok; Lee, Jooyoung

    2011-08-01

    Ab initio protein structure prediction is a challenging problem that requires both an accurate energetic representation of a protein structure and an efficient conformational sampling method for successful protein modeling. In this article, we present an ab initio structure prediction method which combines a recently suggested novel way of fragment assembly, dynamic fragment assembly (DFA) and conformational space annealing (CSA) algorithm. In DFA, model structures are scored by continuous functions constructed based on short- and long-range structural restraint information from a fragment library. Here, DFA is represented by the full-atom model by CHARMM with the addition of the empirical potential of DFIRE. The relative contributions between various energy terms are optimized using linear programming. The conformational sampling was carried out with CSA algorithm, which can find low energy conformations more efficiently than simulated annealing used in the existing DFA study. The newly introduced DFA energy function and CSA sampling algorithm are implemented into CHARMM. Test results on 30 small single-domain proteins and 13 template-free modeling targets of the 8th Critical Assessment of protein Structure Prediction show that the current method provides comparable and complementary prediction results to existing top methods. Copyright © 2011 Wiley-Liss, Inc.

  4. Multifunctional hybrid networks based on self assembling peptide sequences

    Science.gov (United States)

    Sathaye, Sameer

    The overall aim of this dissertation is to achieve a comprehensive correlation between the molecular level changes in primary amino acid sequences of amphiphilic beta-hairpin peptides and their consequent solution-assembly properties and bulk network hydrogel behavior. This has been accomplished using two broad approaches. In the first approach, amino acid substitutions were made to peptide sequence MAX1 such that the hydrophobic surfaces of the folded beta-hairpins from the peptides demonstrate shape specificity in hydrophobic interactions with other beta-hairpins during the assembly process, thereby causing changes to the peptide nanostructure and bulk rheological properties of hydrogels formed from the peptides. Steric lock and key complementary hydrophobic interactions were designed to occur between two beta-hairpin molecules of a single molecule, LNK1, during beta-sheet fibrillar assembly of LNK1. Experimental results from circular dichroism, transmission electron microscopy and oscillatory rheology collectively indicate that the molecular design of the LNK1 peptide can be identified as the cause of the drastically different behavior of the networks relative to MAX1. The results indicate elimination or significant reduction of fibrillar branching due to steric complementarity in LNK1 that does not exist in MAX1, thus supporting the original hypothesis. As an extension of the designed steric lock and key complementarity between two beta-hairpin molecules of the same peptide molecule, LNK1, three new pairs of peptide molecules LP1-KP1, LP2-KP2 and LP3-KP3 that resemble complementary 'wedge' and 'trough' shapes when folded into beta-hairpins were designed and studied. All six peptides individually and when blended with their corresponding shape complement formed fibrillar nanostructures with non-uniform thickness values. Loose packing in the assembled structures was observed in all the new peptides as compared to the uniform tight packing in MAX1 by SANS analysis. This

  5. 2D nanomaterials assembled from sequence-defined molecules

    International Nuclear Information System (INIS)

    Mu, Peng; State University of New York; Zhou, Guangwen; Chen, Chun-Long

    2017-01-01

    Two dimensional (2D) nanomaterials have attracted broad interest owing to their unique physical and chemical properties with potential applications in electronics, chemistry, biology, medicine and pharmaceutics. Due to the current limitations of traditional 2D nanomaterials (e.g., graphene and graphene oxide) in tuning surface chemistry and compositions, 2D nanomaterials assembled from sequence-defined molecules (e.g., DNAs, proteins, peptides and peptoids) have recently been developed. They represent an emerging class of 2D nanomaterials with attractive physical and chemical properties. Here, we summarize the recent progress in the synthesis and applications of this type of sequence-defined 2D nanomaterials. We also discuss the challenges and opportunities in this new field.

  6. Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.

    Science.gov (United States)

    Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

    2013-01-01

    Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing have encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads, and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage, which are known to impact genome assembly, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 MB), S. kudriavzevii (11.18 MB) and C. elegans (100 MB) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6-40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing, which will enable optimum utilization of sequencing as well as analysis resources.
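
    The depth figures in this abstract rest on simple arithmetic: mean depth is the total number of sequenced bases divided by the genome size. The following is a minimal sketch of that calculation, not the authors' pipeline. The 100 bp read length and the helper names are assumptions made for illustration, and the genome sizes are the abstract's values read as megabases.

```python
import math

def mean_depth(num_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Mean sequencing depth = total sequenced bases / genome size."""
    return num_reads * read_length_bp / genome_size_bp

def reads_for_depth(target_depth: int, read_length_bp: int, genome_size_bp: int) -> int:
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)

# Genome sizes from the abstract, interpreted as base pairs.
GENOMES_BP = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}
READ_LENGTH_BP = 100  # assumption: the abstract does not state the read length

for name, size_bp in GENOMES_BP.items():
    needed = reads_for_depth(50, READ_LENGTH_BP, size_bp)
    print(f"{name}: {needed:,} reads of {READ_LENGTH_BP} bp for 50X")
```

    The same two functions can be used in reverse when planning multiplexing: given the reads a lane yields, `mean_depth` indicates how many samples of a given genome size fit at the desired depth.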

  7. Performances of Different Fragment Sizes for Reduced Representation Bisulfite Sequencing in Pigs.

    Science.gov (United States)

    Yuan, Xiao-Long; Zhang, Zhe; Pan, Rong-Yang; Gao, Ning; Deng, Xi; Li, Bin; Zhang, Hao; Sangild, Per Torp; Li, Jia-Qi

    2017-01-01

    Reduced representation bisulfite sequencing (RRBS) has been widely used to profile genome-scale DNA methylation in mammalian genomes. However, the applications and technical performances of RRBS with different fragment sizes have not been systematically reported in pigs, which serve as one of the important biomedical models for humans. The aims of this study were to evaluate capacities of RRBS libraries with different fragment sizes to characterize the porcine genome. We found that the Msp I-digested segments between 40 and 220 bp harbored a high distribution peak at 74 bp, which highly overlapped with repetitive elements and might reduce the unique mapping alignment. The RRBS library of 110-220 bp fragment size had the highest unique mapping alignment and the lowest multiple alignment. The cost-effectiveness of the 40-110 bp, 110-220 bp and 40-220 bp fragment sizes might decrease when the dataset size was more than 70, 50 and 110 million reads for these three fragment sizes, respectively. Given a 50-million dataset size, the average sequencing depth of the detected CpG sites in the 110-220 bp fragment size appeared to be deeper than in the 40-110 bp and 40-220 bp fragment sizes, and these detected CpG sites were located differently in gene- and CpG island-related regions. In this study, our results demonstrated that the selection of fragment sizes could affect the numbers and sequencing depth of detected CpG sites as well as the cost-efficiency. No single solution of RRBS is optimal in all circumstances for investigating genome-scale DNA methylation. This work provides useful knowledge on designing and executing RRBS for investigating the genome-wide DNA methylation in tissues from pigs.
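
    The fragment-size windows compared in this abstract can be explored in silico. The sketch below digests a sequence at Msp I sites (recognition site CCGG, cut after the first C) and keeps fragments inside a size window. It is an illustration under simplifying assumptions, not the authors' workflow: it ignores end repair, adapter ligation and bisulfite conversion, and the function names are hypothetical.

```python
import re

def mspi_fragments(sequence: str) -> list[str]:
    """Cut a sequence at every Msp I site (C^CGG) and return the fragments."""
    sequence = sequence.upper()
    # A lookahead finds overlapping sites; the cut falls one base into each site.
    cuts = [m.start() + 1 for m in re.finditer(r"(?=CCGG)", sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]

def size_select(fragments: list[str], min_bp: int, max_bp: int) -> list[str]:
    """Keep fragments whose length lies within [min_bp, max_bp]."""
    return [f for f in fragments if min_bp <= len(f) <= max_bp]

# Toy sequence for illustration only.
toy = "AACCGGTTTTCCGGAA"
fragments = mspi_fragments(toy)
print(fragments)
print(size_select(fragments, 4, 10))
```

    On a real genome the same two steps, applied with windows such as 40-110 bp or 110-220 bp, give the predicted fragment pool for each library design.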

  8. Cloning and sequence analysis of chitin synthase gene fragments of Demodex mites*

    OpenAIRE

    Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying

    2012-01-01

    To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi’an China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was succe...

  9. Path planning algorithms for assembly sequence planning. [in robot kinematics]

    Science.gov (United States)

    Krishnan, S. S.; Sanderson, Arthur C.

    1991-01-01

    Planning for manipulation in complex environments often requires reasoning about the geometric and mechanical constraints which are posed by the task. In planning assembly operations, the automatic generation of operations sequences depends on the geometric feasibility of paths which permit parts to be joined into subassemblies. Feasible locations and collision-free paths must be present for part motions, robot and grasping motions, and fixtures. This paper describes an approach to reasoning about the feasibility of straight-line paths among three-dimensional polyhedral parts using an algebra of polyhedral cones. A second method recasts the feasibility conditions as constraints in a nonlinear optimization framework. Both algorithms have been implemented and results are presented.

  10. Capillary electrophoresis fragment analysis and clone sequencing in detection of dynamic mutations of spinocerebellar ataxia

    Directory of Open Access Journals (Sweden)

    Yuan-yuan CHEN

    2018-04-01

    Objective: To estimate the accuracy and stability of capillary electrophoresis fragment analysis and clone sequencing in detecting dynamic mutations of spinocerebellar ataxia (SCA). Methods: Capillary electrophoresis fragment analysis and clone sequencing were used to detect trinucleotide repeat sequences in 14 SCA patients (3 cases of SCA2, 2 cases of SCA7, 7 cases of SCA8 and 2 cases of SCA17). Results: Capillary electrophoresis fragment analysis of 3 SCA2 cases showed the expanded cytosine-adenine-guanine (CAG) repeats were 31, 30 and 32, and the copy numbers from clone sequencing of 3 colonies in each case were 37/40/40, 37/38/39 and 38/39/40 respectively. Capillary electrophoresis fragment analysis of 2 SCA7 cases showed the expanded CAG repeats were 57 and 34, and the copy numbers of repeats were 69, 74, 75 in 3 colonies of one case, and 45 in the other case. For the 7 SCA8 cases with the expanded cytosine-thymine-adenine (CTA)/cytosine-thymine-guanine (CTG) repeats of 99, 111, 104, 92, 89, 104 and 75, the results of clone sequencing were 97, 116, 104, 90, 90, 102 and 76 respectively. For 2 SCA17 cases with the short/expanded CAG repeats of 37/50 and 36/45, the results of clone sequencing were 51/50/52 and 45/44 for 3 and 2 colonies. Conclusions: Although the higher mobility of polymerase chain reaction (PCR) products containing dynamic mutation in the capillary electrophoresis fragment analysis might cause a deviation in the analysis of copy numbers, the deviation was predictable and the results were repeatable. The clone sequencing results showed obvious instability, especially for SCA2 and SCA7 genes, which might be owing to their simple CAG repeats. Consequently, clone sequencing is not suited for detection of dynamic mutation, not to mention the quantitative criteria of dynamic mutation sequencing. DOI: 10.3969/j.issn.1672-6731.2018.03.008

  11. Cloning and sequence analysis of chitin synthase gene fragments of Demodex mites*

    Science.gov (United States)

    Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying

    2012-01-01

    To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi’an China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were sequenced to be 339 bp for D. canis and 338 bp for D. brevis. The CHS gene sequence similarities between the three Xi’an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%–99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, according with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites. PMID:23024043

  12. Cloning and sequence analysis of chitin synthase gene fragments of Demodex mites.

    Science.gov (United States)

    Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying

    2012-10-01

    To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi'an China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were sequenced to be 339 bp for D. canis and 338 bp for D. brevis. The CHS gene sequence similarities between the three Xi'an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%-99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, according with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites.

  13. Fast and simple protein-alignment-guided assembly of orthologous gene families from microbiome sequencing reads.

    Science.gov (United States)

    Huson, Daniel H; Tappu, Rewati; Bazinet, Adam L; Xie, Chao; Cummings, Michael P; Nieselt, Kay; Williams, Rohan

    2017-01-25

    Microbiome sequencing projects typically collect tens of millions of short reads per sample. Depending on the goals of the project, the short reads can either be subjected to direct sequence analysis or be assembled into longer contigs. The assembly of whole genomes from metagenomic sequencing reads is a very difficult problem. However, for some questions, only specific genes of interest need to be assembled. This is then a gene-centric assembly where the goal is to assemble reads into contigs for a family of orthologous genes. We present a new method for performing gene-centric assembly, called protein-alignment-guided assembly, and provide an implementation in our metagenome analysis tool MEGAN. Genes are assembled on the fly, based on the alignment of all reads against a protein reference database such as NCBI-nr. Specifically, the user selects a gene family based on a classification such as KEGG and all reads binned to that gene family are assembled. Using published synthetic community metagenome sequencing reads and a set of 41 gene families, we show that the performance of this approach compares favorably with that of full-featured assemblers and that of a recently published HMM-based gene-centric assembler, both in terms of the number of reference genes detected and of the percentage of reference sequence covered. Protein-alignment-guided assembly of orthologous gene families complements whole-metagenome assembly in a new and very useful way.

  14. GABenchToB: a genome assembly benchmark tuned on bacteria and benchtop sequencers.

    Directory of Open Access Journals (Sweden)

    Sebastian Jünemann

    De novo genome assembly is the process of reconstructing a complete genomic sequence from countless small sequencing reads. Due to the complexity of this task, numerous genome assemblers have been developed to cope with different requirements and the different kinds of data provided by sequencers within the fast evolving field of next-generation sequencing technologies. In particular, the recently introduced generation of benchtop sequencers, like Illumina's MiSeq and Ion Torrent's Personal Genome Machine (PGM), popularized the easy, fast, and cheap sequencing of bacterial organisms to a broad range of academic and clinical institutions. With a strong pragmatic focus, here, we give a novel insight into the line of assembly evaluation surveys as we benchmark popular de novo genome assemblers based on bacterial data generated by benchtop sequencers. Therefore, single-library assemblies were generated, assembled, and compared to each other by metrics describing assembly contiguity and accuracy, and also by practice-oriented criteria such as computing time. In addition, we extensively analyzed the effect of the depth of coverage on the genome assemblies within reasonable ranges and the k-mer optimization problem of de Bruijn Graph assemblers. Our results show that, although both MiSeq and PGM allow for good genome assemblies, they require different approaches. They not only pair with different assembler types, but also affect assemblies differently regarding the depth of coverage where oversampling can become problematic. Assemblies vary greatly with respect to contiguity and accuracy but also by the requirement on the computing power. Consequently, no assembler can be rated best for all preconditions. Instead, the given kind of data, the demands on assembly quality, and the available computing infrastructure determine which assembler suits best.
The data sets, scripts and all additional information needed to replicate our results are freely

  15. Single molecule sequencing-guided scaffolding and correction of draft assemblies.

    Science.gov (United States)

    Zhu, Shenglong; Chen, Danny Z; Emrich, Scott J

    2017-12-06

    Although single molecule sequencing is still improving, the lengths of the generated sequences are inevitably an advantage in genome assembly. Prior work that utilizes long reads to conduct genome assembly has mostly focused on correcting sequencing errors and improving contiguity of de novo assemblies. We propose a disassembling-reassembling approach for both correcting structural errors in the draft assembly and scaffolding a target assembly based on error-corrected single molecule sequences. To achieve this goal, we formulate a maximum alternating path cover problem. We prove that this problem is NP-hard, and solve it by a 2-approximation algorithm. Our experimental results show that our approach can improve the structural correctness of target assemblies at the cost of some contiguity, even with smaller amounts of long reads. In addition, our reassembling process can also serve as a competitive scaffolder relative to well-established assembly benchmarks.

  16. Assembly of Repeat Content Using Next Generation Sequencing Data

    Energy Technology Data Exchange (ETDEWEB)

    LaButti, Kurt; Kuo, Alan; Grigoriev, Igor; Copeland, Alex

    2014-03-17

    Repetitive genomes pose a challenge for short read assembly, and typically only unique regions and repeat regions shorter than the read length can be accurately assembled. Recently, we have been investigating the use of Pacific Biosciences reads for de novo fungal assembly. We will present an assessment of the quality and degree of repeat reconstruction possible in a fungal genome using long read technology. We will also compare differences in assembly of repeat content using short read and long read technology.

  17. The fast changing landscape of sequencing technologies and their impact on microbial genome assemblies and annotation.

    Science.gov (United States)

    Mavromatis, Konstantinos; Land, Miriam L; Brettin, Thomas S; Quest, Daniel J; Copeland, Alex; Clum, Alicia; Goodwin, Lynne; Woyke, Tanja; Lapidus, Alla; Klenk, Hans Peter; Cottingham, Robert W; Kyrpides, Nikos C

    2012-01-01

    The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly and their quality reflects the quality of the sequencing technology used, but also of the analysis software employed for assembly and annotation. In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases, assembly results, although on average of high quality, need to be viewed critically, and sources of error in them should be considered prior to analysis. These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).

  18. incaRNAfbinv: a web server for the fragment-based design of RNA sequences

    Science.gov (United States)

    Drory Retwitzer, Matan; Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme; Barash, Danny

    2016-01-01

    In recent years, new methods for computational RNA design have been developed and applied to various problems in synthetic biology and nanotechnology. Lately, there is considerable interest in incorporating essential biological information when solving the inverse RNA folding problem. Correspondingly, RNAfbinv aims at including biologically meaningful constraints and is the only program to date that performs a fragment-based design of RNA sequences. In doing so it allows the design of sequences that do not necessarily exactly fold into the target, as long as the overall coarse-grained tree graph shape is preserved. Augmented by the weighted sampling algorithm of incaRNAtion, our web server called incaRNAfbinv implements the method devised in RNAfbinv and offers an interactive environment for the inverse folding of RNA using a fragment-based design approach. It takes as input: a target RNA secondary structure; optional sequence and motif constraints; optional target minimum free energy, neutrality and GC content. In addition to the design of synthetic regulatory sequences, it can be used as a pre-processing step for the detection of novel naturally occurring RNAs. The two complementary methodologies RNAfbinv and incaRNAtion are merged together and fully implemented in our web server incaRNAfbinv, available at http://www.cs.bgu.ac.il/incaRNAfbinv. PMID:27185893
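
    Two of the inputs listed in this abstract, a target secondary structure and a GC content, are easy to make concrete. The sketch below assumes the structure is written in the common dot-bracket notation (the abstract does not state the input format) and uses hypothetical helper names. It is not part of RNAfbinv or incaRNAtion.

```python
def base_pairs(dot_bracket: str) -> list[tuple[int, int]]:
    """Return the (i, j) index pairs encoded by a dot-bracket string."""
    stack: list[int] = []
    pairs: list[tuple[int, int]] = []
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)

def gc_content(sequence: str) -> float:
    """Fraction of G and C nucleotides in a sequence."""
    sequence = sequence.upper()
    return sum(base in "GC" for base in sequence) / len(sequence)

print(base_pairs("((..))"))
print(round(gc_content("GGCAUC"), 3))
```

    A candidate design can then be screened by comparing its predicted pairs and GC fraction against the requested targets.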

  19. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments.

    Science.gov (United States)

    Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias

    2013-09-24

    Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousands of years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp.

  20. Optimum Assembly Sequence Planning System Using Discrete Artificial Bee Colony Algorithm

    Directory of Open Access Journals (Sweden)

    Özkan Özmen

    2018-01-01

    Assembly refers both to the process of combining parts to create a structure and to the product resulting therefrom. The complexity of this process increases with the number of pieces in the assembly. This paper presents the assembly planning system design (APSD) program, a computer program developed based on a matrix-based approach and the discrete artificial bee colony (DABC) algorithm, which determines the optimum assembly sequence among numerous feasible assembly sequences (FAS). Specifically, the assembly sequences of three-dimensional (3D) parts prepared in the computer-aided design (CAD) software AutoCAD are first coded using the matrix-based methodology; the resulting FAS are then assessed and the optimum assembly sequence is selected according to the assembly time optimisation criterion using DABC. Comparison of the proposed method with other methods from the literature verifies its superiority in finding the sequence with the lowest overall time. Further, application of APSD to assemblies consisting of parts in different numbers and shapes shows that it can select the optimum sequence from among hundreds of FAS.
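
    The two stages described in this abstract, enumerating feasible assembly sequences and then picking the one with the lowest time, can be illustrated on a toy product. The sketch below deliberately swaps DABC for exhaustive search, which is practical only for a handful of parts. The part names, precedence constraints and step times are invented for illustration and do not come from the paper.

```python
from itertools import permutations

def is_feasible(sequence, precedence):
    """A sequence is feasible if every 'before' part precedes its 'after' part."""
    position = {part: i for i, part in enumerate(sequence)}
    return all(position[before] < position[after] for before, after in precedence)

def feasible_sequences(parts, precedence):
    """Enumerate all feasible assembly sequences (FAS) by brute force."""
    return [s for s in permutations(parts) if is_feasible(s, precedence)]

def sequence_time(sequence, step_time, default=2.0):
    """Total time: sum of the time for each consecutive pair of parts."""
    return sum(step_time.get(pair, default) for pair in zip(sequence, sequence[1:]))

def optimum_sequence(parts, precedence, step_time):
    """Feasible sequence with the lowest total time."""
    return min(feasible_sequences(parts, precedence),
               key=lambda s: sequence_time(s, step_time))

# Hypothetical four-part assembly.
PARTS = ["base", "shaft", "gear", "cover"]
PRECEDENCE = [("base", "shaft"), ("shaft", "gear"), ("base", "cover")]
STEP_TIME = {
    ("base", "shaft"): 1.0, ("shaft", "gear"): 1.0, ("gear", "cover"): 3.0,
    ("shaft", "cover"): 1.0, ("cover", "gear"): 1.0,
}

best = optimum_sequence(PARTS, PRECEDENCE, STEP_TIME)
print(best, sequence_time(best, STEP_TIME))
```

    Exhaustive search grows factorially with the number of parts, which is the reason a metaheuristic such as DABC is used once the set of FAS reaches the hundreds.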

  1. Scar-less multi-part DNA assembly design automation

    Science.gov (United States)

    Hillson, Nathan J.

    2016-06-07

    The present invention provides a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.
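
    Step (3) of the first embodiment, adding flanking homology sequences, can be pictured as giving each fragment a copy of the end of its predecessor so that neighbouring fragments overlap. The sketch below is a simplified illustration and not the patented method: the function name and the 4 bp overlap are assumptions, and a real design would also weigh melting temperature, secondary structure and which oligo carries the added bases.

```python
def add_flanking_homology(fragments: list[str], overlap_bp: int = 20,
                          circular: bool = False) -> list[str]:
    """Prefix each fragment with the last overlap_bp bases of the preceding one.

    Fragments are taken in assembly order. With circular=True the first
    fragment also receives homology to the last one, closing the construct.
    """
    designed = []
    for i, fragment in enumerate(fragments):
        if i > 0 or circular:
            # fragments[i - 1] wraps to the last fragment when i == 0.
            flank = fragments[i - 1][-overlap_bp:]
        else:
            flank = ""
        designed.append(flank + fragment)
    return designed

# Toy fragments in assembly order; real parts would be far longer.
ordered = ["AAAACCCC", "GGGGTTTT", "ACGTACGT"]
print(add_flanking_homology(ordered, overlap_bp=4))
print(add_flanking_homology(ordered, overlap_bp=4, circular=True))
```

    After this step every junction is represented on both neighbouring pieces, which is the property that homology-based assembly methods rely on.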

  2. Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

    DEFF Research Database (Denmark)

    Maretty, Lasse; Jensen, Jacob Malte; Petersen, Bent

    2017-01-01

    or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high......-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set...

  3. De novo assembly of human genomes with massively parallel short read sequencing

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Zhu, Hongmei; Ruan, Jue

    2010-01-01

    genomes from short read sequences. We successfully assembled both the Asian and African human genome sequences, achieving an N50 contig size of 7.4 and 5.9 kilobases (kb) and scaffold of 446.3 and 61.9 kb, respectively. The development of this de novo short read assembly method creates new opportunities...... for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way....
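
    The N50 values quoted in this abstract summarise contiguity: N50 is the length of the contig (or scaffold) at which the running total, taken from longest to shortest, first reaches half of the assembly size. A minimal sketch of that definition follows. The contig lengths are made up, and the function name is an assumption.

```python
def n50(lengths: list[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: running total always reaches half")

# Hypothetical contig lengths in kilobases.
contigs_kb = [80, 70, 50, 40, 30, 20]
print(n50(contigs_kb))
```

    Because N50 depends only on the length distribution, it says nothing about correctness; a misassembly that joins two contigs raises N50.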

  4. Magnetic bead purification of labeled DNA fragments for high-throughput capillary electrophoresis sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Elkin, Christopher; Kapur, Hitesh; Smith, Troy; Humphries, David; Pollard, Martin; Hammon, Nancy; Hawkins, Trevor

    2001-09-15

    We have developed an automated purification method for terminator sequencing products based on a magnetic bead technology. This 384-well protocol generates labeled DNA fragments that are essentially free of contaminants for less than $0.005 per reaction. In comparison to laborious ethanol precipitation protocols, this method increases the phred20 read length by forty bases with various DNA templates such as PCR fragments, plasmids, cosmids and RCA products. Our method eliminates centrifugation and is compatible with both the MegaBACE 1000 and ABI Prism 3700 capillary instruments. As of September 2001, this method has produced over 1.6 million samples, with 93 percent averaging 620 phred20 bases, as part of the Joint Genome Institute's production process.
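
    The phred20 read length used as the quality measure here counts the bases in a read whose phred score is at least 20. A phred score Q corresponds to an error probability of 10^(-Q/10), so Q20 means at most a 1% chance that the base call is wrong. The sketch below illustrates both relations. The quality values are invented and the function names are assumptions.

```python
def error_probability(phred_score: float) -> float:
    """Convert a phred quality score to the probability of a wrong base call."""
    return 10 ** (-phred_score / 10)

def phred20_length(qualities: list[int], threshold: int = 20) -> int:
    """Number of bases in a read with quality at or above the threshold."""
    return sum(q >= threshold for q in qualities)

# Hypothetical per-base qualities for one short read.
read_qualities = [35, 22, 19, 20, 8]
print(phred20_length(read_qualities))
print(error_probability(20))
```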

  5. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.

    Science.gov (United States)

    Chin, Chen-Shan; Alexander, David H; Marks, Patrick; Klammer, Aaron A; Drake, James; Heiner, Cheryl; Clum, Alicia; Copeland, Alex; Huddleston, John; Eichler, Evan E; Turner, Stephen W; Korlach, Jonas

    2013-06-01

    We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.

  6. Transcriptome sequencing and de novo assembly in arecanut, Areca catechu L elucidates the secondary metabolite pathway genes

    Directory of Open Access Journals (Sweden)

    Ramaswamy Manimekalai

    2018-03-01

    Areca catechu L. belongs to the Arecaceae family, which comprises many economically important palms. The palm is a source of alkaloids and carotenoids. The lack of ample genetic information in public databases has been a constraint for the genetic improvement of arecanut. To gain molecular insight into the palm, high throughput RNA sequencing and de novo assembly of the arecanut leaf transcriptome was undertaken in the present study. A total of 56,321,907 paired end reads of 101 bp length consisting of 11.343 Gb nucleotides were generated. De novo assembly resulted in 48,783 good quality transcripts, of which 67% could be annotated against the NCBI non-redundant database. The Gene Ontology (GO) analysis with the UniProt database identified 9222 biological process, 11268 molecular function and 7574 cellular component GO terms. Large scale expression profiling through Fragments per Kilobase per Million mapped reads (FPKM) showed major genes involved in different metabolic pathways of the plant. Metabolic pathway analysis of the assembled transcripts identified 124 plant related pathways. The transcripts related to carotenoid and alkaloid biosynthetic pathways had higher read counts and FPKM values, suggesting higher expression of these genes. The arecanut transcript sequences generated in the study showed high similarity with coconut, oil palm and date palm sequences retrieved from public domains. We also identified 6853 genic SSR regions in the arecanut. Possible primers were designed for SSR detection, and this would simplify future efforts in genetic characterization of arecanut.
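
    The FPKM measure used for expression profiling in this abstract normalises a transcript's fragment count by transcript length (in kilobases) and by library size (in millions of mapped fragments). A minimal sketch of the standard formula follows. The counts are invented for illustration and are not taken from the study.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

# Hypothetical transcript: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
print(fpkm(500, 2_000, 10_000_000))
```

    The double normalisation is what allows a long, moderately expressed transcript to be compared with a short, highly expressed one within the same library.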

  7. Long-read sequencing and de novo assembly of a Chinese genome

    Science.gov (United States)

    Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arr...

  8. Non PCR-amplified Transcripts and AFLP fragments as reduced representations of the quail genome for 454 Titanium sequencing

    Directory of Open Access Journals (Sweden)

    Leterrier Christine

    2010-07-01

    Background: SNP (Single Nucleotide Polymorphism) discovery is now routinely performed using high-throughput sequencing of reduced representation libraries. Our objective was to adapt 454 GS FLX based sequencing methodologies in order to obtain the largest possible dataset from two reduced representation libraries, produced by AFLP (Amplified Fragment Length Polymorphism) for genomic DNA, and EST (Expressed Sequence Tag) for the transcribed fraction of the genome. Findings: The expressed fraction was obtained by preparing cDNA libraries without PCR amplification from quail embryo and brain. To optimize the information content for SNP analyses, libraries were prepared from individuals selected in three quail lines and each individual in the AFLP library was tagged. Sequencing runs produced 399,189 sequence reads from cDNA and 373,484 from genomic fragments, covering close to 250 Mb of sequence in total. Conclusions: Both methods used to obtain reduced representations for high-throughput sequencing were successful after several improvements. The protocols may be used for several sequencing applications, such as de novo sequencing, tagged PCR fragments or long fragment sequencing of cDNA.

  9. Sequence context effects on 8-methoxypsoralen photobinding to defined DNA fragments

    International Nuclear Information System (INIS)

    Sage, E.; Moustacchi, E.

    1987-01-01

    The photoreaction of 8-methoxypsoralen (8-MOP) with DNA fragments of defined sequence was studied. The authors took advantage of the blockage by bulky adducts of the 3'-5'-exonuclease activity associated with the T4 DNA polymerase. The action of the exonuclease is stopped by biadducts as well as by monoadducts. The termination products were analyzed on sequencing gels. A strong sequence specificity was observed in the DNA photobinding of 8-MOP. The exonuclease terminates its digestion near thymine residues, mainly at potentially cross-linkable sites. There is an increasing reactivity of thymine residues in the order T < TT << TTT in a GC environment. For thymine residues in cross-linkable sites, the reactivity follows the order AT << TA ∼ TAT << ATA < ATAT < ATATAA. Repeated A-T sequences are hot spots for the photochemical reaction of 8-MOP with DNA. Both monoadducts and interstrand cross-links are formed preferentially in 5'-TpA sites. The results highlight the role of the sequence and consequently of the conformation around a potential site in the photobinding of 8-MOP to DNA

  10. Graph mining for next generation sequencing: leveraging the assembly graph for biological insights.

    Science.gov (United States)

    Warnke-Sommer, Julia; Ali, Hesham

    2016-05-06

    The assembly of Next Generation Sequencing (NGS) reads remains a challenging task. This is especially true for the assembly of metagenomics data that originate from environmental samples potentially containing hundreds to thousands of unique species. The principal objective of current assembly tools is to assemble NGS reads into contiguous stretches of sequence called contigs while maximizing both accuracy and contig length. The end goal of this process is to produce longer contigs, with the major focus being on assembly only. Sequence read assembly is an aggregative process, during which read overlap relationship information is lost as reads are merged into longer sequences or contigs. The assembly graph is information rich and capable of capturing the genomic architecture of an input read data set. We have developed a novel hybrid graph in which nodes represent sequence regions at different levels of granularity. This model, utilized in the assembly and analysis pipeline Focus, presents a concise yet feature-rich view of a given input data set, allowing for the extraction of biologically relevant graph structures for graph mining purposes. Focus was used to create hybrid graphs to model metagenomics data sets obtained from the gut microbiomes of five individuals with Crohn's disease and eight healthy individuals. Repetitive and mobile genetic elements are found to be associated with hybrid graph structure. Using graph mining techniques, a comparative study of the Crohn's disease and healthy data sets was conducted with a focus on antibiotic resistance genes associated with transposase genes. Results demonstrated significant differences in the phylogenetic distribution of categories of antibiotic resistance genes in the healthy and diseased patients. Focus was also evaluated as a pure assembly tool and produced excellent results when compared against the Meta-velvet, Omega, and UD-IDBA assemblers. Mining the hybrid graph can reveal biological phenomena captured...

  11. INTEGRATED APPROACH TO GENERATION OF PRECEDENCE RELATIONS AND PRECEDENCE GRAPHS FOR ASSEMBLY SEQUENCE PLANNING

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    An integrated approach to generation of precedence relations and precedence graphs for assembly sequence planning is presented, which contains more assembly flexibility. The approach involves two stages. Based on the assembly model, the components in the assembly can be divided into partially constrained components and completely constrained components in the first stage, and then geometric precedence relation for every component is generated automatically. According to the result of the first stage, the second stage determines and constructs all precedence graphs. The algorithms of these two stages proposed are verified by two assembly examples.

  12. A novel method to discover fluoroquinolone antibiotic resistance (qnr) genes in fragmented nucleotide sequences

    Directory of Open Access Journals (Sweden)

    Boulund Fredrik

    2012-12-01

    Background: Broad-spectrum fluoroquinolone antibiotics are central in modern health care and are used to treat and prevent a wide range of bacterial infections. The recently discovered qnr genes provide a mechanism of resistance with the potential to rapidly spread between bacteria using horizontal gene transfer. As for many antibiotic resistance genes present in pathogens today, qnr genes are hypothesized to originate from environmental bacteria. The vast amount of data generated by shotgun metagenomics can therefore be used to explore the diversity of qnr genes in more detail. Results: In this paper we describe a new method to identify qnr genes in nucleotide sequence data. We show, using cross-validation, that the method has high statistical power to correctly classify sequences from novel classes of qnr genes, even for fragments as short as 100 nucleotides. Based on sequences from public repositories, the method was able to identify all previously reported plasmid-mediated qnr genes. In addition, several fragments from novel putative qnr genes were identified in metagenomes. The method was also able to annotate 39 chromosomal variants, of which 11 have not previously been reported in the literature. Conclusions: The method described in this paper significantly improves the sensitivity and specificity of identification and annotation of qnr genes in nucleotide sequence data. The predicted novel putative qnr genes in the metagenomic data support the hypothesis of a large and uncharacterized diversity within this family of resistance genes in environmental bacterial communities. An implementation of the method is freely available at http://bioinformatics.math.chalmers.se/qnr/.

  13. Detection of viral sequence fragments of HIV-1 subfamilies yet unknown

    Directory of Open Access Journals (Sweden)

    Stanke Mario

    2011-04-01

    Background: Methods of determining whether or not any particular HIV-1 sequence stems, completely or in part, from some unknown HIV-1 subtype are important for the design of vaccines and molecular detection systems, as well as for epidemiological monitoring. Nevertheless, only a single algorithm, the Branching Index (BI), has been developed for this task so far. Moving along the genome of a query sequence in a sliding window, the BI computes a ratio quantifying how closely the query sequence clusters with a subtype clade. In its current version, however, the BI does not provide predicted boundaries of unknown fragments. Results: We have developed Unknown Subtype Finder (USF), an algorithm based on a probabilistic model, which automatically determines which parts of an input sequence originate from a subtype yet unknown. The underlying model is based on a simple profile hidden Markov model (pHMM) for each known subtype and an additional pHMM for an unknown subtype. The emission probabilities of the latter are estimated using the emission frequencies of the known subtypes by means of a (position-wise) probabilistic model for the emergence of new subtypes. We have applied USF to SIV and HIV-1 sequences formerly classified as having emerged from an unknown subtype. Moreover, we have evaluated its performance on artificial HIV-1 recombinants and non-recombinant HIV-1 sequences. The results have been compared with the corresponding results of the BI. Conclusions: Our results demonstrate that USF is suitable for detecting segments in HIV-1 sequences stemming from yet unknown subtypes. Comparing USF with the BI shows that our algorithm performs as well as the BI or better.

  14. Phylogenetic analysis of Gossypium L. using restriction fragment length polymorphism of repeated sequences.

    Science.gov (United States)

    Zhang, Meiping; Rong, Ying; Lee, Mi-Kyung; Zhang, Yang; Stelly, David M; Zhang, Hong-Bin

    2015-10-01

    Cotton is the world's leading textile fiber crop and is also grown as a bioenergy and food crop. Knowledge of the phylogeny of closely related species and the genome origin and evolution of polyploid species is significant for advanced genomics research and breeding. We have reconstructed the phylogeny of the cotton genus, Gossypium L., and deciphered the genome origin and evolution of its five polyploid species by restriction fragment analysis of repeated sequences. Nuclear DNA of 84 accessions representing 35 species and all eight genomes of the genus were analyzed. The phylogenetic tree of the genus was reconstructed using the parsimony method on 1033 polymorphic repeated sequence restriction fragments. The genome origin of its polyploids was determined by calculating the diploid-polyploid restriction fragment correspondence (RFC). The tree is consistent with the morphological classification, genome designation and geographic distribution of the species at subgenus, section and subsection levels. Gossypium lobatum (D7) was unambiguously shown to have the highest RFC with the D-subgenomes of all five polyploids of the genus, while the common ancestor of Gossypium herbaceum (A1) and Gossypium arboreum (A2) likely contributed to the A-subgenomes of the polyploids. These results provide a comprehensive phylogenetic tree of the cotton genus and new insights into the genome origin and evolution of its polyploid species. The results also further demonstrate a simple, rapid and inexpensive method suitable for phylogenetic analysis of closely related species, especially congeneric species, and the inference of genome origin of polyploids that constitute over 70 % of flowering plants.

  15. Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

    DEFF Research Database (Denmark)

    Maretty, Lasse; Jensen, Jacob Malte; Petersen, Bent

    2017-01-01

    Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits. Genetic variation is identified mainly by mapping short reads to the reference genome...... or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high......-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set...

  17. Discovery of human inversion polymorphisms by comparative analysis of human and chimpanzee DNA sequence assemblies.

    Directory of Open Access Journals (Sweden)

    2005-10-01

    With a draft genome-sequence assembly for the chimpanzee available, it is now possible to perform genome-wide analyses to identify, at a submicroscopic level, structural rearrangements that have occurred between chimpanzees and humans. The goal of this study was to investigate chromosomal regions that are inverted between the chimpanzee and human genomes. Using the net alignments for the builds of the human and chimpanzee genome assemblies, we identified a total of 1,576 putative regions of inverted orientation, covering more than 154 mega-bases of DNA. The DNA segments are distributed throughout the genome and range from 23 base pairs to 62 mega-bases in length. For the 66 inversions more than 25 kilobases (kb) in length, 75% were flanked on one or both sides by (often unrelated) segmental duplications. Using PCR and fluorescence in situ hybridization we experimentally validated 23 of 27 (85%) semi-randomly chosen regions; the largest novel inversion confirmed was 4.3 mega-bases at human Chromosome 7p14. Gorilla was used as an out-group to assign ancestral status to the variants. All experimentally validated inversion regions were then assayed against a panel of human samples and three of the 23 (13%) regions were found to be polymorphic in the human genome. These polymorphic inversions include 730 kb (at 7p22), 13 kb (at 7q11), and 1 kb (at 16q24) fragments with a 5%, 30%, and 48% minor allele frequency, respectively. Our results suggest that inversions are an important source of variation in primate genome evolution. The finding of at least three novel inversion polymorphisms in humans indicates this type of structural variation may be a more common feature of our genome than previously realized.

  18. SEQUENCING AND DE NOVO DRAFT ASSEMBLIES OF A FATHEAD MINNOW (Pimephales promelas) reference genome

    Data.gov (United States)

    U.S. Environmental Protection Agency — The dataset provides the URLs for accessing the genome sequence data and two draft assemblies as well as fathead minnow genotyping data associated with estimating...

  19. A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.

    Science.gov (United States)

    Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D

    2017-01-01

    This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches, and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall, and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly, which included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.

  20. Reducing assembly complexity of microbial genomes with single-molecule sequencing

    Science.gov (United States)

    Genome assembly algorithms cannot fully reconstruct microbial chromosomes from the DNA reads output by first or second-generation sequencing instruments. Therefore, most genomes are left unfinished due to the significant resources required to manually close gaps left in the draft assemblies. Single-...

  1. Mechanical fragmentation of nuclear reactor fuel assemblies by the double cutting method

    International Nuclear Information System (INIS)

    Voitsekhovskii, B.V.; Istomin, V.L.; Mitrofanov, V.V.

    1995-01-01

    A method is described for cutting a spent fuel assembly with straight shears into pieces of a prescribed size. The method does not require separation of the casing and the lattices. The double cutting method is briefly described, and experiments designed for cutting BN-350 and VVER-440 fuel assemblies are outlined. The testing showed that the cutting method was suitable for mechanical polarization of fuel assemblies. The investigations led to the development of turnkey industrial equipment for cutting spent fuel assemblies of different geometries with a maximum size up to 170 mm. 6 refs., 8 figs., 1 tab

  2. Norgal: Extraction and de novo assembly of mitochondrial DNA from whole-genome sequencing data

    DEFF Research Database (Denmark)

    Al-Nakeeb, Kosai Ali Ahmed; Petersen, Thomas Nordahl; Sicheritz-Pontén, Thomas

    2017-01-01

    and performing a de novo assembly on a subset of reads that contains these k-mers. The method was applied to WGS data from a panda, brown algae seaweed, butterfly and filamentous fungus. We were able to extract full circular mitochondrial genomes and obtained sequence identities to the reference sequences...

  3. Norgal: extraction and de novo assembly of mitochondrial DNA from whole-genome sequencing data.

    Science.gov (United States)

    Al-Nakeeb, Kosai; Petersen, Thomas Nordahl; Sicheritz-Pontén, Thomas

    2017-11-21

    Whole-genome sequencing (WGS) projects provide short read nucleotide sequences from nuclear and possibly organelle DNA depending on the source of origin. Mitochondrial DNA is present in animals and fungi, while plants contain DNA from both mitochondria and chloroplasts. Current techniques for separating organelle reads from nuclear reads in WGS data require full reference or partial seed sequences for assembling. Norgal (de Novo ORGAneLle extractor) avoids this requirement by identifying a high frequency subset of k-mers that are predominantly of mitochondrial origin and performing a de novo assembly on a subset of reads that contains these k-mers. The method was applied to WGS data from a panda, brown algae seaweed, butterfly and filamentous fungus. We were able to extract full circular mitochondrial genomes and obtained sequence identities to the reference sequences in the range from 98.5 to 99.5%. We also assembled the chloroplasts of grape vines and cucumbers using Norgal together with seed-based de novo assemblers. Norgal is a pipeline that can extract and assemble full or partial mitochondrial and chloroplast genomes from WGS short reads without prior knowledge. The program is available at: https://bitbucket.org/kosaidtu/norgal .
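
    The idea of selecting reads by high-frequency k-mers, as described in this abstract, can be sketched roughly as follows. This is not Norgal's code: the function names, the fixed `min_count` cutoff and the toy reads are assumptions made for illustration, and reverse complements are ignored for brevity.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def select_high_frequency_reads(reads, k, min_count):
    """Keep reads that contain at least one k-mer seen min_count or more times.

    The fixed cutoff is an assumption; a real pipeline would derive the
    threshold from the observed k-mer depth distribution.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    frequent = {km for km, n in counts.items() if n >= min_count}
    return [read for read in reads
            if any(km in frequent for km in kmers(read, k))]


# Toy reads: three overlap heavily (high copy number), one does not.
toy_reads = ["ACGTAC", "CGTACG", "TTTTGG", "GTACGT"]
selected = select_high_frequency_reads(toy_reads, k=4, min_count=2)
```

    The rationale is that organelle genomes are present in many copies per cell, so their k-mers tend to appear at much higher depth than nuclear k-mers; the selected subset of reads would then be passed to a de novo assembler.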

  4. NxRepair: error correction in de novo sequence assembly using Nextera mate pairs

    Directory of Open Access Journals (Sweden)

    Rebecca R. Murphy

    2015-06-01

    Scaffolding errors and incorrect repeat disambiguation during de novo assembly can result in large scale misassemblies in draft genomes. Nextera mate pair sequencing data provide additional information to resolve assembly ambiguities during scaffolding. Here, we introduce NxRepair, an open source toolkit for error correction in de novo assemblies that uses Nextera mate pair libraries to identify and correct large-scale errors. We show that NxRepair can identify and correct large scaffolding errors, without use of a reference sequence, resulting in quantitative improvements in the assembly quality. NxRepair can be downloaded from GitHub or PyPI, the Python Package Index; a tutorial and user documentation are also available.

  5. The sequence and de novo assembly of the giant panda genome

    Science.gov (United States)

    Li, Ruiqiang; Fan, Wei; Tian, Geng; Zhu, Hongmei; He, Lin; Cai, Jing; Huang, Quanfei; Cai, Qingle; Li, Bo; Bai, Yinqi; Zhang, Zhihe; Zhang, Yaping; Wang, Wen; Li, Jun; Wei, Fuwen; Li, Heng; Jian, Min; Li, Jianwen; Zhang, Zhaolei; Nielsen, Rasmus; Li, Dawei; Gu, Wanjun; Yang, Zhentao; Xuan, Zhaoling; Ryder, Oliver A.; Leung, Frederick Chi-Ching; Zhou, Yan; Cao, Jianjun; Sun, Xiao; Fu, Yonggui; Fang, Xiaodong; Guo, Xiaosen; Wang, Bo; Hou, Rong; Shen, Fujun; Mu, Bo; Ni, Peixiang; Lin, Runmao; Qian, Wubin; Wang, Guodong; Yu, Chang; Nie, Wenhui; Wang, Jinhuan; Wu, Zhigang; Liang, Huiqing; Min, Jiumeng; Wu, Qi; Cheng, Shifeng; Ruan, Jue; Wang, Mingwei; Shi, Zhongbin; Wen, Ming; Liu, Binghang; Ren, Xiaoli; Zheng, Huisong; Dong, Dong; Cook, Kathleen; Shan, Gao; Zhang, Hao; Kosiol, Carolin; Xie, Xueying; Lu, Zuhong; Zheng, Hancheng; Li, Yingrui; Steiner, Cynthia C.; Lam, Tommy Tsan-Yuk; Lin, Siyuan; Zhang, Qinghui; Li, Guoqing; Tian, Jing; Gong, Timing; Liu, Hongde; Zhang, Dejin; Fang, Lin; Ye, Chen; Zhang, Juanbin; Hu, Wenbo; Xu, Anlong; Ren, Yuanyuan; Zhang, Guojie; Bruford, Michael W.; Li, Qibin; Ma, Lijia; Guo, Yiran; An, Na; Hu, Yujie; Zheng, Yang; Shi, Yongyong; Li, Zhiqiang; Liu, Qing; Chen, Yanling; Zhao, Jing; Qu, Ning; Zhao, Shancen; Tian, Feng; Wang, Xiaoling; Wang, Haiyin; Xu, Lizhi; Liu, Xiao; Vinar, Tomas; Wang, Yajun; Lam, Tak-Wah; Yiu, Siu-Ming; Liu, Shiping; Zhang, Hemin; Li, Desheng; Huang, Yan; Wang, Xia; Yang, Guohua; Jiang, Zhi; Wang, Junyi; Qin, Nan; Li, Li; Li, Jingxiang; Bolund, Lars; Kristiansen, Karsten; Wong, Gane Ka-Shu; Olson, Maynard; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian; Wang, Jun

    2013-01-01

    Using next-generation sequencing technology alone, we have successfully generated and assembled a draft sequence of the giant panda genome. The assembled contigs (2.25 gigabases (Gb)) cover approximately 94% of the whole genome, and the remaining gaps (0.05 Gb) seem to contain carnivore-specific repeats and tandem repeats. Comparisons with the dog and human showed that the panda genome has a lower divergence rate. The assessment of panda genes potentially underlying some of its unique traits indicated that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition. We also identified more than 2.7 million heterozygous single nucleotide polymorphisms in the diploid genome. Our data and analyses provide a foundation for promoting mammalian genetic research, and demonstrate the feasibility for using next-generation sequencing technologies for accurate, cost-effective and rapid de novo assembly of large eukaryotic genomes. PMID:20010809

  6. RePS: a sequence assembler that masks exact repeats identified from the shotgun data

    DEFF Research Database (Denmark)

    Wang, Jun; Wong, Gane Ka-Shu; Ni, Peixiang

    2002-01-01

    We describe a sequence assembler, RePS (repeat-masked Phrap with scaffolding), that explicitly identifies exact 20mer repeats from the shotgun data and removes them prior to the assembly. The established software is used to compute meaningful error probabilities for each base. Clone......-end-pairing information is used to construct scaffolds that order and orient the contigs. We show with real data for human and rice that reasonable assemblies are possible even at coverages of only 4x to 6x, despite having up to 42.2% in exact repeats. Publication date: 2002-May...
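
    Masking exact k-mer repeats before assembly, as this abstract describes, might look like the sketch below. It is not the RePS implementation: the function name, the `max_count` cutoff and the use of `N` as the masking character are assumptions, and the toy example uses k=3 rather than 20 to stay readable.

```python
from collections import Counter


def mask_exact_repeats(reads, k, max_count):
    """Replace every k-mer that occurs more than max_count times with N's.

    In real shotgun data each unique k-mer is seen roughly once per unit of
    coverage, so max_count would need to sit well above the sequencing depth.
    """
    counts = Counter(read[i:i + k]
                     for read in reads
                     for i in range(len(read) - k + 1))
    masked_reads = []
    for read in reads:
        chars = list(read)
        for i in range(len(read) - k + 1):
            # Look up the k-mer in the original read, not the masked copy.
            if counts[read[i:i + k]] > max_count:
                chars[i:i + k] = "N" * k
        masked_reads.append("".join(chars))
    return masked_reads


# Toy example with k=3: "CCC" occurs in both reads and is masked.
masked = mask_exact_repeats(["AAACCC", "GGGCCC"], k=3, max_count=1)
```

    Masking rather than deleting keeps read coordinates intact, which is one plausible design choice; the abstract itself only states that the repeats are removed prior to assembly.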

  7. Tips and tricks for the assembly of a Corynebacterium pseudotuberculosis genome using a semiconductor sequencer

    DEFF Research Database (Denmark)

    Ramos, Rommel Thiago Jucá; Carneiro, Adriana Ribeiro; Soares, Siomar de Castro

    2013-01-01

    that enable reference-based assembly, such as the one used in the present study, Corynebacterium pseudotuberculosis biovar equi, which causes high economic losses in the US equine industry. The quality treatment strategy incorporated into the assembly pipeline enabled a 16-fold greater use of the sequencing...... was validated by comparative genomics with other species of the genus Corynebacterium. The present study presents a modus operandi that enables a greater and better use of data obtained from semiconductor sequencing for obtaining the complete genome from a prokaryotic microorganism, C. pseudotuberculosis, which...

  8. Genotypic characterization of Salmonella by multilocus sequence typing, pulsed-field gel electrophoresis and amplified fragment length polymorphism

    DEFF Research Database (Denmark)

    Torpdahl, Mia; Skov, Marianne N.; Sandvang, Dorthe

    2005-01-01

    subspecies enterica isolates. A total of 25 serotypes were investigated that had been isolated from humans or veterinary sources in Denmark between 1995 and 2001. All isolates were genotyped by multilocus sequence typing (MLST), pulsed-field gel electrophoresis (PFGE) and amplified fragment length...

  9. The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads.

    Science.gov (United States)

    Wang, Zhiwen; Hobson, Neil; Galindo, Leonardo; Zhu, Shilin; Shi, Daihu; McDill, Joshua; Yang, Linfeng; Hawkins, Simon; Neutelings, Godfrey; Datla, Raju; Lambert, Georgina; Galbraith, David W; Grassa, Christopher J; Geraldes, Armando; Cronk, Quentin C; Cullis, Christopher; Dash, Prasanta K; Kumar, Polumetla A; Cloutier, Sylvie; Sharpe, Andrew G; Wong, Gane K-S; Wang, Jun; Deyholos, Michael K

    2012-11-01

    Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole-genome shotgun sequencing of the nuclear genome of flax. Seven paired-end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina genome analyzer. A de novo assembly, comprised exclusively of deep-coverage (approximately 94× raw, approximately 69× filtered) short-sequence reads (44-100 bp), produced a set of scaffolds with N(50) = 694 kb, including contigs with N(50) = 20.1 kb. The contig assembly contained 302 Mb of non-redundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole-genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis-assembly of regions at the genome scale. A total of 43384 protein-coding genes were predicted in the whole-genome shotgun assembly, and up to 93% of published flax ESTs, and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (K(s)) observed within duplicate gene pairs was consistent with a recent (5-9 MYA) whole-genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam-A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together these results show that de novo assembly, based solely on whole-genome shotgun short-sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
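
    The N(50) statistic quoted in this abstract for scaffolds and contigs can be computed with a short routine. The sketch below uses the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length); the function name and the toy lengths are illustrative and are not data from the flax assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total.
        if running >= half_total:
            return length


# Toy contig lengths (total 290): 80 + 70 = 150 >= 145, so N50 is 70.
toy_n50 = n50([80, 70, 50, 40, 30, 20])
```

    A higher N50 indicates a more contiguous assembly, but it says nothing about correctness, which is why the abstract separately reports comparisons against independently sequenced BACs and fosmids.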

  10. Fragment-derived inhibitors of human N-myristoyltransferase block capsid assembly and replication of the common cold virus

    Science.gov (United States)

    Mousnier, Aurélie; Bell, Andrew S.; Swieboda, Dawid P.; Morales-Sanfrutos, Julia; Pérez-Dorado, Inmaculada; Brannigan, James A.; Newman, Joseph; Ritzefeld, Markus; Hutton, Jennie A.; Guedán, Anabel; Asfor, Amin S.; Robinson, Sean W.; Hopkins-Navratilova, Iva; Wilkinson, Anthony J.; Johnston, Sebastian L.; Leatherbarrow, Robin J.; Tuthill, Tobias J.; Solari, Roberto; Tate, Edward W.

    2018-06-01

    Rhinoviruses (RVs) are the pathogens most often responsible for the common cold, and are a frequent cause of exacerbations in asthma, chronic obstructive pulmonary disease and cystic fibrosis. Here we report the discovery of IMP-1088, a picomolar dual inhibitor of the human N-myristoyltransferases NMT1 and NMT2, and use it to demonstrate that pharmacological inhibition of host-cell N-myristoylation rapidly and completely prevents rhinoviral replication without inducing cytotoxicity. The identification of cooperative binding between weak-binding fragments led to rapid inhibitor optimization through fragment reconstruction, structure-guided fragment linking and conformational control over linker geometry. We show that inhibition of the co-translational myristoylation of a specific virus-encoded protein (VP0) by IMP-1088 potently blocks a key step in viral capsid assembly, to deliver a low nanomolar antiviral activity against multiple RV strains, poliovirus and foot-and-mouth disease virus, and protection of cells against virus-induced killing, highlighting the potential of host myristoylation as a drug target in picornaviral infections.

  11. New tool to assemble repetitive regions using next-generation sequencing data

    Science.gov (United States)

    Kuśmirek, Wiktor; Nowak, Robert M.; Neumann, Łukasz

    2017-08-01

    Next-generation sequencing techniques produce large amounts of sequencing data. Some parts of the genome are composed of repetitive DNA sequences, which are very problematic for existing genome assemblers. We propose a modification of the algorithm for DNA assembly that uses the relative frequency of reads to properly reconstruct repetitive sequences. The new approach was implemented and tested; as a demonstration of the capability of our software, we present some results for model organisms. The implementation uses a three-layer software architecture in which the presentation layer, data processing layer, and data storage layer are kept separate. Source code, a demo application with a web interface, and additional data are available at the project web page: http://dnaasm.sourceforge.net.

  12. Assembly of Highly Standardized Gene Fragments for High-Level Production of Porphyrins in E. coli

    DEFF Research Database (Denmark)

    Nielsen, Morten Thrane; Madsen, Karina Marie; Seppala, Susanna

    2015-01-01

    to formulate a molecular cloning pipeline and iteratively assemble and optimize a six-gene pathway for protoporphyrin IX synthesis in Escherichia coli. State of the art production levels were achieved through two simple cycles of engineering and screening. The principles defined here are generally applicable...

  13. When less is more: 'slicing' sequencing data improves read decoding accuracy and de novo assembly quality.

    Science.gov (United States)

    Lonardi, Stefano; Mirebrahim, Hamid; Wanamaker, Steve; Alpert, Matthew; Ciardo, Gianfranco; Duma, Denisa; Close, Timothy J

    2015-09-15

    Since the invention of DNA sequencing in the 70s, computational biologists have had to deal with the problem of de novo genome assembly with limited (or insufficient) depth of sequencing. In this work, we investigate the opposite problem, that is, the challenge of dealing with excessive depth of sequencing. We explore the effect of ultra-deep sequencing data in two domains: (i) the problem of decoding reads to bacterial artificial chromosome (BAC) clones (in the context of the combinatorial pooling design we have recently proposed), and (ii) the problem of de novo assembly of BAC clones. Using real ultra-deep sequencing data, we show that when the depth of sequencing increases over a certain threshold, sequencing errors make these two problems harder and harder (instead of easier, as one would expect with error-free data), and as a consequence the quality of the solution degrades with more and more data. For the first problem, we propose an effective solution based on 'divide and conquer': we 'slice' a large dataset into smaller samples of optimal size, decode each slice independently, and then merge the results. Experimental results on over 15 000 barley BACs and over 4000 cowpea BACs demonstrate a significant improvement in the quality of the decoding and the final assembly. For the second problem, we show for the first time that modern de novo assemblers cannot take advantage of ultra-deep sequencing data. Python scripts to process slices and resolve decoding conflicts are available from http://goo.gl/YXgdHT; software Hashfilter can be downloaded from http://goo.gl/MIyZHs. Contact: stelo@cs.ucr.edu or timothy.close@ucr.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
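
    The slice, decode and merge steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' scripts: the record format (a key paired with an observed BAC label), the toy `decode_slice` function and the majority-vote merge rule are all hypothetical stand-ins chosen only to show the divide-and-conquer structure.

```python
from collections import Counter, defaultdict

def make_slices(records, slice_size):
    """Cut the records into consecutive slices of at most slice_size items."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    return [records[i:i + slice_size] for i in range(0, len(records), slice_size)]

def decode_slice(records):
    """Toy decoder (hypothetical): give each key the label seen most often in this slice."""
    votes = defaultdict(Counter)
    for key, label in records:
        votes[key][label] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

def merge_decodings(decodings):
    """Resolve conflicts between slices by majority vote (an assumed merge rule)."""
    votes = defaultdict(Counter)
    for decoding in decodings:
        for key, label in decoding.items():
            votes[key][label] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

def slice_decode_merge(records, slice_size):
    """Decode every slice independently, then merge the per-slice calls."""
    slices = make_slices(records, slice_size)
    return merge_decodings([decode_slice(s) for s in slices])

# Toy input: the second slice mislabels r1, the other two slices outvote it.
records = [
    ("r1", "BAC_A"), ("r1", "BAC_A"), ("r2", "BAC_B"),
    ("r1", "BAC_C"), ("r1", "BAC_C"), ("r2", "BAC_B"),
    ("r1", "BAC_A"), ("r2", "BAC_B"), ("r2", "BAC_B"),
]
result = slice_decode_merge(records, slice_size=3)
```

    In this toy run a decoding error confined to one slice is corrected at the merge step, which is the intuition behind processing smaller samples independently.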

  14. Sequencing and De novo Draft Assemblies of the Fathead Minnow (Pimephales promelas) Reference Genome

    Science.gov (United States)

    This study was undertaken to develop genome-scale resources for the fathead minnow (Pimephales promelas), an important model organism widely used in both aquatic ecotoxicology research and in regulatory toxicity testing. We report on the first sequencing and two draft assemblies fo...

  15. fRMSDPred: Predicting Local RMSD Between Structural Fragments Using Sequence Information

    National Research Council Canada - National Science Library

    Rangwala, Huzefa; Karypis, George

    2007-01-01

    .... We present algorithms to solve this fragment-level RMSD prediction problem using a supervised learning framework based on support vector regression and classification that incorporates protein...

  16. Colloidal polymers with controlled sequence and branching constructed from magnetic field assembled nanoparticles.

    Science.gov (United States)

    Bannwarth, Markus B; Utech, Stefanie; Ebert, Sandro; Weitz, David A; Crespy, Daniel; Landfester, Katharina

    2015-03-24

    The assembly of nanoparticles into polymer-like architectures is challenging and usually requires highly defined colloidal building blocks. Here, we show that the broad size-distribution of a simple dispersion of magnetic nanocolloids can be exploited to obtain various polymer-like architectures. The particles are assembled under an external magnetic field and permanently linked by thermal sintering. The remarkable variety of polymer-analogue architectures that arises from this simple process ranges from statistical and block copolymer-like sequencing to branched chains and networks. This library of architectures can be realized by controlling the sequencing of the particles and the junction points via a size-dependent self-assembly of the single building blocks.

  17. Coevolutionary constraints in the sequence-space of macromolecular complexes reflect their self-assembly pathways.

    Science.gov (United States)

    Mallik, Saurav; Kundu, Sudip

    2017-07-01

    Is the order in which biomolecular subunits self-assemble into functional macromolecular complexes imprinted in their sequence-space? Here, we demonstrate that the temporal order of macromolecular complex self-assembly can be efficiently captured using the landscape of residue-level coevolutionary constraints. This predictive power of coevolutionary constraints is irrespective of the structural, functional, and phylogenetic classification of the complex and of the stoichiometry and quaternary arrangement of the constituent monomers. Combining this result with a number of structural attributes estimated from the crystal structure data, we find indications that stronger coevolutionary constraints at interfaces formed early in the assembly hierarchy probably promote coordinated fixation of mutations that lead to high-affinity binding with higher surface area, increased surface complementarity and elevated number of molecular contacts, compared to those that form late in the assembly. Proteins 2017; 85:1183-1189. © 2017 Wiley Periodicals, Inc.

  18. Transcriptome sequencing in an ecologically important tree species: assembly, annotation, and marker discovery

    Directory of Open Access Journals (Sweden)

    Benkman Craig W

    2010-03-01

    Background: Massively parallel sequencing of cDNA is now an efficient route for generating enormous sequence collections that represent expressed genes. This approach provides a valuable starting point for characterizing functional genetic variation in non-model organisms, especially where whole genome sequencing efforts are currently cost and time prohibitive. The large and complex genomes of pines (Pinus spp.) have hindered the development of genomic resources, despite the ecological and economical importance of the group. While most genomic studies have focused on a single species (P. taeda), genomic level resources for other pines are insufficiently developed to facilitate ecological genomic research. Lodgepole pine (P. contorta) is an ecologically important foundation species of montane forest ecosystems and exhibits substantial adaptive variation across its range in western North America. Here we describe a sequencing study of expressed genes from P. contorta, including their assembly and annotation, and their potential for molecular marker development to support population and association genetic studies. Results: We obtained 586,732 sequencing reads from a 454 GS XLR70 Titanium pyrosequencer (mean length: 306 base pairs). A combination of reference-based and de novo assemblies yielded 63,657 contigs, with 239,793 reads remaining as singletons. Based on sequence similarity with known proteins, these sequences represent approximately 17,000 unique genes, many of which are well covered by contig sequences. This sequence collection also included a surprisingly large number of retrotransposon sequences, suggesting that they are highly transcriptionally active in the tissues we sampled. We located and characterized thousands of simple sequence repeats and single nucleotide polymorphisms as potential molecular markers in our assembled and annotated sequences. High quality PCR primers were designed for a substantial number of the SSR loci

  19. IDENTIFICATION OF AVIAN-SPECIFIC FECAL METAGENOMIC SEQUENCES USING GENOME FRAGMENT ENRICHMENTS

    Science.gov (United States)

    Sequence analysis of microbial genomes has provided biologists the opportunity to compare genetic differences between closely related microorganisms. While random sequencing has also been used to study natural microbial communities, metagenomic comparisons via sequencing analysis...

  20. Programming molecular self-assembly of intrinsically disordered proteins containing sequences of low complexity

    Science.gov (United States)

    Simon, Joseph R.; Carroll, Nick J.; Rubinstein, Michael; Chilkoti, Ashutosh; López, Gabriel P.

    2017-06-01

    Dynamic protein-rich intracellular structures that contain phase-separated intrinsically disordered proteins (IDPs) composed of sequences of low complexity (SLC) have been shown to serve a variety of important cellular functions, which include signalling, compartmentalization and stabilization. However, our understanding of these structures and our ability to synthesize models of them have been limited. We present design rules for IDPs possessing SLCs that phase separate into diverse assemblies within droplet microenvironments. Using theoretical analyses, we interpret the phase behaviour of archetypal IDP sequences and demonstrate the rational design of a vast library of multicomponent protein-rich structures that ranges from uniform nano-, meso- and microscale puncta (distinct protein droplets) to multilayered orthogonally phase-separated granular structures. The ability to predict and program IDP-rich assemblies in this fashion offers new insights into (1) genetic-to-molecular-to-macroscale relationships that encode hierarchical IDP assemblies, (2) design rules of such assemblies in cell biology and (3) molecular-level engineering of self-assembled recombinant IDP-rich materials.

  1. De novo assembly, characterization and functional annotation of pineapple fruit transcriptome through massively parallel sequencing.

    Science.gov (United States)

    Ong, Wen Dee; Voo, Lok-Yung Christopher; Kumar, Vijay Subbiah

    2012-01-01

    Pineapple (Ananas comosus var. comosus) is an important tropical non-climacteric fruit with high commercial potential. Understanding the mechanisms and processes underlying fruit ripening would enable scientists to improve quality traits such as flavor, texture, appearance and fruit sweetness. Although the pineapple is an important fruit, insufficient transcriptomic or genomic information is available in public databases. Application of high-throughput transcriptome sequencing to profile the pineapple fruit transcripts is therefore needed. To facilitate this, we have performed transcriptome sequencing of ripe yellow pineapple fruit flesh using Illumina technology. About 4.7 million Illumina paired-end reads were generated and assembled using the Velvet de novo assembler. The assembly produced 28,728 unique transcripts with a mean length of approximately 200 bp. Sequence similarity search against the non-redundant NCBI database identified a total of 16,932 unique transcripts (58.93%) with significant hits. Out of these, 15,507 unique transcripts were assigned to gene ontology terms. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes pathway database identified 13,598 unique transcripts (47.33%), which were mapped to 126 pathways. The assembly revealed many transcripts that were previously unknown. The unique transcripts derived from this work have rapidly increased the number of pineapple fruit mRNA transcripts now available in public databases. This information can be further utilized in gene expression, genomics and other functional genomics studies in pineapple.

  2. MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    Energy Technology Data Exchange (ETDEWEB)

    Sakakibara, Yasumbumi

    2011-10-13

    Keio University's Yasumbumi Sakakibara on "MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  3. Assembly and melting of DNA nanotubes from single-sequence tiles

    International Nuclear Information System (INIS)

    Sobey, T L; Renner, S; Simmel, F C

    2009-01-01

    DNA melting and renaturation studies are an extremely valuable tool to study the kinetics and thermodynamics of duplex dissociation and reassociation reactions. These are important not only in a biological or biotechnological context, but also for DNA nanotechnology which aims at the construction of molecular materials by DNA self-assembly. We here study experimentally the formation and melting of a DNA nanotube structure, which is composed of many copies of an oligonucleotide containing several palindromic sequences. This is done using temperature-controlled UV absorption measurements correlated with atomic force microscopy, fluorescence microscopy and transmission electron microscopy techniques. In the melting studies, important factors such as DNA strand concentration, hierarchy of assembly and annealing protocol are investigated. Assembly and melting of the nanotubes are shown to proceed via different pathways. Whereas assembly occurs in several hierarchical steps related to the formation of tiles, lattices and tubes, melting of DNA nanotubes appears to occur in a single step. This is proposed to relate to fundamental differences between closed, three-dimensional tube-like structures and open, two-dimensional lattices. DNA melting studies can lead to a better understanding of the many factors that affect the assembly process which will be essential for the assembly of increasingly complex DNA nanostructures.

  4. Characterization of Liaoning cashmere goat transcriptome: sequencing, de novo assembly, functional annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Hongliang Liu

    The Liaoning cashmere goat is a goat breed famous for its cashmere wool. In order to increase the transcriptome data and accelerate genetic improvement for this breed, we performed de novo transcriptome sequencing to generate the first expressed sequence tag dataset for the Liaoning cashmere goat, using next-generation sequencing technology. Transcriptome sequencing of Liaoning cashmere goat on a Roche 454 platform yielded 804,601 high-quality reads. Clustering and assembly of these reads produced a non-redundant set of 117,854 unigenes, comprising 13,194 isotigs and 104,660 singletons. Based on similarity searches with known proteins, 17,356 unigenes were assigned to 6,700 GO categories, and the terms were summarized into three main GO categories and 59 sub-categories. 3,548 and 46,778 unigenes had significant similarity to existing sequences in the KEGG and COG databases, respectively. Comparative analysis revealed that 42,254 unigenes were aligned to 17,532 different sequences in NCBI non-redundant nucleotide databases. 97,236 (82.51%) unigenes were mapped to the 30 goat chromosomes. 35,551 (30.17%) unigenes were matched to 11,438 reported goat protein-coding genes. The remaining non-matched unigenes were further compared with cattle and human reference genes, and 67 putative new goat genes were discovered. Additionally, 2,781 potential simple sequence repeats were initially identified from all unigenes. The transcriptome of Liaoning cashmere goat was deep sequenced, de novo assembled, and annotated, providing abundant data to better understand the Liaoning cashmere goat transcriptome. The potential simple sequence repeats provide a material basis for future genetic linkage and quantitative trait loci analyses.

  5. Improving transcriptome assembly through error correction of high-throughput sequence reads

    Directory of Open Access Journals (Sweden)

    Matthew D. MacManes

    2013-07-01

    The study of functional genomics, particularly in non-model organisms, has been dramatically improved over the last few years by the use of transcriptomes and RNAseq. While these studies are potentially extremely powerful, a computationally intensive procedure, the de novo construction of a reference transcriptome, must be completed as a prerequisite to further analyses. An accurate reference is critically important, as all downstream steps, including estimating transcript abundance, depend on it. Though a substantial amount of research has been done on assembly, only recently have the pre-assembly procedures been studied in detail. Specifically, several stand-alone error correction modules have been reported on and, while they have been shown to be effective in reducing errors at the level of sequencing reads, how error correction impacts assembly accuracy is largely unknown. Here, we show, using a simulated and an empirical dataset, that applying error correction to sequencing reads has significant positive effects on assembly accuracy, and should be applied to all datasets. A complete collection of commands which will allow for the production of Reptile corrected reads is available at https://github.com/macmanes/error_correction/tree/master/scripts and as File S1.

  6. Genetic alterations of hepatocellular carcinoma by random amplified polymorphic DNA analysis and cloning sequencing of tumor differential DNA fragment

    Science.gov (United States)

    Xian, Zhi-Hong; Cong, Wen-Ming; Zhang, Shu-Hui; Wu, Meng-Chao

    2005-01-01

    AIM: To study the genetic alterations and their association with clinicopathological characteristics of hepatocellular carcinoma (HCC), and to find the tumor related DNA fragments. METHODS: DNA isolated from tumors and corresponding noncancerous liver tissues of 56 HCC patients was amplified by random amplified polymorphic DNA (RAPD) with 10 random 10-mer arbitrary primers. The RAPD bands showing obvious differences in tumor tissue DNA corresponding to that of normal tissue were separated, purified, cloned and sequenced. DNA sequences were analyzed and compared with GenBank data. RESULTS: A total of 56 cases of HCC were demonstrated to have genetic alterations, which were detected by at least one primer. The detectability of genetic alterations ranged from 20% to 70% in each case, and 17.9% to 50% in each primer. Serum HBV infection, tumor size, histological grade, tumor capsule, as well as tumor intrahepatic metastasis, might be correlated with genetic alterations on certain primers. A band with a higher intensity of 480 bp or so amplified fragments in tumor DNA relative to normal DNA could be seen in 27 of 56 tumor samples using primer 4. Sequence analysis of these fragments showed 91% homology with Homo sapiens double homeobox protein DUX10 gene. CONCLUSION: Genetic alterations are a frequent event in HCC, and tumor related DNA fragments have been found in this study, which may be associated with hepatocarcinogenesis. RAPD is an effective method for the identification and analysis of genetic alterations in HCC, and may provide new information for further evaluating the molecular mechanism of hepatocarcinogenesis. PMID:15996039

  7. Virtual Genome Walking across the 32 Gb Ambystoma mexicanum genome; assembling gene models and intronic sequence.

    Science.gov (United States)

    Evans, Teri; Johnson, Andrew D; Loose, Matthew

    2018-01-12

    Large repeat rich genomes present challenges for assembly using short read technologies. The 32 Gb axolotl genome is estimated to contain ~19 Gb of repetitive DNA making an assembly from short reads alone effectively impossible. Indeed, this model species has been sequenced to 20× coverage but the reads could not be conventionally assembled. Using an alternative strategy, we have assembled subsets of these reads into scaffolds describing over 19,000 gene models. We call this method Virtual Genome Walking as it locally assembles whole genome reads based on a reference transcriptome, identifying exons and iteratively extending them into surrounding genomic sequence. These assemblies are then linked and refined to generate gene models including upstream and downstream genomic, and intronic, sequence. Our assemblies are validated by comparison with previously published axolotl bacterial artificial chromosome (BAC) sequences. Our analyses of axolotl intron length, intron-exon structure, repeat content and synteny provide novel insights into the genic structure of this model species. This resource will enable new experimental approaches in axolotl, such as ChIP-Seq and CRISPR and aid in future whole genome sequencing efforts. The assembled sequences and annotations presented here are freely available for download from https://tinyurl.com/y8gydc6n . The software pipeline is available from https://github.com/LooseLab/iterassemble .
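
    The idea of iteratively extending a seed (for example an exon identified from the transcriptome) into surrounding genomic sequence can be illustrated with a naive greedy overlap extension in Python. This is only a sketch under assumptions: it is not the authors' pipeline, the function names `extend_right` and `extend_both` and the `min_overlap` parameter are hypothetical, reads are assumed error-free and on one strand, and no attempt is made to resolve repeats or branching.

```python
def extend_right(seed, reads, min_overlap=4, max_rounds=100):
    """Greedily extend seed rightwards with reads whose prefix matches the contig's suffix."""
    contig = seed
    for _ in range(max_rounds):  # bounded, so repeats cannot loop forever
        best_ext = ""
        for read in reads:
            # Try the longest possible overlap first; the read must add at least one base.
            for olap in range(min(len(read) - 1, len(contig)), min_overlap - 1, -1):
                if contig.endswith(read[:olap]):
                    ext = read[olap:]
                    if len(ext) > len(best_ext):
                        best_ext = ext
                    break
        if not best_ext:
            break  # no read overlaps the current end
        contig += best_ext
    return contig

def extend_both(seed, reads, min_overlap=4):
    """Extend to the right, then to the left by running the same routine on reversed strings."""
    right = extend_right(seed, reads, min_overlap)
    reversed_reads = [r[::-1] for r in reads]
    return extend_right(right[::-1], reversed_reads, min_overlap)[::-1]

# Toy example: three overlapping reads and two different seeds.
reads = ["GGCGTACC", "TACCTTAG", "CTTAGGCA"]
walked_right = extend_right("ATGGCG", reads)
walked_both = extend_both("TACCTT", reads)
```

    Here the seed grows one read at a time until no read overlaps its end; a real implementation would also need to handle sequencing errors, reverse complements and repeat-induced ambiguity.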

  8. Multi-objective Analysis for a Sequencing Planning of Mixed-model Assembly Line

    Science.gov (United States)

    Shimizu, Yoshiaki; Waki, Toshiya; Yoo, Jae Kyu

    Diversified customer demands are making just-in-time and agile manufacturing more important than ever. Accordingly, mixed-model assembly lines are becoming popular as a way to realize small-lot, multi-kind production. Since such a line produces various kinds of products on the same assembly line, rational management is of special importance. From this point of view, this study focuses on a sequencing problem for a mixed-model assembly line that has a paint line as its preceding process. When the paint line is taken into account, reducing work-in-process (WIP) inventory between these heterogeneous lines becomes a major concern of the sequencing problem, in addition to improving production efficiency. We have formulated the sequencing problem as a bi-objective optimization problem that aims to prevent various line stoppages and to reduce the volume of WIP inventory simultaneously. We then propose a practical method for the multi-objective analysis: we apply the weighting method to derive the Pareto front, and the resulting problem is solved by a meta-heuristic method such as simulated annealing (SA). Through numerical experiments, we verified the validity of the proposed approach and discussed the significance of trade-off analysis between the conflicting objectives.
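
    The weighting method combined with simulated annealing can be sketched as follows. This is an illustrative toy under assumptions, not the authors' formulation: the two objective functions (`stoppage_risk`, counting adjacent heavy-workload jobs, and `wip_proxy`, the total displacement from the paint-line order), the swap neighbourhood, the cooling schedule and the job names are all hypothetical. For each weight w the scalarized cost w*f1 + (1 - w)*f2 is minimized, and the non-dominated outcomes are collected.

```python
import math
import random

def stoppage_risk(seq, heavy):
    """Toy objective 1: number of adjacent pairs of heavy-workload jobs."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a in heavy and b in heavy)

def wip_proxy(seq, paint_order):
    """Toy objective 2: total displacement of jobs from the paint-line order."""
    pos = {job: i for i, job in enumerate(paint_order)}
    return sum(abs(i - pos[job]) for i, job in enumerate(seq))

def anneal(paint_order, heavy, w, steps=2000, t0=5.0, cooling=0.995, seed=0):
    """Minimise w*f1 + (1-w)*f2 by simulated annealing with swap moves."""
    rng = random.Random(seed)

    def cost(s):
        return w * stoppage_risk(s, heavy) + (1 - w) * wip_proxy(s, paint_order)

    current = list(paint_order)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    temp = t0
    for _ in range(steps):
        i, j = rng.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]
        delta = cost(current) - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_cost += delta  # accept the move
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        else:
            current[i], current[j] = current[j], current[i]  # reject: undo the swap
        temp *= cooling
    return best

def pareto_front(points):
    """Keep the points not dominated by any other point (both objectives minimised)."""
    pts = set(points)
    return sorted(p for p in pts
                  if not any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in pts))

paint_order = ["a1", "a2", "a3", "b1", "b2", "b3"]  # jobs batched by colour
heavy = {"a1", "a2", "a3"}                          # jobs with long assembly times
front = pareto_front(
    (stoppage_risk(s, heavy), wip_proxy(s, paint_order))
    for w in (0.0, 0.25, 0.5, 0.75, 1.0)
    for s in [anneal(paint_order, heavy, w)]
)
```

    One known limitation of the weighting method is that it can only reach points on the convex part of the Pareto front, so a sweep over weights may miss some trade-offs.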

  9. Orthology Guided Assembly in highly heterozygous crops

    DEFF Research Database (Denmark)

    Ruttink, Tom; Sterck, Lieven; Rohde, Antje

    2013-01-01

    Despite current advances in next-generation sequencing data analysis procedures, de novo assembly of a reference sequence required for SNP discovery and expression analysis is still a major challenge in genetically uncharacterized, highly heterozygous species. High levels of polymorphism inherent to outbreeding crop species hamper De Bruijn Graph-based de novo assembly algorithms, causing transcript fragmentation and the redundant assembly of allelic contigs. If multiple genotypes are sequenced to study genetic diversity, primary de novo assembly is best performed per genotype to limit the level of polymorphism and avoid transcript fragmentation. Here, we propose an Orthology Guided Assembly procedure that first uses sequence similarity (tBLASTn) to proteins of a model species to select allelic and fragmented contigs from all genotypes and then performs CAP3 clustering on a gene-by-gene basis. Thus, we...

  10. Moleculo Long-Read Sequencing Facilitates Assembly and Genomic Binning from Complex Soil Metagenomes

    Energy Technology Data Exchange (ETDEWEB)

    White, Richard Allen; Bottos, Eric M.; Roy Chowdhury, Taniya; Zucker, Jeremy D.; Brislawn, Colin J.; Nicora, Carrie D.; Fansler, Sarah J.; Glaesemann, Kurt R.; Glass, Kevin; Jansson, Janet K.; Langille, Morgan

    2016-06-28

    ABSTRACT

    Soil metagenomics has been touted as the “grand challenge” for metagenomics, as the high microbial diversity and spatial heterogeneity of soils make them unamenable to current assembly platforms. Here, we aimed to improve soil metagenomic sequence assembly by applying the Moleculo synthetic long-read sequencing technology. In total, we obtained 267 Gbp of raw sequence data from a native prairie soil; these data included 109.7 Gbp of short-read data (~100 bp) from the Joint Genome Institute (JGI), an additional 87.7 Gbp of rapid-mode read data (~250 bp), plus 69.6 Gbp (>1.5 kbp) from Moleculo sequencing. The Moleculo data alone yielded over 5,600 reads of >10 kbp in length, and over 95% of the unassembled reads mapped to contigs of >1.5 kbp. Hybrid assembly of all data resulted in more than 10,000 contigs over 10 kbp in length. We mapped three replicate metatranscriptomes derived from the same parent soil to the Moleculo subassembly and found that 95% of the predicted genes, based on their assignments to Enzyme Commission (EC) numbers, were expressed. The Moleculo subassembly also enabled binning of >100 microbial genome bins. We obtained via direct binning the first complete genome, that of “Candidatus Pseudomonas sp. strain JKJ-1” from a native soil metagenome. By mapping metatranscriptome sequence reads back to the bins, we found that several bins corresponding to low-relative-abundance Acidobacteria were highly transcriptionally active, whereas bins corresponding to high-relative-abundance Verrucomicrobia were not. These results demonstrate that Moleculo sequencing provides a significant advance for resolving complex soil microbial communities.

    IMPORTANCE Soil microorganisms carry out key processes for life on our planet, including cycling of carbon and other nutrients and supporting growth of plants. However, there is poor molecular-level understanding of their

  11. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth.

    Science.gov (United States)

    Peng, Yu; Leung, Henry C M; Yiu, S M; Chin, Francis Y L

    2012-06-01

    Next-generation sequencing allows us to sequence reads from a microbial environment using single-cell sequencing or metagenomic sequencing technologies. However, both technologies suffer from the problem that the sequencing depths of different regions of a genome, or of genomes from different species, are highly uneven. Most existing genome assemblers assume that sequencing depths are even, and therefore fail to construct correct long contigs. We introduce the IDBA-UD algorithm, which is based on the de Bruijn graph approach, for assembling reads from single-cell sequencing or metagenomic sequencing technologies with uneven sequencing depths. Several non-trivial techniques have been employed to tackle the problems. Instead of using a simple threshold, we use multiple depth-relative thresholds to remove erroneous k-mers in both low-depth and high-depth regions. The technique of local assembly with paired-end information is used to solve the branch problem of low-depth short repeat regions. To speed up the process, an error correction step is conducted to correct reads of high-depth regions that can be aligned to high-confident contigs. Comparison of the performances of IDBA-UD and existing assemblers (Velvet, Velvet-SC, SOAPdenovo and Meta-IDBA) on different datasets shows that IDBA-UD can reconstruct longer contigs with higher accuracy. The IDBA-UD toolkit is available at our website http://www.cs.hku.hk/~alse/idba_ud
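
    The difference between a single global threshold and a depth-relative threshold can be illustrated with a small Python sketch. This is a deliberate simplification and not IDBA-UD's actual procedure: here the local depth is assumed to be the median k-mer count within each read, and the `ratio` parameter, the function names and the toy reads are hypothetical.

```python
from collections import Counter
from statistics import median

def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def count_kmers(reads, k):
    """Count every k-mer over all reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts

def filter_global(counts, min_count):
    """Baseline: keep k-mers seen at least min_count times, regardless of context."""
    return {km for km, c in counts.items() if c >= min_count}

def filter_relative(reads, counts, k, ratio=0.2):
    """Keep a k-mer if, in some read, its count is at least ratio * that read's median count."""
    kept = set()
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue
        # Assumed proxy for local depth; robust while fewer than half the k-mers are wrong.
        local_depth = median(counts[km] for km in read_kmers)
        kept.update(km for km in read_kmers if counts[km] >= ratio * local_depth)
    return kept

# Toy data: a high-depth region, the same region with a final-base error, a low-depth region.
reads = ["ACGTACGGTC"] * 20 + ["ACGTACGGTA"] * 2 + ["TTGACCATGA"] * 2
counts = count_kmers(reads, 4)
kept = filter_relative(reads, counts, 4)
```

    In this toy data the erroneous k-mer GGTA and the genuine low-depth k-mers both occur twice, so no global cutoff can separate them, whereas the relative rule removes GGTA (2 is far below the local depth of 22) and keeps the low-depth region.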

  12. Two-Stage orders sequencing system for mixed-model assembly

    Science.gov (United States)

    Zemczak, M.; Skolud, B.; Krenczyk, D.

    2015-11-01

    In this paper, the authors focus on the NP-hard problem of order sequencing, formulated similarly to the Car Sequencing Problem (CSP). The object of the research is an assembly line in an automotive company on which a few different product models, each in a number of versions, are assembled on shared resources arranged in a line. This type of production is usually called mixed-model production, and it arose from the need to manufacture customized products on the basis of very specific orders from individual clients. Because competition in the automotive market is strong, producers are now obliged to let each client specify a large number of features of the product they want to buy. Because the problem is NP-hard, only satisfactory solutions are sought within the given time period, as a method for finding the optimal solution has not yet been found. Most researchers who applied approximate methods (e.g. evolutionary algorithms) to sequencing problems abandoned the research after the testing phase, as they were unable to obtain reproducible results and had difficulty determining the quality of the solutions obtained. Therefore, a new approach to the problem, presented in this paper as a sequencing system, is being developed. The sequencing system consists of a set of fixed rules implemented in a computer environment. The system works in two stages. The first determines the place in the storage buffer to which particular production orders should be sent. In the second stage, precise sets of sequences are determined and evaluated for particular parts of the storage buffer under given criteria.

  13. De novo sequencing, assembly and characterization of antennal transcriptome of Anomala corpulenta Motschulsky (Coleoptera: Rutelidae).

    Directory of Open Access Journals (Sweden)

    Haoliang Chen

    Anomala corpulenta is an important insect pest and can cause enormous economic losses in agriculture, horticulture and forestry. It is widely distributed in China, and both larvae and adults can cause serious damage. It is difficult to control this pest because the larvae live underground. Any new control strategy should exploit alternatives to heavily and frequently used chemical insecticides. However, little genetic research has been carried out on A. corpulenta due to the lack of genomic resources. Genomic resources could be produced by next generation sequencing technologies at low cost and in a short time. In this study, we performed de novo sequencing, assembly and characterization of the antennal transcriptome of A. corpulenta. Illumina sequencing technology was used to sequence the antennal transcriptome of A. corpulenta. Approximately 76.7 million total raw reads and about 68.9 million total clean reads were obtained, and then 35,656 unigenes were assembled. Of these unigenes, 21,463 could be annotated in the NCBI nr database, and, among the annotated unigenes, 11,154 and 6,625 unigenes could be assigned to GO and COG, respectively. Additionally, 16,350 unigenes could be annotated in the Swiss-Prot database, and 14,499 unigenes could map onto 258 pathways in the KEGG Pathway database. We also found 24 unigenes related to OBPs, 6 to CSPs, and in total 167 unigenes related to chemodetection. We analyzed 4 OBP and 3 CSP sequences, and their RT-qPCR results agreed well with their FPKM values. We produced the first large-scale antennal transcriptome of A. corpulenta, which is a species that has little genomic information in public databases. The identified chemodetection unigenes can promote the molecular mechanistic study of behavior in A. corpulenta. These findings provide a general sequence resource for molecular genetics research on A. corpulenta.

  14. Self-assembly of block copolymer micelles: synthesis via reversible addition-fragmentation chain transfer polymerization and aqueous solution properties.

    Science.gov (United States)

    Mya, Khine Y; Lin, Esther M J; Gudipati, Chakravarthy S; Gose, Halima B A S; He, Chaobin

    2010-07-22

    Poly(hexafluorobutyl methacrylate) (PHFBMA) homopolymer was synthesized by reversible addition-fragmentation chain transfer (RAFT)-mediated living radical polymerization in the presence of cyano-2-propyl dithiobenzoate (CPDB) RAFT agent. A block copolymer of PHFBMA-poly(propylene glycol acrylate) (PHFBMA-b-PPGA) with dangling poly(propylene glycol) (PPG) side chains was then synthesized by using CPDB-terminated PHFBMA as a macro-RAFT agent. The amphiphilic properties and self-assembly of PHFBMA-b-PPGA block copolymer in aqueous solution were investigated by dynamic and static light scattering (DLS and SLS) studies, in combination with fluorescence spectroscopy and transmission electron microscopy (TEM). Although PPG shows moderately hydrophilic character, the formation of nanosize polymeric micelles was confirmed by fluorescence and TEM studies. The low value of the critical aggregation concentration indicated that the tendency for the formation of copolymer aggregates in aqueous solution was very high due to the strong hydrophobicity of the PHFBMA(145)-b-PPGA(33) block copolymer. The combination of DLS and SLS measurements revealed the existence of micellar aggregates in aqueous solution with an association number of approximately 40 ± 7 for block copolymer micelles. TEM observation also showed that 40-50 micelles accumulate into one aggregate and that these micelles are loosely packed inside the aggregate.

  15. Binning of shallowly sampled metagenomic sequence fragments reveals that low abundance bacteria play important roles in sulfur cycling and degradation of complex organic polymers in an acid mine drainage community

    Science.gov (United States)

    Dick, G. J.; Andersson, A.; Banfield, J. F.

    2007-12-01

    Our understanding of environmental microbiology has been greatly enhanced by community genome sequencing of DNA recovered directly from the environment. Community genomics provides insights into the diversity, community structure, metabolic function, and evolution of natural populations of uncultivated microbes, thereby revealing dynamics of how microorganisms interact with each other and their environment. Recent studies have demonstrated the potential for reconstructing near-complete genomes from natural environments while highlighting the challenges of analyzing community genomic sequence, especially from diverse environments. A major challenge of shotgun community genome sequencing is identification of DNA fragments from minor community members for which only low coverage of genomic sequence is present. We analyzed community genome sequence retrieved from biofilms in an acid mine drainage (AMD) system in the Richmond Mine at Iron Mountain, CA, with an emphasis on identification and assembly of DNA fragments from low-abundance community members. The Richmond mine hosts an extensive, relatively low diversity subterranean chemolithoautotrophic community that is sustained entirely by oxidative dissolution of pyrite. The activity of these microorganisms greatly accelerates the generation of AMD. Previous and ongoing work in our laboratory has focused on reconstructing genomes of dominant community members, including several bacteria and archaea. We binned contigs from several samples (including one new sample and two that had been previously analyzed) by tetranucleotide frequency with clustering by Self-Organizing Maps (SOM). The binning, evaluated by comparison with information from the manually curated assembly of the dominant organisms, was found to be very effective: fragments were correctly assigned with 95% accuracy. Improperly assigned fragments often contained sequences that are either evolutionarily constrained (e.g. 16S rRNA genes) or mobile elements that are
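
    The tetranucleotide-frequency step mentioned in the record above can be illustrated with a minimal sketch. It only builds the 256-dimensional composition vector for one contig; the SOM clustering used by the authors is not reproduced here. The function name `tetranucleotide_frequencies` is hypothetical, and the choices to skip windows containing ambiguous bases and not to merge reverse complements are assumptions of this sketch, not details taken from the abstract.

```python
from itertools import product


def tetranucleotide_frequencies(seq):
    """Return the relative frequency of each of the 256 possible 4-mers.

    Assumption of this sketch: windows containing characters other than
    A, C, G, T (for example N) are skipped, and reverse complements are
    counted separately.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = dict.fromkeys(kmers, 0)
    # Slide a window of width 4 along the contig, one base at a time.
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        # Contig too short or entirely ambiguous: return an all-zero vector.
        return {k: 0.0 for k in kmers}
    return {k: c / total for k, c in counts.items()}


if __name__ == "__main__":
    freqs = tetranucleotide_frequencies("ACGTACGTACGTNNACGT")
    # Show only the 4-mers that actually occur in the toy contig.
    print({k: round(v, 3) for k, v in freqs.items() if v > 0})
```

    Vectors of this kind, one per contig, would then be passed to a clustering method of choice so that contigs with similar composition fall into the same bin.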

  16. Draft Genome Sequence of a “Candidatus Liberibacter europaeus” Strain Assembled from Broom Psyllids (Arytainilla spartiophila) from New Zealand

    Science.gov (United States)

    Thompson, Sarah M.; Kalamorz, Falk; David, Charles; Addison, Shea M.; Smith, Grant R.

    2018-01-01

    Here, we report the draft genome sequence of “Candidatus Liberibacter europaeus” ASNZ1, assembled from broom psyllids (Arytainilla spartiophila) from New Zealand. The assembly comprises 15 contigs, with a total length of 1.33 Mb and a G+C content of 33.5%. PMID:29773636

  17. De novo transcriptome sequencing and assembly from apomictic and sexual Eragrostis curvula genotypes.

    Directory of Open Access Journals (Sweden)

    Ingrid Garbus

    Full Text Available A long-standing goal in plant breeding has been the ability to confer apomixis to agriculturally relevant species, which would require a deeper comprehension of the molecular basis of apomictic regulatory mechanisms. Eragrostis curvula (Schrad.) Nees is a perennial grass that includes both sexual and apomictic cytotypes. The availability of a reference transcriptome for this species would constitute a very important tool toward the identification of genes controlling key steps of the apomictic pathway. Here, we used Roche/454 sequencing technologies to generate reads from inflorescences of E. curvula apomictic and sexual genotypes that were de novo assembled into a reference transcriptome. Nearly 90% of the 49568 assembled isotigs showed sequence similarity to sequences deposited in the public databases. A gene ontology analysis categorized 27448 isotigs into at least one of the three main GO categories. We identified 11475 SSRs, and several of them were assayed in E. curvula germplasm using SSR-based primers, providing a valuable set of molecular markers that could allow direct allele selection. The differential contribution to each library of the spliced forms of several transcripts revealed the existence of several isotigs produced via alternative splicing of single genes. The reference transcriptome presented and validated in this work will be useful for the identification of a wide range of gene(s) related to agronomic traits of E. curvula, including those controlling key steps of the apomictic pathway in this species, allowing the extrapolation of the findings to other plant species.

  18. BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes.

    Science.gov (United States)

    Staňková, Helena; Hastie, Alex R; Chan, Saki; Vrána, Jan; Tulpová, Zuzana; Kubaláková, Marie; Visendi, Paul; Hayashi, Satomi; Luo, Mingcheng; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana

    2016-07-01

    The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC-by-BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high-resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high-resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome-scale analysis of repetitive sequences and revealed a ~800-kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone-by-clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC-contig physical map and validate sequence assembly on a chromosome-arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome-by-chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
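
    Several records in this listing report an N50 value for contigs or scaffolds. As a reminder of what that statistic means, here is a minimal sketch using the common definition: the length L such that sequences of length L or longer together cover at least half of the total assembly length. The function name `n50` and the example lengths are hypothetical and are not data from any record above.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths in kb, for illustration only.
    example = [2, 3, 4, 5, 6]
    print(n50(example))  # 6 + 5 = 11 >= 10, so this prints 5
```

    Because N50 depends only on the length distribution, it says nothing about correctness; a misassembled scaffold can raise N50, which is why the records above pair it with independent validation such as genome maps or read mapping rates.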

  19. Distilled single-cell genome sequencing and de novo assembly for sparse microbial communities.

    Science.gov (United States)

    Taghavi, Zeinab; Movahedi, Narjes S; Draghici, Sorin; Chitsaz, Hamidreza

    2013-10-01

    Identification of every single genome present in a microbial sample is an important and challenging task with crucial applications. It is challenging because there are typically millions of cells in a microbial sample, the vast majority of which elude cultivation. The most accurate method to date is exhaustive single-cell sequencing using multiple displacement amplification, which is simply intractable for a large number of cells. However, there is hope for breaking this barrier, as the number of different cell types with distinct genome sequences is usually much smaller than the number of cells. Here, we present a novel divide and conquer method to sequence and de novo assemble all distinct genomes present in a microbial sample with a sequencing cost and computational complexity proportional to the number of genome types, rather than the number of cells. The method is implemented in a tool called Squeezambler. We evaluated Squeezambler on simulated data. The proposed divide and conquer method successfully reduces the cost of sequencing in comparison with the naïve exhaustive approach. Squeezambler and datasets are available at http://compbio.cs.wayne.edu/software/squeezambler/.

  20. De novo assembly and characterization of the spleen transcriptome of common carp (Cyprinus carpio) using Illumina paired-end sequencing.

    Science.gov (United States)

    Li, Guoxi; Zhao, Yinli; Liu, Zhonghu; Gao, Chunsheng; Yan, Fengbin; Liu, Bianzhi; Feng, Jianxin

    2015-06-01

    Common carp (Cyprinus carpio) is one of the most important aquacultured species of the family Cyprinidae, and breeding this species for disease resistance is becoming more and more important. However, at the genome or transcriptome levels, study of the immunogenetics of disease resistance in the common carp is lacking. In this study, 60,316,906 and 75,200,328 paired-end clean reads were obtained from two cDNA libraries of the common carp spleen by Illumina paired-end sequencing technology. In total, 130,293 unique transcript fragments (unigenes) were assembled, with an average length of 1400.57 bp. Approximately 105,612 (81.06%) unigenes could be annotated according to their homology with matches in the Nr, Nt, Swiss-Prot, COG, GO, or KEGG databases, and they were found to represent 46,747 non-redundant genes. Comparative analysis showed that 59.82% of the unigenes have significant similarity to zebrafish Refseq proteins. Gene expression comparison revealed that 10,432 and 6889 annotated unigenes were, respectively, up- and down-regulated with at least twofold changes between two developmental stages of the common carp spleen. Gene ontology and KEGG analysis were performed to classify all unigenes into functional categories for understanding gene functions and regulation pathways. In addition, 46,847 simple sequence repeats (SSRs) were detected from 35,618 unigenes, and a large number of single nucleotide polymorphism (SNP) and insertion/deletion (INDEL) sites were identified in the spleen transcriptome of common carp. This study has characterized the spleen transcriptome of the common carp for the first time, providing a valuable resource for a better understanding of the common carp immune system and defense mechanisms. This knowledge will also facilitate future functional studies on common carp immunogenetics that may eventually be applied in breeding programs. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Construction and sequencing analysis of scFv antibody fragment derived from monoclonal antibody against norfloxacin (Nor155)

    Directory of Open Access Journals (Sweden)

    J. Mala

    2017-06-01

    Full Text Available Norfloxacin belongs to the group of fluoroquinolone antibiotics which has been approved for treatment in animals. However, its residues in animal products can pose adverse side effects to consumers. Therefore, detection of the residue in different food matrices is a concern. In this study, a single chain variable fragment (scFv) that recognizes norfloxacin antibiotic was constructed. The cDNA was synthesized from total RNA of hybridoma cells against norfloxacin. Genes encoding VH and VL regions of monoclonal antibody against norfloxacin (Nor155) were amplified and the sizes of the VH and VL fragments were 402 bp and 363 bp, respectively. The scFv of Nor155 was constructed by an addition of (Gly4Ser)3 as a linker between VH and VL regions and subcloned into pPICZαA, an expression vector of Pichia pastoris. The sequence of scFv Nor155 (GenBank No. AJG06891.1) was confirmed by sequencing analysis. The complementarity determining regions (CDR) I, II, and III of VH and VL were specified by the Kabat method. The obtained recombinant plasmid will be useful for production of scFv antibody against norfloxacin in P. pastoris and for further engineering of scFv antibodies against fluoroquinolone antibiotics.

  2. Sequencing and de novo assembly of the transcriptome of the glassy-winged sharpshooter (Homalodisca vitripennis).

    Directory of Open Access Journals (Sweden)

    Raja Sekhar Nandety

    Full Text Available BACKGROUND: The glassy-winged sharpshooter Homalodisca vitripennis (Hemiptera: Cicadellidae) is a xylem-feeding leafhopper and important vector of the bacterium Xylella fastidiosa, the causal agent of Pierce's disease of grapevines. The functional complexity of the transcriptome of H. vitripennis has not been elucidated thus far. It is a necessary blueprint for an understanding of the development of H. vitripennis and for designing efficient biorational control strategies including those based on RNA interference. RESULTS: Here we elucidate and explore the transcriptome of adult H. vitripennis using high-throughput paired end deep sequencing and de novo assembly. A total of 32,803,656 paired-end reads were obtained with an average transcript length of 624 nucleotides. We assembled 32.9 Mb of the transcriptome of H. vitripennis that spanned across 47,265 loci and 52,708 transcripts. Comparison of our non-redundant database showed that 45% of the deduced proteins of H. vitripennis exhibit identity (e-value ≤ 1e-5) with known proteins. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript isoform. In order to gain insight into the molecular basis of key regulatory genes of H. vitripennis, we characterized predicted proteins involved in the metabolism of juvenile hormone, and biogenesis of small RNAs (Dicer and Piwi sequences) from the transcriptomic sequences. Analysis of transposable element sequences of H. vitripennis indicated that the genome is less expanded in comparison to many other insects with approximately 1% of the transcriptome carrying transposable elements. CONCLUSIONS: Our data significantly enhance the molecular resources available for future study and control of this economically important hemipteran. This transcriptional information not only provides a more nuanced understanding of the underlying biological and physiological mechanisms that

  3. Primary structure of human alpha 2-macroglobulin. I. Isolation of the 26 CNBr fragments, amino acid sequence of 13 small CNBr fragments, amino acid sequence of methionine-containing peptides, and alignment of all CNBr fragments

    DEFF Research Database (Denmark)

    Sottrup-Jensen, Lars; Stepanik, T M; Jones, C M

    1984-01-01

    -775). These fragments account for 603 of the 1451 residues of the subunits of alpha 2-macroglobulin. CB2 contains two glucosamine-based carbohydrate groups attached to Asn-23 and Asn-38, and one internal disulfide bridge connecting Cys-16 with Cys-54. CB6 contains one glucosamine-based carbohydrate group attached...... to Asn-1 and two internal disulfide bridges (Cys-5 bound to Cys-53 and Cys-23 bound to Cys-41, respectively); Cys-32 is bound to Cys-16 in CB8. CB7 contains two glucosamine-based carbohydrate groups attached to Asn-78 and Asn-92, CB8 contains 1 Cys residue (Cys-16), bridged to Cys-32 of CB6. CB11...

  4. De novo assembly and characterization of the garlic (Allium sativum) bud transcriptome by Illumina sequencing.

    Science.gov (United States)

    Sun, Xiudong; Zhou, Shumei; Meng, Fanlu; Liu, Shiqi

    2012-10-01

    Garlic is widely used as a spice throughout the world for the culinary value of its flavor and aroma, which are created by the chemical transformation of a series of organic sulfur compounds. To analyze the transcriptome of Allium sativum and discover the genes involved in sulfur metabolism, cDNAs derived from the total RNA of Allium sativum buds were analyzed by Illumina sequencing. Approximately 26.67 million 90 bp paired-end clean reads were achieved in two libraries. A total of 127,933 unigenes were generated by de novo assembly and were compared with the sequences in public databases. Of these, 45,286 unigenes had significant hits to the sequences in the Nr database, 29,514 showed significant similarity to known proteins in the Swiss-Prot database, and 20,706 and 21,952 unigenes had significant similarity to existing sequences in the KEGG and COG databases, respectively. Moreover, genes involved in organic sulfur biosynthesis were identified. These unigene data will provide the foundation for research on gene expression, genomics and functional genomics in Allium sativum. Key message: The obtained unigenes will provide the foundation for research on functional genomics in Allium sativum and its closely related species, and fill the gap of the existing plant EST database.

  5. Draft Sequencing of the Heterozygous Diploid Genome of Satsuma (Citrus unshiu Marc.) Using a Hybrid Assembly Approach

    Directory of Open Access Journals (Sweden)

    Tokurou Shimizu

    2017-12-01

    Full Text Available Satsuma (Citrus unshiu Marc.) is one of the most abundantly produced mandarin varieties of citrus, known for its seedless fruit production and as a breeding parent of citrus. De novo assembly of the heterozygous diploid genome of Satsuma (“Miyagawa Wase”) was conducted by a hybrid assembly approach using short-read sequences, three mate-pair libraries, and a long-read sequence of PacBio by the PLATANUS assembler. The assembled sequence, with a total size of 359.7 Mb at the N50 length of 386,404 bp, consisted of 20,876 scaffolds. Pseudomolecules of Satsuma constructed by aligning the scaffolds to three genetic maps showed genome-wide synteny to the genomes of Clementine, pummelo, and sweet orange. Gene prediction by modeling with MAKER-P proposed 29,024 genes and 37,970 mRNA; additionally, gene prediction analysis found candidates for novel genes in several biosynthesis pathways for gibberellin and violaxanthin catabolism. BUSCO scores for the assembled scaffold and predicted transcripts, and another analysis by BAC end sequence mapping indicated the assembled genome consistency was close to those of the haploid Clementine, pummelo, and sweet orange genomes. The number of repeat elements and long terminal repeat retrotransposon were comparable to those of the seven citrus genomes; this suggested no significant failure in the assembly at the repeat region. A resequencing application using the assembled sequence confirmed that both kunenbo-A and Satsuma are offspring of Kishu, and Satsuma is a back-crossed offspring of Kishu. These results illustrated the performance of the hybrid assembly approach and its ability to construct an accurate heterozygous diploid genome.

  6. Sequencing, de novo assembling, and annotating the genome of the endangered Chinese crocodile lizard Shinisaurus crocodilurus.

    Science.gov (United States)

    Gao, Jian; Li, Qiye; Wang, Zongji; Zhou, Yang; Martelli, Paolo; Li, Fang; Xiong, Zijun; Wang, Jian; Yang, Huanming; Zhang, Guojie

    2017-07-01

    The Chinese crocodile lizard, Shinisaurus crocodilurus, is the only living representative of the monotypic family Shinisauridae under the order Squamata. It is an obligate semi-aquatic, viviparous, diurnal species restricted to specific portions of mountainous locations in southwestern China and northeastern Vietnam. However, in the past several decades, this species has undergone a rapid decrease in population size due to illegal poaching and habitat disruption, making this unique reptile species endangered and listed in the Convention on International Trade in Endangered Species of Wild Fauna and Flora Appendix II since 1990. A proposal to uplist it to Appendix I was passed at the Convention on International Trade in Endangered Species of Wild Fauna and Flora Seventeenth meeting of the Conference of the Parties in 2016. To promote the conservation of this species, we sequenced the genome of a male Chinese crocodile lizard using a whole-genome shotgun strategy on the Illumina HiSeq 2000 platform. In total, we generated ∼291 Gb of raw sequencing data (×149 depth) from 13 libraries with insert sizes ranging from 250 bp to 40 kb. After filtering for polymerase chain reaction-duplicated and low-quality reads, ∼137 Gb of clean data (×70 depth) were obtained for genome assembly. We yielded a draft genome assembly with a total length of 2.24 Gb and an N50 scaffold size of 1.47 Mb. The assembled genome was predicted to contain 20 150 protein-coding genes and up to 1114 Mb (49.6%) of repetitive elements. The genomic resource of the Chinese crocodile lizard will contribute to deciphering the biology of this organism and provides an essential tool for conservation efforts. It also provides a valuable resource for future study of squamate evolution. © The Authors 2017. Published by Oxford University Press.
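
    The depth figures quoted in the record above follow the usual relation depth = total sequenced bases / genome size. The sketch below simply inverts that relation to show which genome size the quoted numbers imply. The abstract does not state which genome size estimate the authors used, so the function name `implied_genome_size_gb` and the back-calculation itself are illustrative assumptions.

```python
def implied_genome_size_gb(total_bases_gb, depth):
    """Genome size (Gb) implied by a data volume (Gb) and a stated depth (x)."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return total_bases_gb / depth


if __name__ == "__main__":
    # Figures quoted in the abstract: ~291 Gb raw at x149, ~137 Gb clean at x70.
    raw = implied_genome_size_gb(291, 149)
    clean = implied_genome_size_gb(137, 70)
    print(f"{raw:.2f} Gb, {clean:.2f} Gb")  # 1.95 Gb, 1.96 Gb
```

    Both figures point to a genome size estimate of roughly 2 Gb being used for the depth calculation, which is of the same order as the reported 2.24 Gb assembly length.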

  7. Genome-wide SNP identification by high-throughput sequencing and selective mapping allows sequence assembly positioning using a framework genetic linkage map

    Directory of Open Access Journals (Sweden)

    Xu Xiangming

    2010-12-01

    Full Text Available Background: Determining the position and order of contigs and scaffolds from a genome assembly within an organism's genome remains a technical challenge in a majority of sequencing projects. In order to exploit contemporary technologies for DNA sequencing, we developed a strategy for whole genome single nucleotide polymorphism sequencing allowing the positioning of sequence contigs onto a linkage map using the bin mapping method. Results: The strategy was tested on a draft genome of the fungal pathogen Venturia inaequalis, the causal agent of apple scab, and further validated using sequence contigs derived from the diploid plant genome Fragaria vesca. Using our novel method we were able to anchor 70% and 92% of sequence assemblies for V. inaequalis and F. vesca, respectively, to genetic linkage maps. Conclusions: We demonstrated the utility of this approach by accurately determining the bin map positions of the majority of the large sequence contigs from each genome sequence and validated our method by mapping single sequence repeat markers derived from sequence contigs on a full mapping population.

  8. The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads

    DEFF Research Database (Denmark)

    Wang, Zhiwen; Hobson, Neil; Galindo, Leonardo

    2012-01-01

    Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole-genome shotgun sequencing of the nuclear genome of flax. Seven paired-end libraries ranging in size from 300 bp...... these results show that de novo assembly, based solely on whole-genome shotgun short-sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species....

  9. Effect of DNA sequence of Fab fragment on yield characteristics and cell growth of E. coli.

    Science.gov (United States)

    Kulmala, Antti; Huovinen, Tuomas; Lamminmäki, Urpo

    2017-06-19

    Codon usage is one of the factors influencing recombinant protein expression. We were interested in the codon usage of an antibody Fab fragment gene exhibiting extreme toxicity in the E. coli host. The toxic synthetic human Fab gene contained domains optimized by the "one amino acid-one codon" method. We redesigned five segments of the Fab gene with a "codon harmonization" method described by Angov et al. and studied the effects of these changes on cell viability, Fab yield and display on filamentous phage using different vectors and bacterial strains. The harmonization considerably reduced toxicity, increased Fab expression from negligible levels to 10 mg/l, and restored the display on phage. Testing the impact of the individual redesigned segments revealed that the most significant effects were conferred by changes in the constant domain of the light chain. For some of the Fab gene variants, we also observed striking differences in protein yields when cloned from a chloramphenicol resistant vector into an identical vector, except with ampicillin resistance. In conclusion, our results show that the expression of a heterodimeric secretory protein can be improved by harmonizing selected DNA segments by synonymous codons and reveal additional complexity involved in heterologous protein expression.

  10. In vivo tumor targeting and imaging with engineered trivalent antibody fragments containing collagen-derived sequences.

    Directory of Open Access Journals (Sweden)

    Angel M Cuesta

    Full Text Available There is an urgent need to develop new and effective agents for cancer targeting. In this work, a multivalent antibody is characterized in vivo in living animals. The antibody, termed "trimerbody", comprises a single-chain antibody (scFv) fragment connected to the N-terminal trimerization subdomain of collagen XVIII NC1 by a flexible linker. As indicated by computer graphic modeling, the trimerbody has a tripod-shaped structure with three highly flexible scFv heads radially outward oriented. Trimerbodies are trimeric in solution and exhibited multivalent binding, which provides them with at least a 100-fold increase in functional affinity compared with the monovalent scFv. Our results also demonstrate the feasibility of producing functional bispecific trimerbodies, which concurrently bind two different ligands. A trimerbody specific for the carcinoembryonic antigen (CEA), a classic tumor-associated antigen, showed efficient tumor targeting after systemic administration in mice bearing CEA-positive tumors. Importantly, a trimerbody that recognizes an angiogenesis-associated laminin epitope showed excellent tumor localization in several cancer types, including fibrosarcomas and carcinomas. These results illustrate the potential of this new antibody format for imaging and therapeutic applications, and suggest that some laminin epitopes might be universal targets for cancer targeting.

  11. High-coverage sequencing and annotated assembly of the genome of the Australian dragon lizard Pogona vitticeps.

    Science.gov (United States)

    Georges, Arthur; Li, Qiye; Lian, Jinmin; O'Meally, Denis; Deakin, Janine; Wang, Zongji; Zhang, Pei; Fujita, Matthew; Patel, Hardip R; Holleley, Clare E; Zhou, Yang; Zhang, Xiuwen; Matsubara, Kazumi; Waters, Paul; Graves, Jennifer A Marshall; Sarre, Stephen D; Zhang, Guojie

    2015-01-01

    The lizards of the family Agamidae are one of the most prominent elements of the Australian reptile fauna. Here, we present a genomic resource built on the basis of a wild-caught male ZZ central bearded dragon Pogona vitticeps. The genomic sequence for P. vitticeps, generated on the Illumina HiSeq 2000 platform, comprised 317 Gbp (179X raw read depth) from 13 insert libraries ranging from 250 bp to 40 kbp. After filtering for low-quality and duplicated reads, 146 Gbp of data (83X) was available for assembly. Exceptionally high levels of heterozygosity (0.85 % of single nucleotide polymorphisms plus sequence insertions or deletions) complicated assembly; nevertheless, 96.4 % of reads mapped back to the assembled scaffolds, indicating that the assembly included most of the sequenced genome. Length of the assembly was 1.8 Gbp in 545,310 scaffolds (69,852 longer than 300 bp), the longest being 14.68 Mbp. N50 was 2.29 Mbp. Genes were annotated on the basis of de novo prediction, similarity to the green anole Anolis carolinensis, Gallus gallus and Homo sapiens proteins, and P. vitticeps transcriptome sequence assemblies, to yield 19,406 protein-coding genes in the assembly, 63 % of which had intact open reading frames. Our assembly captured 99 % (246 of 248) of core CEGMA genes, with 93 % (231) being complete. The quality of the P. vitticeps assembly is comparable or superior to that of other published squamate genomes, and the annotated P. vitticeps genome can be accessed through a genome browser available at https://genomics.canberra.edu.au.

  12. Choice of reference sequence and assembler for alignment of Listeria monocytogenes short-read sequence data greatly influences rates of error in SNP analyses.

    Directory of Open Access Journals (Sweden)

    Arthur W Pightling

    Full Text Available The wide availability of whole-genome sequencing (WGS) and an abundance of open-source software have made detection of single-nucleotide polymorphisms (SNPs) in bacterial genomes an increasingly accessible and effective tool for comparative analyses. Thus, ensuring that real nucleotide differences between genomes (i.e., true SNPs) are detected at high rates and that the influences of errors (such as false positive SNPs, ambiguously called sites, and gaps) are mitigated is of utmost importance. The choices researchers make regarding the generation and analysis of WGS data can greatly influence the accuracy of short-read sequence alignments and, therefore, the efficacy of such experiments. We studied the effects of some of these choices, including: (i) depth of sequencing coverage, (ii) choice of reference-guided short-read sequence assembler, (iii) choice of reference genome, and (iv) whether to perform read-quality filtering and trimming, on our ability to detect true SNPs and on the frequencies of errors. We performed benchmarking experiments, during which we assembled simulated and real Listeria monocytogenes strain 08-5578 short-read sequence datasets of varying quality with four commonly used assemblers (BWA, MOSAIK, Novoalign, and SMALT), using reference genomes of varying genetic distances, and with or without read pre-processing (i.e., quality filtering and trimming). We found that assemblies of at least 50-fold coverage provided the most accurate results. In addition, MOSAIK yielded the fewest errors when reads were aligned to a nearly identical reference genome, while using SMALT to align reads against a reference sequence that is ∼0.82% distant from 08-5578 at the nucleotide level resulted in the detection of the greatest numbers of true SNPs and the fewest errors. Finally, we show that whether read pre-processing improves SNP detection depends upon the choice of reference sequence and assembler. In total, this study demonstrates that researchers

  13. Elevation or Suppression? The Resolved Star Formation Main Sequence of Galaxies with Two Different Assembly Modes

    Science.gov (United States)

    Liu, Qing; Wang, Enci; Lin, Zesen; Gao, Yulong; Liu, Haiyang; Berhane Teklu, Berzaf; Kong, Xu

    2018-04-01

    We investigate the spatially resolved star formation main sequence in star-forming galaxies using Integral Field Spectroscopic observations from the Mapping Nearby Galaxies at the Apache Point Observatory survey. We demonstrate that the correlation between the stellar mass surface density (Σ*) and star formation rate surface density (ΣSFR) holds down to the sub-galactic scale, leading to the sub-galactic main sequence (SGMS). By dividing galaxies into two populations based on their recent mass assembly modes, we find the resolved main sequence in galaxies with the “outside-in” mode is steeper than that in galaxies with the “inside-out” mode. This is also confirmed on a galaxy-by-galaxy level, where we find the distributions of SGMS slopes for individual galaxies are clearly separated for the two populations. When normalizing and stacking the SGMS of individual galaxies on one panel for the two populations, we find that the inner regions of galaxies with the “inside-out” mode statistically exhibit a suppression in star formation, with a less significant trend in the outer regions of galaxies with the “outside-in” mode. In contrast, the inner regions of galaxies with “outside-in” mode and the outer regions of galaxies with “inside-out” mode follow a slightly sublinear scaling relation with a slope ∼0.9, which is in good agreement with previous findings, suggesting that they are experiencing a universal regulation without influences of additional physical processes.

  14. Computational complexity of algorithms for sequence comparison, short-read assembly and genome alignment.

    Science.gov (United States)

    Baichoo, Shakuntala; Ouzounis, Christos A

    A multitude of algorithms for sequence comparison, short-read assembly and whole-genome alignment have been developed in the general context of molecular biology, to support technology development for high-throughput sequencing, numerous applications in genome biology and fundamental research on comparative genomics. The computational complexity of these algorithms has been previously reported in original research papers, yet this often neglected property has not been reviewed previously in a systematic manner and for a wider audience. We provide a review of space and time complexity of key sequence analysis algorithms and highlight their properties in a comprehensive manner, in order to identify potential opportunities for further research in algorithm or data structure optimization. The complexity aspect is poised to become pivotal as we will be facing challenges related to the continuous increase of genomic data on unprecedented scales and complexity in the foreseeable future, when robust biological simulation at the cell level and above becomes a reality. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Sequencing and De Novo Assembly of the Toxicodendron radicans (Poison Ivy) Transcriptome.

    Science.gov (United States)

    Weisberg, Alexandra J; Kim, Gunjune; Westwood, James H; Jelesko, John G

    2017-11-10

    Contact with poison ivy plants is widely dreaded because they produce a natural product called urushiol that is responsible for allergenic contact delayed-dermatitis symptoms lasting for weeks. For this reason, the catchphrase most associated with poison ivy is "leaves of three, let it be", which serves the purpose of both identification and an appeal for avoidance. Ironically, despite this notoriety, there is a dearth of specific knowledge about nearly all other aspects of poison ivy physiology and ecology. As a means of gaining a more molecular-oriented understanding of poison ivy physiology and ecology, Next Generation DNA sequencing technology was used to develop poison ivy root and leaf RNA-seq transcriptome resources. De novo assembled transcriptomes were analyzed to generate a core set of high quality expressed transcripts present in poison ivy tissue. The predicted protein sequences were evaluated for similarity to SwissProt homologs and InterProScan domains, and were assigned both GO terms and KEGG annotations. Over 23,000 simple sequence repeats were identified in the transcriptome, and corresponding oligonucleotide primer pairs were designed. A pan-transcriptome analysis of existing Anacardiaceae transcriptomes revealed conserved and unique transcripts among these species.

  16. Codon-Precise, Synthetic, Antibody Fragment Libraries Built Using Automated Hexamer Codon Additions and Validated through Next Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Laura Frigotto

    2015-05-01

    We have previously described ProxiMAX, a technology that enables the fabrication of precise, combinatorial gene libraries via codon-by-codon saturation mutagenesis. ProxiMAX was originally performed using manual, enzymatic transfer of codons via blunt-end ligation. Here we present Colibra™: an automated, proprietary version of ProxiMAX used specifically for antibody library generation, in which double-codon hexamers are transferred during the saturation cycling process. The reduction in process complexity, resulting library quality and an unprecedented saturation of up to 24 contiguous codons are described. Utility of the method is demonstrated via fabrication of complementarity determining regions (CDR) in antibody fragment libraries and next generation sequencing (NGS) analysis of their quality and diversity.

  17. Characterization of European Yersinia enterocolitica 1A strains using restriction fragment length polymorphism and multilocus sequence analysis.

    Science.gov (United States)

    Murros, A; Säde, E; Johansson, P; Korkeala, H; Fredriksson-Ahomaa, M; Björkroth, J

    2016-10-01

    Yersinia enterocolitica is currently divided into two subspecies: subsp. enterocolitica including highly pathogenic strains of biotype 1B and subsp. palearctica including nonpathogenic strains of biotype 1A and moderately pathogenic strains of biotypes 2-5. In this work, we characterized 162 Y. enterocolitica strains of biotype 1A and 50 strains of biotypes 2-4 isolated from human, animal and food samples by restriction fragment length polymorphism using the HindIII restriction enzyme. Phylogenetic relatedness of 20 representative Y. enterocolitica strains including 15 biotype 1A strains was further studied by the multilocus sequence analysis of four housekeeping genes (glnA, gyrB, recA and HSP60). In all the analyses, biotype 1A strains formed a separate genomic group, which differed from Y. enterocolitica subsp. enterocolitica and from the strains of biotypes 2-4 of Y. enterocolitica subsp. palearctica. Based on these results, biotype 1A strains considered nonpathogenic should not be included in subspecies palearctica containing pathogenic strains of biotypes 2-5. Yersinia enterocolitica strains are currently divided into six biotypes and two subspecies. Strains of biotype 1A, which are phenotypically and genotypically very heterogeneous, are classified as subspecies palearctica. In this study, European Y. enterocolitica 1A strains isolated from both human and nonhuman sources were characterized using restriction fragment length polymorphism and multilocus sequence analysis. The European biotype 1A strains formed a separate group, which differed from strains belonging to subspecies enterocolitica and palearctica. This may indicate that the current division between the two subspecies is not sufficient considering the strain diversity within Y. enterocolitica. © 2016 The Society for Applied Microbiology.

  18. Growth of rat dorsal root ganglion neurons on a novel self-assembling scaffold containing IKVAV sequence

    Energy Technology Data Exchange (ETDEWEB)

    Zou Zhenwei; Zheng Qixin [Department of Orthopaedics, Union Hospital, Tongji Medical college of Huazhong University of science and technology, Wuhan, 430022 (China); Wu Yongchao, E-mail: wuyongchao@hotmail.com [Department of Orthopaedics, Union Hospital, Tongji Medical college of Huazhong University of science and technology, Wuhan, 430022 (China); Song Yulin; Wu Bin [Department of Orthopaedics, Union Hospital, Tongji Medical college of Huazhong University of science and technology, Wuhan, 430022 (China)

    2009-08-31

    The potential benefits of self-assembly in synthesizing materials for the treatment of both peripheral and central nervous system disorders are tremendous. In this study, we synthesized peptide-amphiphile (PA) molecules containing the IKVAV sequence and induced self-assembly of the PA solutions in vitro to form nanofiber gels. We then characterized the gels by transmission electron microscopy and demonstrated the biocompatibility of this gel towards rat dorsal root ganglion neurons. The nanofiber gel was formed by self-assembly of IKVAV PA molecules, which was triggered by metal ions. The fibers were 7-8 nm in diameter and hundreds of nanometers long. Gels were shown to be non-toxic to neurons and able to promote neuron adhesion and neurite sprouting. The results indicated that the self-assembling scaffold containing the IKVAV sequence had excellent biocompatibility with adult sensory neurons and could be useful in nerve tissue engineering.

  19. Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome

    Directory of Open Access Journals (Sweden)

    Ritland Carol

    2009-08-01

    Background: Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. Results: We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. Conclusion: We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene

  20. Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome.

    Science.gov (United States)

    Hamberger, Björn; Hall, Dawn; Yuen, Mack; Oddy, Claire; Hamberger, Britta; Keeling, Christopher I; Ritland, Carol; Ritland, Kermit; Bohlmann, Jörg

    2009-08-06

    Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. 

  1. Spike protein assembly into the coronavirion: exploring the limits of its sequence requirements

    International Nuclear Information System (INIS)

    Bosch, Berend Jan; Haan, Cornelis A.M. de; Smits, Saskia L.; Rottier, Peter J.M.

    2005-01-01

    The coronavirus spike (S) protein, required for receptor binding and membrane fusion, is incorporated into the assembling virion by interactions with the viral membrane (M) protein. Earlier we showed that the ectodomain of the S protein is not involved in this process. Here we further defined the requirements of the S protein for virion incorporation. We show that the cytoplasmic domain, not the transmembrane domain, determines the association with the M protein and suffices to effect the incorporation into viral particles of chimeric spikes as well as of foreign viral glycoproteins. The essential sequence was mapped to the membrane-proximal region of the cytoplasmic domain, which is also known to be of critical importance for the fusion function of the S protein. Consistently, only short C-terminal truncations of the S protein were tolerated when introduced into the virus by targeted recombination. The important role of the cytoplasmic domain of about 38 residues in the assembly of, and membrane fusion by, this protein of approximately 1300 amino acids is discussed.

  2. De Novo Sequencing and Assembly Analysis of Transcriptome in Pinus bungeana Zucc. ex Endl.

    Directory of Open Access Journals (Sweden)

    Qifei Cai

    2018-03-01

    To enrich the molecular data of Pinus bungeana Zucc. ex Endl. and to study the factors regulating the morphological differences controlled by apical dominance, de novo transcriptome assembly and annotation were performed for two varieties of Pinus bungeana Zucc. ex Endl. that differ markedly in morphology. More than 147 million reads were produced, which were assembled into 88,092 unigenes. Based on a similarity search, 11,692 unigenes showed significant similarity to proteins from Picea sitchensis (Bong.) Carr. From this collection of unigenes, a large number of molecular markers were identified, including 2829 simple sequence repeats (SSRs). A total of 158 unigenes were differentially expressed between the two varieties, including 98 up-regulated and 60 down-regulated unigenes. Furthermore, among the differentially expressed genes (DEGs), five genes that may affect plant morphology were further validated by reverse transcription quantitative polymerase chain reaction (RT-qPCR). The five genes, related to cytokinin oxidase/dehydrogenase (CKX), the two-component response regulator ARR-A family (ARR-A), plant hormone signal transduction (AHP), and MADS-box transcription factors, have a close relationship with apical dominance. This new dataset will be a useful resource for future genetic and genomic studies in Pinus bungeana Zucc. ex Endl.

  3. Knowledge-based decision support for Space Station assembly sequence planning

    Science.gov (United States)

    1991-04-01

    A complete Personal Analysis Assistant (PAA) for Space Station Freedom (SSF) assembly sequence planning consists of three software components: the system infrastructure, intra-flight value added, and inter-flight value added. The system infrastructure is the substrate on which software elements providing inter-flight and intra-flight value-added functionality are built. It provides the capability for building representations of assembly sequence plans and specification of constraints and analysis options. Intra-flight value-added provides functionality that will, given the manifest for each flight, define cargo elements, place them in the National Space Transportation System (NSTS) cargo bay, compute performance measure values, and identify violated constraints. Inter-flight value-added provides functionality that will, given major milestone dates and capability requirements, determine the number and dates of required flights and develop a manifest for each flight. The current project is Phase 1 of a projected two phase program and delivers the system infrastructure. Intra- and inter-flight value-added were to be developed in Phase 2, which has not been funded. Based on experience derived from hundreds of projects conducted over the past seven years, ISX developed an Intelligent Systems Engineering (ISE) methodology that combines the methods of systems engineering and knowledge engineering to meet the special systems development requirements posed by intelligent systems, systems that blend artificial intelligence and other advanced technologies with more conventional computing technologies. The ISE methodology defines a phased program process that begins with an application assessment designed to provide a preliminary determination of the relative technical risks and payoffs associated with a potential application, and then moves through requirements analysis, system design, and development.

  4. Exact algorithms for haplotype assembly from whole-genome sequence data.

    Science.gov (United States)

    Chen, Zhi-Zhong; Deng, Fei; Wang, Lusheng

    2013-08-15

    Haplotypes play a crucial role in genetic analysis and have many applications such as gene disease diagnoses, association studies, ancestry inference and so forth. The development of DNA sequencing technologies makes it possible to obtain haplotypes from a set of aligned reads originating from both copies of a chromosome of a single individual. This approach is often known as haplotype assembly. Exact algorithms that can give optimal solutions to the haplotype assembly problem are in high demand. Unfortunately, previous algorithms for this problem either fail to output optimal solutions or take too long even when executed on a PC cluster. We develop an approach to finding optimal solutions for the haplotype assembly problem under the minimum-error-correction (MEC) model. Most of the previous approaches assume that the columns in the input matrix correspond to (putative) heterozygous sites. This all-heterozygous assumption is correct for most columns, but it may be incorrect for a small number of columns. In this article, we consider the MEC model with or without the all-heterozygous assumption. In our approach, we first use new methods to decompose the input read matrix into small independent blocks and then model the problem for each block as an integer linear programming problem, which is then solved by an integer linear programming solver. We have tested our program on a single PC [a Linux (x64) desktop PC with i7-3960X CPU], using the filtered HuRef and the NA 12878 datasets (after applying some variant calling methods). With the all-heterozygous assumption, our approach can optimally solve the whole HuRef data set within a total time of 31 h (26 h for the most difficult block of the 15th chromosome and only 5 h for the other blocks). To our knowledge, this is the first time that MEC optimal solutions are completely obtained for the filtered HuRef dataset. Moreover, in the general case (without the all-heterozygous assumption), for the HuRef dataset our
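
    The MEC objective named in this record can be illustrated on a toy block. The sketch below is not the authors' integer linear programming method: it is a brute-force enumeration under the all-heterozygous assumption, usable only for very small blocks. The read encoding ('0'/'1' alleles, '-' for uncovered sites) and all function names are assumptions made for illustration.

```python
from itertools import product


def mismatches(read: str, hap: str) -> int:
    """Count covered sites where the read disagrees with the haplotype."""
    return sum(1 for r, h in zip(read, hap) if r != "-" and r != h)


def complement(hap: str) -> str:
    """Second haplotype under the all-heterozygous assumption."""
    return "".join("1" if a == "0" else "0" for a in hap)


def mec_score(reads: list[str], hap: str) -> int:
    """Minimum number of corrections so every read fits hap or its complement."""
    comp = complement(hap)
    return sum(min(mismatches(r, hap), mismatches(r, comp)) for r in reads)


def best_haplotype(reads: list[str]) -> tuple[str, int]:
    """Exhaustively search all haplotypes of one block (exponential in sites)."""
    n_sites = len(reads[0])
    best = None
    # Fix the first allele to '0': a haplotype and its complement score the same.
    for tail in product("01", repeat=n_sites - 1):
        hap = "0" + "".join(tail)
        score = mec_score(reads, hap)
        if best is None or score < best[1]:
            best = (hap, score)
    return best


# Toy block: two clean reads from opposite copies and one read with one error.
toy_reads = ["0101", "1010", "0100"]
toy_result = best_haplotype(toy_reads)
```

    Enumeration grows as 2^(n-1) in the number of sites, which is why exact approaches such as the one described in the abstract first decompose the read matrix into small independent blocks.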

  5. Brain transcriptome sequencing and assembly of three songbird model systems for the study of social behavior

    Directory of Open Access Journals (Sweden)

    Christopher N. Balakrishnan

    2014-05-01

    Emberizid sparrows (Emberizidae) have played a prominent role in the study of avian vocal communication and social behavior. We present here brain transcriptomes for three emberizid model systems, song sparrow Melospiza melodia, white-throated sparrow Zonotrichia albicollis, and Gambel’s white-crowned sparrow Zonotrichia leucophrys gambelii. Each of the assemblies covered, fully or in part, over 89% of the previously annotated protein coding genes in the zebra finch Taeniopygia guttata, with 16,846, 15,805, and 16,646 unique BLAST hits in song, white-throated and white-crowned sparrows, respectively. As in previous studies, we find tissue of origin (auditory forebrain versus hypothalamus and whole brain) as an important determinant of overall expression profile. We also demonstrate the successful isolation of RNA and RNA-sequencing from post-mortem samples from building strikes and suggest that such an approach could be useful when traditional sampling opportunities are limited. These transcriptomes will be an important resource for the study of social behavior in birds and for data driven annotation of forthcoming whole genome sequences for these and other bird species.

  6. Population structure of pigs determined by single nucleotide polymorphisms observed in assembled expressed sequence tags.

    Science.gov (United States)

    Matsumoto, Toshimi; Okumura, Naohiko; Uenishi, Hirohide; Hayashi, Takeshi; Hamasima, Noriyuki; Awata, Takashi

    2012-01-01

    We have collected more than 190000 porcine expressed sequence tags (ESTs) from full-length complementary DNA (cDNA) libraries and identified more than 2800 single nucleotide polymorphisms (SNPs). In this study, we tentatively chose 222 SNPs observed in assembled ESTs to study pigs of different breeds; 104 were selected by comparing the cDNA sequences of a Meishan pig and samples of three-way cross pigs (Landrace, Large White, and Duroc: LWD), and 118 were selected from LWD samples. To evaluate the genetic variation between the chosen SNPs from pig breeds, we determined the genotypes for 192 pig samples (11 pig groups) from our DNA reference panel with matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Of the 222 reference SNPs, 186 were successfully genotyped. A neighbor-joining tree showed that the pig groups were classified into two large clusters, namely, Euro-American and East Asian pig populations. F-statistics and the analysis of molecular variance of Euro-American pig groups revealed that approximately 25% of the genetic variations occurred because of intergroup differences. As the F(IS) values were less than the F(ST) values, the clustering, based on the Bayesian inference, implied that there was strong genetic differentiation among pig groups and less divergence within the groups in our samples. © 2011 The Authors. Animal Science Journal © 2011 Japanese Society of Animal Science.

  7. Sequencing and de novo transcriptome assembly of the Chinese giant salamander (Andrias davidianus)

    Directory of Open Access Journals (Sweden)

    Yong Huang

    2017-06-01

    Next-generation technologies for determining genomic and transcriptomic composition have a wide range of applications. Andrias davidianus, a salamander endemic to China, has become an endangered amphibian species. However, molecular information on the species is lacking. In this study, we obtained RNA-Seq data from a pool of A. davidianus tissues including spleen, liver, muscle, kidney, skin, testis, gut and heart using the Illumina HiSeq 2500 platform. A total of 15,398,997,600 bp were obtained, corresponding to 102,659,984 raw reads. A total of 102,659,984 reads were filtered after removing low-quality reads and trimming the adapter sequences. The Trinity program was used to de novo assemble 132,912 unigenes with an average length of 690 bp and an N50 of 1263 bp. Unigenes were annotated against a number of databases. These transcriptomic data of A. davidianus should open the door to molecular evolution studies based on the entire transcriptome or on targeted genes of interest. The raw data from this study are available in the NCBI SRA database under accession number SRP099564.
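
    The N50 statistic quoted in this record is a length-weighted summary of an assembly. A minimal sketch follows, assuming the common definition (the largest length L such that sequences of length at least L together contain at least half of all assembled bases); the function name and the toy lengths are illustrative and are not taken from the study.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig or unigene lengths."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty positive lengths")


# Toy assembly: 32 bases in total, so the threshold is 16 bases.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

    Because the statistic weights by length, a few long sequences can push N50 well above the mean, as with the 690 bp mean and 1263 bp N50 reported in the abstract.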

  8. Assembly and comparative analysis of complete mitochondrial genome sequence of an economic plant Salix suchowensis

    Directory of Open Access Journals (Sweden)

    Ning Ye

    2017-03-01

    Willow is a widely used dioecious woody plant of the Salicaceae family in China. Due to their high biomass yields, willows are promising bioenergy crops. In this study, we assembled the complete mitochondrial (mt) genome sequence of S. suchowensis, with a length of 644,437 bp, using Roche-454 GS FLX Titanium sequencing technology. The base composition of the S. suchowensis mt genome is A (27.43%), T (27.59%), C (22.34%), and G (22.64%), giving a GC content consistent with that of other angiosperms. This long circular mt genome encodes 58 unique genes (32 protein-coding genes, 23 tRNA genes and 3 rRNA genes), and 9 of the 32 protein-coding genes contain 17 introns. Phylogenetic analysis of 35 species based on 23 protein-coding genes supports Salix as sister to Populus. With the detailed phylogenetic information and the identification of phylogenetic position, some ribosomal protein genes and succinate dehydrogenase genes were found to be frequently lost during evolution. Because S. suchowensis is a native shrub willow species, this analysis of its mt genome will provide useful information for better understanding genomic breeding and the missing pieces of sex determination evolution in the future.
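
    The base composition in this record determines the GC content directly. The sketch below adds the reported C and G percentages and also shows how the same quantity could be computed from a sequence string; the function name and the toy sequence are assumptions for illustration only.

```python
def gc_content(sequence: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("G") + seq.count("C")) / counted


# Percentages as reported in the abstract for the S. suchowensis mt genome.
reported = {"A": 27.43, "T": 27.59, "C": 22.34, "G": 22.64}
gc_percent = reported["G"] + reported["C"]  # about 44.98
total_percent = sum(reported.values())      # about 100.00

# Toy sequence: 4 of the 8 unambiguous bases are G or C; 'N' is ignored.
toy_gc = gc_content("ATGCGCNTA")
```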

  9. Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions.

    Science.gov (United States)

    Senol Cali, Damla; Kim, Jeremie S; Ghose, Saugata; Alkan, Can; Mutlu, Onur

    2018-04-02

    Nanopore sequencing technology has the potential to render other sequencing technologies obsolete with its ability to generate long reads and provide portability. However, high error rates of the technology pose a challenge while generating accurate genome assemblies. The tools used for nanopore sequence analysis are of critical importance, as they should overcome the high error rates of the technology. Our goal in this work is to comprehensively analyze current publicly available tools for nanopore sequence analysis to understand their advantages, disadvantages and performance bottlenecks. It is important to understand where the current tools do not perform well to develop better tools. To this end, we (1) analyze the multiple steps and the associated tools in the genome assembly pipeline using nanopore sequence data, and (2) provide guidelines for determining the appropriate tools for each step. Based on our analyses, we make four key observations: (1) the choice of the tool for basecalling plays a critical role in overcoming the high error rates of nanopore sequencing technology. (2) Read-to-read overlap finding tools, GraphMap and Minimap, perform similarly in terms of accuracy. However, Minimap has a lower memory usage, and it is faster than GraphMap. (3) There is a trade-off between accuracy and performance when deciding on the appropriate tool for the assembly step. The fast but less accurate assembler Miniasm can be used for quick initial assembly, and further polishing can be applied on top of it to increase the accuracy, which leads to faster overall assembly. (4) The state-of-the-art polishing tool, Racon, generates high-quality consensus sequences while providing a significant speedup over another polishing tool, Nanopolish. We analyze various combinations of different tools and expose the trade-offs between accuracy, performance, memory usage and scalability. 
We conclude that our observations can guide researchers and practitioners in making conscious

  10. Impact of a Central Scaffold on the Binding Affinity of Fragment Pairs Isolated from DNA-Encoded Self-Assembling Chemical Libraries.

    Science.gov (United States)

    Bigatti, Martina; Dal Corso, Alberto; Vanetti, Sara; Cazzamalli, Samuele; Rieder, Ulrike; Scheuermann, Jörg; Neri, Dario; Sladojevich, Filippo

    2017-11-08

    The screening of encoded self-assembling chemical libraries allows the identification of fragment pairs that bind to adjacent pockets on target proteins of interest. For practical applications, it is necessary to link these ligand pairs into discrete organic molecules, devoid of any nucleic acid component. Here we describe the discovery of a synergistic binding pair for acid alpha-1 glycoprotein and a chemical strategy for the identification of optimal linkers, connecting the two fragments. The procedure yielded a set of small organic ligands, the best of which exhibited a dissociation constant of 9.9 nM, as measured in solution by fluorescence polarization. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Measurement of the angular distribution of fission fragments using a PPAC assembly at CERN n_TOF

    Energy Technology Data Exchange (ETDEWEB)

    Tarrío, D., E-mail: dtarriov@gmail.com [Universidade de Santiago de Compostela (Spain); Leong, L.S.; Audouin, L. [Centre National de la Recherche Scientifique/IN2P3 -Université Paris-Sud - IPN, Orsay (France); Duran, I.; Paradela, C. [Universidade de Santiago de Compostela (Spain); Tassan-Got, L.; Le Naour, C.; Bacri, C.O.; Petitbon, V.; Mottier, J. [Centre National de la Recherche Scientifique/IN2P3 -Université Paris-Sud - IPN, Orsay (France); Caamaño, M. [Universidade de Santiago de Compostela (Spain); Altstadt, S. [Johann-Wolfgang-Goethe Universität, Frankfurt (Germany); Andrzejewski, J. [Uniwersytet Łódzki, Lodz (Poland); Barbagallo, M. [Istituto Nazionale di Fisica Nucleare, Bari (Italy); Bécares, V. [Centro de Investigaciones Energeticas Medioambientales y Tecnológicas (CIEMAT), Madrid (Spain); Bečvář, F. [Charles University, Prague (Czech Republic); Belloni, F. [Commissariat à l’Énergie Atomique (CEA) Saclay - Irfu, Gif-sur-Yvette (France); Berthoumieux, E. [Commissariat à l’Énergie Atomique (CEA) Saclay - Irfu, Gif-sur-Yvette (France); European Organization for Nuclear Research (CERN), Geneva (Switzerland); Billowes, J. [University of Manchester, Oxford Road, Manchester (United Kingdom); Boccone, V. [European Organization for Nuclear Research (CERN), Geneva (Switzerland); and others

    2014-04-11

    A fission reaction chamber based on Parallel Plate Avalanche Counters (PPACs) was built for measuring angular distributions of fragments emitted in neutron-induced fission of actinides at the neutron beam available at the Neutron Time-Of-Flight (n_TOF) facility at CERN. The detectors and the samples were tilted 45° with respect to the neutron beam direction to cover all the possible values of the emission angle of the fission fragments. The main features of this setup are discussed and results on the fission fragment angular distribution are provided for the ²³²Th(n,f) reaction around the fission threshold. The results are compared with the available data in the literature, demonstrating the good capabilities of this setup.

  12. De novo Assembly and Characterization of Cajanus scarabaeoides (L.) Thouars Transcriptome by Paired-End Sequencing

    Directory of Open Access Journals (Sweden)

    Deepti Nigam

    2017-07-01

    Pigeonpea [Cajanus cajan (L.) Millsp.] is a heat- and drought-resilient legume crop grown mostly in Asia and Africa. Pigeonpea is affected by various biotic (diseases and insect pests) and abiotic stresses (salinity and water logging), which limit the yield potential of this crop. However, resistance to all these constraints is not readily available in the cultivated genotypes, and some of the wild relatives have been found to withstand these stresses. Thus, the utilization of crop wild relatives (CWR) in pigeonpea breeding has been effective in conferring resistance, quality and breeding efficiency traits to this crop. Bud and leaf tissue of Cajanus scarabaeoides, a wild relative of pigeonpea, were used for transcriptome profiling. Approximately 30 million clean reads, filtered from raw reads by removal of adaptors, ambiguous reads and low-quality reads (3.02 gigabase pairs), were generated by Illumina paired-end RNA-seq technology. All of these clean reads were pooled and assembled de novo into 117,007 transcripts using Trinity. Finally, a total of 98,664 unigenes were derived with a mean length of 396 bp and an N50 value of 1393. The assembly produced significant mapping results (73.68%) in BLASTN searches of the Glycine max CDS sequence database (Ensembl). Further, the UniProt database of Viridiplantae was used for unigene annotation; 81,799 of 98,664 (82.90%) unigenes were finally annotated with gene descriptions or conserved protein domains. Further, a total of 23,475 SSRs were identified in 27,321 unigenes. These data will provide useful information for mining of functionally important genes and SSR markers for pigeonpea improvement.
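
    The gap between the mean unigene length (396 bp) and the N50 reported above follows from how N50 is defined: it is the length of the sequence at which the running total, taken from the longest sequence downwards, first reaches half of the total assembled bases. The sketch below is a generic illustration of that definition on made-up lengths; it is not the script used in the study, and the function name `n50` is an assumption.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or unigene lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # half of all bases reached
            return length


# Toy data (hypothetical lengths in bp, not taken from the paper).
toy_lengths = [100, 200, 300, 400, 500]
print("mean:", sum(toy_lengths) / len(toy_lengths))
print("N50:", n50(toy_lengths))
```

    Because long sequences contribute more bases than short ones, N50 is usually larger than the mean when an assembly contains many short unigenes.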

  13. Differentiation of mycoplasmalike organisms (MLOs) in European fruit trees by PCR using specific primers derived from the sequence of a chromosomal fragment of the apple proliferation MLO.

    Science.gov (United States)

    Jarausch, W; Saillard, C; Dosba, F; Bové, J M

    1994-01-01

    A 1.8-kb chromosomal DNA fragment of the mycoplasmalike organism (MLO) associated with apple proliferation was sequenced. Three putative open reading frames were observed on this fragment. The protein encoded by open reading frame 2 shows significant homologies with bacterial nitroreductases. From the nucleotide sequence four primer pairs for PCR were chosen to specifically amplify DNA from MLOs associated with European diseases of fruit trees. Primer pairs specific for (i) Malus-affecting MLOs, (ii) Malus- and Prunus-affecting MLOs, and (iii) Malus-, Prunus-, and Pyrus-affecting MLOs were obtained. Restriction enzyme analysis of the amplification products revealed restriction fragment length polymorphisms between Malus-, Prunus-, and Pyrus-affecting MLOs as well as between different isolates of the apple proliferation MLO. No amplification with any primer pair could be obtained with DNA from 12 different MLOs experimentally maintained in periwinkle. PMID:7916180
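
    Restriction fragment length polymorphism, as used on the amplification products above, rests on a simple idea: a sequence difference that creates or destroys a recognition site changes the set of fragment lengths after digestion. The sketch below illustrates this in silico for a linear amplicon. The sequences are invented, the helper `digest` is hypothetical, and EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example, not because the abstract names it.

```python
def digest(seq, site, cut_offset):
    """Fragment lengths after cutting a linear sequence at every site occurrence.

    cut_offset is the number of bases into the site where the cut falls.
    """
    seq, site = seq.upper(), site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length pieces that arise when a cut falls at a sequence end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Two made-up 16-bp "amplicons" that differ by one base inside the site.
isolate_a = "TTTT" + "GAATTC" + "CCCCCC"
isolate_b = "TTTT" + "GAGTTC" + "CCCCCC"

print(digest(isolate_a, "GAATTC", 1))  # site present: two fragments
print(digest(isolate_b, "GAATTC", 1))  # site lost: one uncut fragment
```

    On a gel the first isolate would show two bands and the second a single band, which is the kind of pattern difference an RFLP analysis reads out.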

  14. Construction of high resolution genetic linkage maps to improve the soybean genome sequence assembly Glyma1.01

    Science.gov (United States)

    A landmark in soybean research, Glyma1.01, the first whole genome sequence of variety Williams 82 (Glycine max L. Merr.) was completed in 2010 and is widely used. However, because the assembly was primarily built based on the linkage maps constructed with a limited number of markers and recombinant...

  15. Characterisation of Toxoplasma gondii isolates using polymerase chain reaction (PCR) and restriction fragment length polymorphism (RFLP) of the non-coding Toxoplasma gondii (TGR)-gene sequences

    DEFF Research Database (Denmark)

    Høgdall, Estrid; Vuust, Jens; Lind, Peter

    2000-01-01

    of using TGR gene variants as markers to distinguish among T. gondii isolates from different animals and different geographical sources. Based on the band patterns obtained by restriction fragment length polymorphism (RFLP) analysis of the polymerase chain reaction (PCR) amplified TGR sequences, the T...

  16. Characterization of transcriptome dynamics during watermelon fruit development: sequencing, assembly, annotation and gene expression profiles.

    Science.gov (United States)

    Guo, Shaogui; Liu, Jingan; Zheng, Yi; Huang, Mingyun; Zhang, Haiying; Gong, Guoyi; He, Hongju; Ren, Yi; Zhong, Silin; Fei, Zhangjun; Xu, Yong

    2011-09-21

    Cultivated watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai var. lanatus] is an important agricultural crop worldwide. The fruit of watermelon undergoes distinct stages of development with dramatic changes in its size, color, sweetness, texture and aroma. In order to better understand the genetic and molecular basis of these changes and significantly expand the watermelon transcript catalog, we have selected four critical stages of watermelon fruit development and used Roche/454 next-generation sequencing technology to generate a large expressed sequence tag (EST) dataset and a comprehensive transcriptome profile for watermelon fruit flesh tissues. We performed a half Roche/454 GS-FLX run for each of the four watermelon fruit developmental stages (immature white, white-pink flesh, red flesh and over-ripe) and obtained 577,023 high-quality ESTs with an average length of 302.8 bp. De novo assembly of these ESTs together with 11,786 watermelon ESTs collected from GenBank produced 75,068 unigenes with a total length of approximately 31.8 Mb. Overall, 54.9% of the unigenes showed significant similarities to known sequences in the GenBank non-redundant (nr) protein database, and around two-thirds of them matched proteins of cucumber, the most closely related species with a sequenced genome. The unigenes were further assigned gene ontology (GO) terms and mapped to biochemical pathways. More than 5,000 SSRs were identified from the EST collection. Furthermore, we carried out digital gene expression analysis of these ESTs and identified 3,023 genes that were differentially expressed during watermelon fruit development and ripening, which provided novel insights into watermelon fruit biology and a comprehensive resource of candidate genes for future functional analysis. We then generated profiles of several interesting metabolites that are important to fruit quality including pigmentation and sweetness. Integrative analysis of metabolite and digital gene expression

  17. Next-Generation Sequencing of the Chrysanthemum nankingense (Asteraceae) Transcriptome Permits Large-Scale Unigene Assembly and SSR Marker Discovery

    Science.gov (United States)

    Wang, Haibin; Jiang, Jiafu; Chen, Sumei; Qi, Xiangyu; Peng, Hui; Li, Pirui; Song, Aiping; Guan, Zhiyong; Fang, Weimin; Liao, Yuan; Chen, Fadi

    2013-01-01

    Background: Simple sequence repeats (SSRs) are ubiquitous in eukaryotic genomes. Chrysanthemum is one of the largest genera in the Asteraceae family. Only a few Chrysanthemum expressed sequence tag (EST) sequences have been acquired to date, so the number of available EST-SSR markers is very low. Methodology/Principal Findings: Illumina paired-end sequencing technology produced over 53 million sequencing reads from C. nankingense mRNA. The subsequent de novo assembly yielded 70,895 unigenes, of which 45,789 (64.59%) showed similarity to sequences in the NCBI database. Out of 45,789 sequences, 107 have hits to the Chrysanthemum Nr protein database; 679 and 277 sequences have hits to the database of Helianthus and Lactuca species, respectively. MISA software identified a large number of putative EST-SSRs, allowing 1,788 primer pairs to be designed from the de novo transcriptome sequence and a further 363 from archival EST sequence. Among 100 primer pairs randomly chosen, 81 markers have amplicons and 20 are polymorphic for genotype analysis in Chrysanthemum. The results showed that most (but not all) of the assays were transferable across species and that they exposed a significant amount of allelic diversity. Conclusions/Significance: SSR markers acquired by transcriptome sequencing are potentially useful for marker-assisted breeding and genetic analysis in the genus Chrysanthemum and its related genera. PMID:23626799
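
    The SSR mining step mentioned above amounts to scanning each unigene for short motifs repeated in tandem a minimum number of times. The sketch below shows the idea with regular expressions. It is not MISA and does not reproduce its options or output format; the thresholds in `MIN_REPEATS` and the function name `find_ssrs` are assumptions chosen for illustration.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS = {2: 6, 3: 5, 4: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT)."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Made-up sequence containing an (AG)7 and a (CAT)5 repeat.
toy = "GGCC" + "AG" * 7 + "TTGACC" + "CAT" * 5 + "GG"
print(find_ssrs(toy))
```

    Primer design and polymorphism screening, as described in the abstract, would then start from the flanking sequence around each reported position.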

  18. Predicting Post-Translational Modifications from Local Sequence Fragments Using Machine Learning Algorithms: Overview and Best Practices.

    Science.gov (United States)

    Tatjewski, Marcin; Kierczak, Marcin; Plewczynski, Dariusz

    2017-01-01

    Here, we present two perspectives on the task of predicting post translational modifications (PTMs) from local sequence fragments using machine learning algorithms. The first is the description of the fundamental steps required to construct a PTM predictor from the very beginning. These steps include data gathering, feature extraction, or machine-learning classifier selection. The second part of our work contains the detailed discussion of more advanced problems which are encountered in PTM prediction task. Probably the most challenging issues which we have covered here are: (1) how to address the training data class imbalance problem (we also present statistics describing the problem); (2) how to properly set up cross-validation folds with an approach which takes into account the homology of protein data records, to address this problem we present our folds-over-clusters algorithm; and (3) how to efficiently reach for new sources of learning features. Presented techniques and notes resulted from intense studies in the field, performed by our and other groups, and can be useful both for researchers beginning in the field of PTM prediction and for those who want to extend the repertoire of their research techniques.
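
    Point (2) above concerns a general pitfall: if homologous proteins land in different cross-validation folds, the test fold is no longer independent of the training data. The abstract does not spell out the folds-over-clusters algorithm, so the sketch below is only one plausible reading of the idea, under the assumption that records have already been grouped into homology clusters: whole clusters are assigned to folds, largest first, each into the currently smallest fold. All names and the toy data are hypothetical.

```python
from collections import defaultdict


def folds_over_clusters(cluster_of, n_folds):
    """Map each record id to a fold so that no cluster is split across folds.

    cluster_of: dict mapping record id -> homology cluster id.
    """
    members = defaultdict(list)
    for record, cluster in cluster_of.items():
        members[cluster].append(record)

    fold_sizes = [0] * n_folds
    fold_of = {}
    # Largest clusters first; ties broken by cluster id for reproducibility.
    for cluster in sorted(members, key=lambda c: (-len(members[c]), str(c))):
        target = fold_sizes.index(min(fold_sizes))  # currently smallest fold
        for record in members[cluster]:
            fold_of[record] = target
        fold_sizes[target] += len(members[cluster])
    return fold_of


# Toy input: seven protein records in four homology clusters.
toy_clusters = {
    "P1": "a", "P2": "a", "P3": "a",
    "P4": "b", "P5": "b",
    "P6": "c",
    "P7": "d",
}
print(folds_over_clusters(toy_clusters, n_folds=2))
```

    The greedy assignment keeps fold sizes roughly balanced; it does not balance the positive and negative classes, which the abstract treats as a separate problem.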

  19. Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis.

    Directory of Open Access Journals (Sweden)

    Rami A Dalloul

    2010-09-01

    A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.1 Gb) includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, and comparing avian to mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in number and variety of genes of the avian immune system where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species with agricultural, ecological, and evolutionary interest.

  20. The sequence d(CGGCGGCCGC) self-assembles into a two dimensional rhombic DNA lattice

    International Nuclear Information System (INIS)

    Venkadesh, S.; Mandal, P.K.; Gautham, N.

    2011-01-01

    Highlights: → This is the first crystal structure of a four-way junction with sticky ends. → Four junction structures bind to each other and form a rhombic cavity. → Each rhombus binds to others to form 'infinite' 2D tiles. → This is an example of bottom-up fabrication of a DNA nano-lattice. Abstract: We report here the crystal structure of the partially self-complementary decameric sequence d(CGGCGGCCGC), which self-assembles to form a four-way junction with sticky ends. Each junction binds to four others through Watson-Crick base pairing at the sticky ends to form a rhombic structure. The rhombuses bind to each other and form two-dimensional tiles. The tiles stack to form the crystal. The crystal diffracted in the space group P1 to a resolution of 2.5 Å. The junction has the anti-parallel stacked-X conformation like other junction structures, though the formation of the rhombic net noticeably alters the details of the junction geometry.
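
    What "partially self-complementary" means for a sequence such as d(CGGCGGCCGC) can be checked directly: a strand is fully self-complementary when it equals its own reverse complement. The sketch below performs that check on the decamer and on one of its substrings. It only illustrates the sequence property; it makes no claim about which bases actually pair in the reported crystal structure, and the function names are assumptions.

```python
# Watson-Crick complement table for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_self_complementary(seq):
    """True if the strand can pair with an identical copy over its full length."""
    return seq.upper() == reverse_complement(seq)


decamer = "CGGCGGCCGC"
print(reverse_complement(decamer))
print(is_self_complementary(decamer))      # full decamer
print(is_self_complementary(decamer[2:]))  # last eight bases only
```

    In this sketch the full decamer is not its own reverse complement, while its last eight bases are, which is consistent with the description of the sequence as partially self-complementary.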

  1. Phylogenetic similarity of the canine parvovirus wild-type isolates on the basis of VP1/VP2 gene fragment sequence analysis.

    Science.gov (United States)

    Rypul, K; Chmielewski, R; Smielewska-Loś, E; Klimentowski, S

    2002-04-01

    Biological material was taken from dogs with diarrhoea. Faecal samples were taken from within live animals and intestinal tract fragments (i.e. small intestine, and stomach) were taken from dead animals. In total, 18 specimens were investigated from dogs housed alone or in large groups. To test for the presence of the virus, latex (On Site Biotech, Uppsala, Sweden) and direct immunofluorescence tests were performed. At the same time, polymerase chain reaction (PCR) with primers complementary to a conservative region of VP1/VP2 was carried out. The products of amplification were analysed on 2% agarose gel. The purified products were cloned with the Template Generation System (Finnzymes, Espoo, Finland) using a transposition reaction and positive clones were searched using the 'colony screening by PCR' method. The sequencing gave 12 sequences of VP1/VP2 gene fragments that were of high similarity. Among the 12 analysed sequences, six exhibited 88% similarity, four exhibited 100% similarity and two exhibited 71% similarity.

  2. AFLP fragment isolation technique as a method to produce random sequences for single nucleotide polymorphism discovery in the green turtle, Chelonia mydas.

    Science.gov (United States)

    Roden, Suzanne E; Dutton, Peter H; Morin, Phillip A

    2009-01-01

    The green sea turtle, Chelonia mydas, was used as a case study for single nucleotide polymorphism (SNP) discovery in a species that has little genetic sequence information available. As green turtles have a complex population structure, additional nuclear markers other than microsatellites could add to our understanding of their complex life history. Amplified fragment length polymorphism technique was used to generate sets of random fragments of genomic DNA, which were then electrophoretically separated with precast gels, stained with SYBR green, excised, and directly sequenced. It was possible to perform this method without the use of polyacrylamide gels, radioactive or fluorescent labeled primers, or hybridization methods, reducing the time, expense, and safety hazards of SNP discovery. Within 13 loci, 2547 base pairs were screened, resulting in the discovery of 35 SNPs. Using this method, it was possible to yield a sufficient number of loci to screen for SNP markers without the availability of prior sequence information.

  3. Sequence Identification, Recombinant Production, and Analysis of the Self-Assembly of Egg Stalk Silk Proteins from Lacewing Chrysoperla carnea.

    Science.gov (United States)

    Neuenfeldt, Martin; Scheibel, Thomas

    2017-06-13

    Egg stalk silks of the common green lacewing Chrysoperla carnea likely comprise at least three different silk proteins. Based on the natural spinning process, it was hypothesized that these proteins self-assemble without shear stress, as adult lacewings do not use a spinneret. To examine this, the first sequence identification and determination of the gene expression profile of several silk proteins and various transcript variants thereof was conducted, and then the three major proteins were recombinantly produced in Escherichia coli encoded by their native complementary DNA (cDNA) sequences. Circular dichroism measurements indicated that the silk proteins in aqueous solutions had a mainly intrinsically disordered structure. The largest silk protein, which we named ChryC1, exhibited a lower critical solution temperature (LCST) behavior and self-assembled into fibers or film morphologies, depending on the conditions used. The second silk protein, ChryC2, self-assembled into nanofibrils and subsequently formed hydrogels. Circular dichroism and Fourier transform infrared spectroscopy confirmed conformational changes of both proteins into beta sheet rich structures upon assembly. ChryC3 did not self-assemble into any morphology under the tested conditions. Thereby, through this work, it could be shown that recombinant lacewing silk proteins can be produced and further used for studying the fiber formation of lacewing egg stalks.

  4. Protection against β-amyloid neurotoxicity by a non-toxic endogenous N-terminal β-amyloid fragment and its active hexapeptide core sequence.

    Science.gov (United States)

    Forest, Kelly H; Alfulaij, Naghum; Arora, Komal; Taketa, Ruth; Sherrin, Tessi; Todorovic, Cedomir; Lawrence, James L M; Yoshikawa, Gene T; Ng, Ho-Leung; Hruby, Victor J; Nichols, Robert A

    2018-01-01

    High levels (μM) of beta amyloid (Aβ) oligomers are known to trigger neurotoxic effects, leading to synaptic impairment, behavioral deficits, and apoptotic cell death. The hydrophobic C-terminal domain of Aβ, together with sequences critical for oligomer formation, is essential for this neurotoxicity. However, Aβ at low levels (pM-nM) has been shown to function as a positive neuromodulator and this activity resides in the hydrophilic N-terminal domain of Aβ. An N-terminal Aβ fragment (1-15/16), found in cerebrospinal fluid, was also shown to be a highly active neuromodulator and to reverse Aβ-induced impairments of long-term potentiation. Here, we show the impact of this N-terminal Aβ fragment and a shorter hexapeptide core sequence in the Aβ fragment (Aβcore: 10-15) to protect or reverse Aβ-induced neuronal toxicity, fear memory deficits and apoptotic death. The neuroprotective effects of the N-terminal Aβ fragment and Aβcore on Aβ-induced changes in mitochondrial function, oxidative stress, and apoptotic neuronal death were demonstrated via mitochondrial membrane potential, live reactive oxygen species, DNA fragmentation and cell survival assays using a model neuroblastoma cell line (differentiated NG108-15) and mouse hippocampal neuron cultures. The protective action of the N-terminal Aβ fragment and Aβcore against spatial memory processing deficits in amyloid precursor protein/PSEN1 (5XFAD) mice was demonstrated in contextual fear conditioning. Stabilized derivatives of the N-terminal Aβcore were also shown to be fully protective against Aβ-triggered oxidative stress. Together, these findings indicate an endogenous neuroprotective role for the N-terminal Aβ fragment, while active stabilized N-terminal Aβcore derivatives offer the potential for therapeutic application. © 2017 International Society for Neurochemistry.

  5. An efficient approach to BAC based assembly of complex genomes.

    Science.gov (United States)

    Visendi, Paul; Berkman, Paul J; Hayashi, Satomi; Golicz, Agnieszka A; Bayer, Philipp E; Ruperao, Pradeep; Hurgobin, Bhavna; Montenegro, Juan; Chan, Chon-Kit Kenneth; Staňková, Helena; Batley, Jacqueline; Šimková, Hana; Doležel, Jaroslav; Edwards, David

    2016-01-01

    There has been an exponential growth in the number of genome sequencing projects since the introduction of next generation DNA sequencing technologies. Genome projects have increasingly involved assembly of whole genome data which produces inferior assemblies compared to traditional Sanger sequencing of genomic fragments cloned into bacterial artificial chromosomes (BACs). While whole genome shotgun sequencing using next generation sequencing (NGS) is relatively fast and inexpensive, this method is extremely challenging for highly complex genomes, where polyploidy or high repeat content confounds accurate assembly, or where a highly accurate 'gold' reference is required. Several attempts have been made to improve genome sequencing approaches by incorporating NGS methods, to variable success. We present the application of a novel BAC sequencing approach which combines indexed pools of BACs, Illumina paired read sequencing, a sequence assembler specifically designed for complex BAC assembly, and a custom bioinformatics pipeline. We demonstrate this method by sequencing and assembling BAC cloned fragments from bread wheat and sugarcane genomes. We demonstrate that our assembly approach is accurate, robust, cost effective and scalable, with applications for complete genome sequencing in large and complex genomes.

  6. mPUMA: a computational approach to microbiota analysis by de novo assembly of operational taxonomic units based on protein-coding barcode sequences.

    Science.gov (United States)

    Links, Matthew G; Chaban, Bonnie; Hemmingsen, Sean M; Muirhead, Kevin; Hill, Janet E

    2013-08-15

    Formation of operational taxonomic units (OTU) is a common approach to data aggregation in microbial ecology studies based on amplification and sequencing of individual gene targets. The de novo assembly of OTU sequences has been recently demonstrated as an alternative to widely used clustering methods, providing robust information from experimental data alone, without any reliance on an external reference database. Here we introduce mPUMA (microbial Profiling Using Metagenomic Assembly, http://mpuma.sourceforge.net), a software package for identification and analysis of protein-coding barcode sequence data. It was developed originally for Cpn60 universal target sequences (also known as GroEL or Hsp60). Using an unattended process that is independent of external reference sequences, mPUMA forms OTUs by DNA sequence assembly and is capable of tracking OTU abundance. mPUMA processes microbial profiles both in terms of the direct DNA sequence as well as in the translated amino acid sequence for protein coding barcodes. By forming OTUs and calculating abundance through an assembly approach, mPUMA is capable of generating inputs for several popular microbiota analysis tools. Using SFF data from sequencing of a synthetic community of Cpn60 sequences derived from the human vaginal microbiome, we demonstrate that mPUMA can faithfully reconstruct all expected OTU sequences and produce compositional profiles consistent with actual community structure. mPUMA enables analysis of microbial communities while empowering the discovery of novel organisms through OTU assembly.

  7. Transcriptome Sequencing, De Novo Assembly and Differential Gene Expression Analysis of the Early Development of Acipenser baeri.

    Directory of Open Access Journals (Sweden)

    Wei Song

    The molecular mechanisms that drive the development of the endangered fossil fish species Acipenser baeri are difficult to study due to the lack of genomic data. Recent advances in sequencing technologies and the falling cost of sequencing offer exclusive opportunities for exploring important molecular mechanisms underlying specific biological processes. This manuscript describes the large-scale sequencing and analysis of mRNA from Acipenser baeri collected at five developmental time points using the Illumina HiSeq 2000 platform. The sequencing reads were assembled de novo and clustered into 278,167 unigenes, of which 57,346 (20.62%) had 45,837 known homologous proteins in UniProt protein databases, while 11,509 proteins matched at least one sequence of the assembled unigenes. The remaining 79.38% of unigenes could represent non-coding unigenes or unigenes specific to A. baeri. A total of 43,062 unigenes were annotated into functional categories via Gene Ontology (GO) annotation, whereas 29,526 unigenes were associated with 329 pathways by mapping to the KEGG database. Subsequently, 3,479 differentially expressed genes were identified across developmental stages and clustered into 50 gene expression profiles. Genes preferentially expressed at each stage were also identified. Through GO and KEGG pathway enrichment analysis, relevant physiological variations during the early development of A. baeri could be better understood. Accordingly, the present study gives insights into the transcriptome profile of the early development of A. baeri, and the information contained in this large-scale transcriptome will provide substantial references for A. baeri developmental biology and promote its aquaculture research.

  8. Sequencing and assembly of low copy and genic regions of isolated Triticum aestivum chromosome arm 7DS

    Czech Academy of Sciences Publication Activity Database

    Berkman, O. J.; Skarshewski, A.; Lorenc, M. T.; Kubaláková, Marie; Šimková, Hana; Batley, J.; Doležel, Jaroslav; Edwards, D.

    2011-01-01

    Vol. 9, No. 7 (2011), pp. 768-775 ISSN 1467-7644 R&D Projects: GA ČR GA521/07/1573; GA MŠk ED0007/01/01 Institutional research plan: CEZ:AV0Z50380511 Keywords: wheat genome sequence; chromosome 7; genome assembly Subject RIV: EB - Genetics; Molecular Biology Impact factor: 5.442, year: 2011

  9. De novo assembly, gene annotation and marker development using Illumina paired-end transcriptome sequences in celery (Apium graveolens L.).

    Directory of Open Access Journals (Sweden)

    Nan Fu

    BACKGROUND: Celery is an increasingly popular vegetable species, but limited transcriptome and genomic data hinder research on it. In addition, a lack of celery molecular markers limits the process of molecular genetic breeding. High-throughput transcriptome sequencing is an efficient method to generate a large transcriptome sequence dataset for gene discovery, molecular marker development and marker-assisted selection breeding. PRINCIPAL FINDINGS: Celery transcriptomes from four tissues were sequenced using Illumina paired-end sequencing technology. De novo assembly was performed to generate a collection of 42,280 unigenes (average length of 502.6 bp) that represent the first transcriptome of the species. 78.43% and 48.93% of the unigenes had significant similarity with proteins in the National Center for Biotechnology Information (NCBI) non-redundant protein database (Nr) and the Swiss-Prot database, respectively, and 10,473 (24.77%) unigenes were assigned to Clusters of Orthologous Groups (COG). 21,126 (49.97%) unigenes harboring InterPro domains were annotated, of which 15,409 (36.45%) were assigned to Gene Ontology (GO) categories. Additionally, 7,478 unigenes were mapped onto 228 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG). Large numbers of simple sequence repeats (SSRs) were identified, and then the rates of successful amplification and polymorphism were investigated among 31 celery accessions. CONCLUSIONS: This study demonstrates the feasibility of generating a large scale of sequence information by Illumina paired-end sequencing and efficient assembling. Our results provide a valuable resource for celery research. The developed molecular markers are the foundation of further genetic linkage analysis and gene localization, and they will be essential to accelerate the process of breeding.

  10. Self-assembled monolayers of semi-fluorinated thiols and disulfides with a potentially antibacterial terminal fragment on gold surfaces

    International Nuclear Information System (INIS)

    Thebault, P.; Taffin de Givenchy, E.; Guittard, F.; Guimon, C.; Geribaldi, S.

    2008-01-01

    Attempts to elaborate the best organized cationic self-assembled monolayers (SAMs) with sulfur derivatives containing potentially bactericidal quaternary ammonium salt moieties have been performed on gold, with the final aim of obtaining contact-active antibacterial surfaces. Four molecules bearing two hydrocarbon spacers with different lengths between the sulfur atom and the quaternized nitrogen atom, and two different terminal semi-fluorinated alkyl chains, have been synthesised and used in order to evaluate their capacity for leading to the highest densities and the highest organization of potentially active molecules on the metal surface. The formation and quality of SAMs, characterized by X-ray photoelectron spectroscopy, Internal Reflexion Infra Red Imaging, contact angle and blocking factor measurements, depend on the lengths of both the hydrocarbon spacer and the terminal perfluorinated chain.

  11. De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

    Science.gov (United States)

    Nowrousian, Minou; Stajich, Jason E; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D; Pöggeler, Stefanie; Read, Nick D; Seiler, Stephan; Smith, Kristina M; Zickler, Denise; Kück, Ulrich; Freitag, Michael

    2010-04-08

    Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for

  12. De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

    Directory of Open Access Journals (Sweden)

    Minou Nowrousian

    2010-04-01

    Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data
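
    As a hedged illustration of two figures quoted in this abstract (not code from the study), the sketch below computes an N50 from a list of contig lengths and checks the read-volume arithmetic. The contig lengths and the function name `n50` are hypothetical; only the 85-fold, 10-fold and 40 Mb values come from the abstract.

```python
def n50(lengths):
    """Return the N50: the contig length at which the running total of
    contigs, sorted longest first, reaches half of the assembly size."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs given")


# Hypothetical contig lengths, for illustration only.
toy_contigs = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs)

# Read volume implied by the abstract: (85 + 10)-fold coverage of a 40 Mb genome.
genome_size_bp = 40_000_000
total_bases = (85 + 10) * genome_size_bp  # 3.8e9 bp, i.e. roughly 4 Gb

print(toy_n50, total_bases)
```

    The coverage product comes to 3.8 Gb, which is consistent with the "approximately 4 Gb" stated in the abstract.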

  13. Evidence for Sequence Scrambling and Divergent H/D Exchange Reactions of Doubly-Charged Isobaric b-Type Fragment Ions

    Science.gov (United States)

    Zekavat, Behrooz; Miladi, Mahsan; Al-Fdeilat, Abdullah H.; Somogyi, Arpad; Solouki, Touradj

    2014-02-01

    To date, only a limited number of reports are available on structural variants of multiply-charged b-fragment ions. We report on observed bimodal gas-phase hydrogen/deuterium exchange (HDX) reaction kinetics and patterns for substance P b10 2+ that point to presence of isomeric structures. We also compare HDX reactions, post-ion mobility/collision-induced dissociation (post-IM/CID), and sustained off-resonance irradiation-collision induced dissociation (SORI-CID) of substance P b10 2+ and a cyclic peptide with an identical amino acid (AA) sequence order to substance P b10. The observed HDX patterns and reaction kinetics and SORI-CID pattern for the doubly charged head-to-tail cyclized peptide were different from either of the presumed isomers of substance P b10 2+, suggesting that b10 2+ may not exist exclusively as a head-to-tail cyclized structure. Ultra-high mass measurement accuracy was used to assign identities of the observed SORI-CID fragment ions of substance P b10 2+; over 30 % of the observed SORI-CID fragment ions from substance P b10 2+ had rearranged (scrambled) AA sequences. Moreover, post-IM/CID experiments revealed the presence of two conformer types for substance P b10 2+, whereas only one conformer type was observed for the head-to-tail cyclized peptide. We also show that AA sequence scrambling from CID of doubly-charged b-fragment ions is not unique to substance P b10 2+.

  14. Evidence for sequence scrambling and divergent H/D exchange reactions of doubly-charged isobaric b-type fragment ions.

    Science.gov (United States)

    Zekavat, Behrooz; Miladi, Mahsan; Al-Fdeilat, Abdullah H; Somogyi, Arpad; Solouki, Touradj

    2014-02-01

    To date, only a limited number of reports are available on structural variants of multiply-charged b-fragment ions. We report on observed bimodal gas-phase hydrogen/deuterium exchange (HDX) reaction kinetics and patterns for substance P b10(2+) that point to presence of isomeric structures. We also compare HDX reactions, post-ion mobility/collision-induced dissociation (post-IM/CID), and sustained off-resonance irradiation-collision induced dissociation (SORI-CID) of substance P b10(2+) and a cyclic peptide with an identical amino acid (AA) sequence order to substance P b10. The observed HDX patterns and reaction kinetics and SORI-CID pattern for the doubly charged head-to-tail cyclized peptide were different from either of the presumed isomers of substance P b10(2+), suggesting that b10(2+) may not exist exclusively as a head-to-tail cyclized structure. Ultra-high mass measurement accuracy was used to assign identities of the observed SORI-CID fragment ions of substance P b10(2+); over 30% of the observed SORI-CID fragment ions from substance P b10(2+) had rearranged (scrambled) AA sequences. Moreover, post-IM/CID experiments revealed the presence of two conformer types for substance P b10(2+), whereas only one conformer type was observed for the head-to-tail cyclized peptide. We also show that AA sequence scrambling from CID of doubly-charged b-fragment ions is not unique to substance P b10(2+).

  15. Lactobacillus strain diversity based on partial hsp60 gene sequences and design of PCR-restriction fragment length polymorphism assays for species identification and differentiation.

    Science.gov (United States)

    Blaiotta, Giuseppe; Fusco, Vincenzina; Ercolini, Danilo; Aponte, Maria; Pepe, Olimpia; Villani, Francesco

    2008-01-01

    A phylogenetic tree showing diversities among 116 partial (499-bp) Lactobacillus hsp60 (groEL, encoding a 60-kDa heat shock protein) nucleotide sequences was obtained and compared to those previously described for 16S rRNA and tuf gene sequences. The topology of the tree produced in this study showed a Lactobacillus species distribution similar, but not identical, to those previously reported. However, according to the most recent systematic studies, a clear differentiation of 43 single-species clusters was detected/identified among the sequences analyzed. The slightly higher variability of the hsp60 nucleotide sequences than of the 16S rRNA sequences offers better opportunities to design or develop molecular assays allowing identification and differentiation of either distant or very closely related Lactobacillus species. Therefore, our results suggest that hsp60 can be considered an excellent molecular marker for inferring the taxonomy and phylogeny of members of the genus Lactobacillus and that the chosen primers can be used in a simple PCR procedure allowing the direct sequencing of the hsp60 fragments. Moreover, in this study we performed a computer-aided restriction endonuclease analysis of all 499-bp hsp60 partial sequences and we showed that the PCR-restriction fragment length polymorphism (RFLP) patterns obtainable by using both endonucleases AluI and TacI (in separate reactions) can allow identification and differentiation of all 43 Lactobacillus species considered, with the exception of the pair L. plantarum/L. pentosus. However, the latter species can be differentiated by further analysis with Sau3AI or MseI. The hsp60 PCR-RFLP approach was efficiently applied to identify and to differentiate a total of 110 wild Lactobacillus strains (including closely related species, such as L. casei and L. rhamnosus or L. plantarum and L. pentosus) isolated from cheese and dry-fermented sausages.
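
    The computer-aided restriction analysis described in this abstract can be sketched roughly as follows. This is an assumption-laden illustration, not the authors' procedure: the sequences are invented, the function name `fragment_lengths` is hypothetical, and TacI is left out because its recognition site is not given in the abstract. The recognition sites used for AluI (AG^CT), Sau3AI (^GATC) and MseI (T^TAA) are the standard ones.

```python
import re

# Recognition site and cut offset within the site (top strand).
ENZYMES = {
    "AluI": ("AGCT", 2),    # AG^CT
    "Sau3AI": ("GATC", 0),  # ^GATC
    "MseI": ("TTAA", 1),    # T^TAA
}


def fragment_lengths(seq, enzyme):
    """Return the fragment lengths (longest first) from a complete digest
    of a linear sequence with one enzyme."""
    site, offset = ENZYMES[enzyme]
    # A lookahead pattern finds every occurrence, including overlapping ones.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq.upper())]
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length pieces that arise when a cut falls at a sequence end.
    return sorted((b - a for a, b in zip(bounds, bounds[1:]) if b > a), reverse=True)


# Two invented 16-nt "alleles" that differ by one base inside an AluI site.
allele_1 = "CCCCAGCTCCCCCCCC"
allele_2 = "CCCCAGGTCCCCCCCC"
print(fragment_lengths(allele_1, "AluI"))
print(fragment_lengths(allele_2, "AluI"))
```

    Two sequences are distinguishable by a given enzyme when their fragment-length lists differ, which is the basis of a PCR-RFLP assay.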

  16. Sequencing and de novo assembly of the Asian clam (Corbicula fluminea) transcriptome using the Illumina GAIIx method.

    Directory of Open Access Journals (Sweden)

    Huihui Chen

    BACKGROUND: The Asian clam (Corbicula fluminea) is currently one of the most economically important aquatic species in China and has been used as a test organism in many environmental studies. However, the lack of genomic resources, such as a sequenced genome, expressed sequence tags (ESTs) and transcriptome sequences, has hindered research on C. fluminea. Recent advances in large-scale RNA-Seq enable generation of genomic resources in a short time, and provide large expression datasets for functional genomic analysis. METHODOLOGY/PRINCIPAL FINDINGS: We used a next-generation high-throughput DNA sequencing technique with an Illumina GAIIx method to analyze the transcriptome from the whole bodies of C. fluminea. More than 62,250,336 high-quality reads were generated based on the raw data, and 134,684 unigenes with a mean length of 791 bp were assembled using the Velvet and Oases software. All of the assembled unigenes were annotated by running BLASTx and BLASTn similarity searches on the Nt, Nr, Swiss-Prot, COG and KEGG databases. In addition, the Clusters of Orthologous Groups (COGs), Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations were also assigned to each unigene transcript. To provide a preliminary verification of the assembly and annotation results, and search for potential environmental pollution biomarkers, 15 functional genes (five antioxidase genes, two cytochrome P450 genes, three GABA receptor-related genes and five heat shock protein genes) were cloned and identified. Expressions of the 15 selected genes following fluoxetine exposure confirmed that the genes are indeed linked to environmental stress. CONCLUSIONS/SIGNIFICANCE: The C. fluminea transcriptome advances the underlying molecular understanding of this freshwater clam, provides a basis for further exploration of C. fluminea as an environmental test organism and promotes further studies on other bivalve organisms.

  17. Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using Sanger and next generation sequencing platforms: development and applications.

    Directory of Open Access Journals (Sweden)

    Himabindu Kudapa

    A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprising 46,369 transcript assembly contigs (TACs), has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as to the chickpea genome. Comparative analysis of CaTA v2 against transcriptomes of three legumes - Medicago, soybean and common bean, resulted in 27,771 TACs common to all three legumes indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups were validated on five chickpea genotypes and showed 20% polymorphism with average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding

  18. Draft Genome Sequences of 12 Dry-Heat-Resistant Bacillus Strains Isolated from the Cleanrooms Where the Viking Spacecraft Were Assembled.

    Science.gov (United States)

    Seuylemezian, Arman; Cooper, Kerry; Schubert, Wayne; Vaishampayan, Parag

    2018-03-22

    Spore-forming microorganisms are of concern for forward contamination because they can survive harsh interplanetary travel. Here, we report the draft genome sequences of 12 spore-forming strains isolated from the Manned Spacecraft Operations Building (MSOB) and the Vehicle Assembly Building (VAB) in Cape Canaveral, FL, where the Viking spacecraft were assembled. Copyright © 2018 Seuylemezian et al.

  19. Mathematical model and metaheuristics for simultaneous balancing and sequencing of a robotic mixed-model assembly line

    Science.gov (United States)

    Li, Zixiang; Janardhanan, Mukund Nilakantan; Tang, Qiuhua; Nielsen, Peter

    2018-05-01

    This article presents the first method to simultaneously balance and sequence robotic mixed-model assembly lines (RMALB/S), which involves three sub-problems: task assignment, model sequencing and robot allocation. A new mixed-integer programming model is developed to minimize makespan and, using CPLEX solver, small-size problems are solved for optimality. Two metaheuristics, the restarted simulated annealing algorithm and co-evolutionary algorithm, are developed and improved to address this NP-hard problem. The restarted simulated annealing method replaces the current temperature with a new temperature to restart the search process. The co-evolutionary method uses a restart mechanism to generate a new population by modifying several vectors simultaneously. The proposed algorithms are tested on a set of benchmark problems and compared with five other high-performing metaheuristics. The proposed algorithms outperform their original editions and the benchmarked methods. The proposed algorithms are able to solve the balancing and sequencing problem of a robotic mixed-model assembly line effectively and efficiently.
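
    The restart mechanism mentioned in this abstract, where the cooled temperature is replaced with a new one so the search can continue, can be sketched generically as below. This is not the authors' RMALB/S model or algorithm: the toy objective (sum of adjacent differences in a sequence), the swap move, and all parameter values and names (`restarted_sa`, `swap_two`, `adjacent_cost`) are assumptions chosen only to make the idea runnable.

```python
import math
import random


def adjacent_cost(order):
    """Toy objective: sum of absolute differences between neighbours."""
    return sum(abs(a - b) for a, b in zip(order, order[1:]))


def swap_two(order, rng):
    """Toy neighbourhood move: swap two randomly chosen positions."""
    i, j = rng.sample(range(len(order)), 2)
    new = list(order)
    new[i], new[j] = new[j], new[i]
    return tuple(new)


def restarted_sa(cost, neighbor, start, t0=10.0, cooling=0.9, t_min=0.01,
                 restarts=3, seed=0):
    """Simulated annealing that resets the temperature to t0 after each
    cooling run, keeping the current solution and the best one found."""
    rng = random.Random(seed)
    current = best = start
    for _ in range(restarts):
        temp = t0  # restart: the cooled temperature is replaced by a fresh one
        while temp > t_min:
            cand = neighbor(current, rng)
            delta = cost(cand) - cost(current)
            # Accept improvements always, worsening moves with Boltzmann probability.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current = cand
                if cost(current) < cost(best):
                    best = current
            temp *= cooling
    return best


start = (5, 1, 4, 2, 3)
best = restarted_sa(adjacent_cost, swap_two, start)
print(best, adjacent_cost(best))
```

    Because the best solution is tracked separately from the current one, a restart can never make the returned result worse than the starting sequence.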

  20. Gene Discovery in the Apicomplexa as Revealed by EST Sequencing and Assembly of a Comparative Gene Database

    Science.gov (United States)

    Li, Li; Brunk, Brian P.; Kissinger, Jessica C.; Pape, Deana; Tang, Keliang; Cole, Robert H.; Martin, John; Wylie, Todd; Dante, Mike; Fogarty, Steven J.; Howe, Daniel K.; Liberator, Paul; Diaz, Carmen; Anderson, Jennifer; White, Michael; Jerome, Maria E.; Johnson, Emily A.; Radke, Jay A.; Stoeckert, Christian J.; Waterston, Robert H.; Clifton, Sandra W.; Roos, David S.; Sibley, L. David

    2003-01-01

    Large-scale EST sequencing projects for several important parasites within the phylum Apicomplexa were undertaken for the purpose of gene discovery. Included were several parasites of medical importance (Plasmodium falciparum, Toxoplasma gondii) and others of veterinary importance (Eimeria tenella, Sarcocystis neurona, and Neospora caninum). A total of 55,192 ESTs, deposited into dbEST/GenBank, were included in the analyses. The resulting sequences have been clustered into nonredundant gene assemblies and deposited into a relational database that supports a variety of sequence and text searches. This database has been used to compare the gene assemblies using BLAST similarity comparisons to the public protein databases to identify putative genes. Of these new entries, ∼15%–20% represent putative homologs with a conservative cutoff of p [...] PMID:12618375

  1. Human Contamination in Public Genome Assemblies.

    Science.gov (United States)

    Kryukov, Kirill; Imanishi, Tadashi

    2016-01-01

    Contamination in a genome assembly can lead to wrong or confusing results when the genome is used as a reference in sequence comparison. Although bacterial contamination is well known, the problem of human-originated contamination has received little attention. In this study we surveyed 45,735 available genome assemblies for evidence of human contamination. We used lineage specificity to distinguish between contamination and conservation. We found that 154 genome assemblies contain fragments that with high confidence originate as contamination from human DNA. The majority of contaminating human sequences had been present in the reference human genome assembly for over a decade. We recommend that existing contaminated genomes should be revised to remove contaminated sequence, and that new assemblies should be thoroughly checked for presence of human DNA before submitting them to public databases.

  2. Allelic sequence variations in the hypervariable region of a T-cell receptor β chain: Correlation with restriction fragment length polymorphism in human families and populations

    International Nuclear Information System (INIS)

    Robinson, M.A.

    1989-01-01

    Direct sequence analysis of the human T-cell antigen receptor (TCR) V β1 variable gene identified a single base-pair allelic variation (C/G) located within the coding region. This change results in substitution of a histidine (CAC) for a glutamine (CAG) at position 48 of the TCR β chain, a position predicted to be in the TCR antigen binding site. The V β1 polymorphism was found by DNA sequence analysis of V β1 genes from seven unrelated individuals; V β1 genes were amplified by the polymerase chain reaction, the amplified fragments were cloned into M13 phage vectors, and sequences were determined. To determine the inheritance patterns of the V β1 substitution and to test correlation with V β1 restriction fragment length polymorphism detected with Pvu II and Taq I, allele-specific oligonucleotides were constructed and used to characterize amplified DNA samples. Seventy unrelated individuals and six families were tested for both restriction fragment length polymorphism and for the V β1 substitution. The correlation was also tested using amplified, size-selected, Pvu II- and Taq I-digested DNA samples from heterozygotes. Pvu II allele 1 (61/70) and Taq I allele 1 (66/70) were found to be correlated with the substitution giving rise to a histidine at position 48. Because there are exceptions to the correlation, the use of specific probes to characterize allelic forms of TCR variable genes will provide important tools for studies of basic TCR genetics and disease associations.

  3. Toward allotetraploid cotton genome assembly: integration of a high-density molecular genetic linkage map with DNA sequence information

    Science.gov (United States)

    2012-01-01

    Background: Cotton is the world’s most important natural textile fiber and a significant oilseed crop. Decoding cotton genomes will provide the ultimate reference and resource for research and utilization of the species. Integration of high-density genetic maps with genomic sequence information will largely accelerate the process of whole-genome assembly in cotton. Results: In this paper, we update a high-density interspecific genetic linkage map of allotetraploid cultivated cotton. An additional 1,167 marker loci have been added to our previously published map of 2,247 loci. Three new marker types, InDel (insertion-deletion) and SNP (single nucleotide polymorphism) developed from gene information, and REMAP (retrotransposon-microsatellite amplified polymorphism), were used to increase map density. The updated map consists of 3,414 loci in 26 linkage groups covering 3,667.62 cM with an average inter-locus distance of 1.08 cM. Furthermore, genome-wide sequence analysis was finished using 3,324 informative sequence-based markers and publicly-available Gossypium DNA sequence information. A total of 413,113 EST and 195 BAC sequences were physically anchored and clustered by 3,324 sequence-based markers. Of these, 14,243 ESTs and 188 BACs from different species of Gossypium were clustered and specifically anchored to the high-density genetic map. A total of 2,748 candidate unigenes from 2,111 ESTs clusters and 63 BACs were mined for functional annotation and classification. The 337 ESTs/genes related to fiber quality traits were integrated with 132 previously reported cotton fiber quality quantitative trait loci, which demonstrated the important roles in fiber quality of these genes. Higher-level sequence conservation between different cotton species and between the A- and D-subgenomes in tetraploid cotton was found, indicating a common evolutionary origin for orthologous and paralogous loci in Gossypium. Conclusion: This study will serve as a valuable genomic resource
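
    The map statistics quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the abstract; the interpretation that the average inter-locus distance is taken over intervals within linkage groups (loci minus number of groups) is an assumption, adopted here because it reproduces the stated 1.08 cM.

```python
# Values quoted in the abstract.
previous_loci = 2_247
added_loci = 1_167
linkage_groups = 26
map_length_cm = 3_667.62

total_loci = previous_loci + added_loci

# Assumption: each linkage group with n loci contributes n - 1 intervals,
# so the map as a whole has (total loci - number of groups) intervals.
intervals = total_loci - linkage_groups
mean_interval_cm = map_length_cm / intervals

print(total_loci, intervals, round(mean_interval_cm, 2))
```

    Dividing the map length by the raw locus count instead gives about 1.07 cM, so the interval-based reading appears to be the one behind the published figure.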

  4. Comprehensive evaluation of non-hybrid genome assembly tools for third-generation PacBio long-read sequence data.

    Science.gov (United States)

    Jayakumar, Vasanthan; Sakakibara, Yasubumi

    2017-11-03

    Long reads obtained from third-generation sequencing platforms can help overcome the long-standing challenge of the de novo assembly of sequences for the genomic analysis of non-model eukaryotic organisms. Numerous long-read-aided de novo assemblies have been published recently, which exhibited superior quality of the assembled genomes in comparison with those achieved using earlier second-generation sequencing technologies. Evaluating assemblies is important in guiding the appropriate choice for specific research needs. In this study, we evaluated 10 long-read assemblers using a variety of metrics on Pacific Biosciences (PacBio) data sets from different taxonomic categories with considerable differences in genome size. The results allowed us to narrow down the list to a few assemblers that can be effectively applied to eukaryotic assembly projects. Moreover, we highlight how best to use limited genomic resources for effectively evaluating the genome assemblies of non-model organisms. © The Author 2017. Published by Oxford University Press.

  5. Transcriptome sequencing of lentil based on second-generation technology permits large-scale unigene assembly and SSR marker discovery

    Directory of Open Access Journals (Sweden)

    Materne Michael

    2011-05-01

    Background: Lentil (Lens culinaris Medik.) is a cool-season grain legume which provides a rich source of protein for human consumption. In terms of genomic resources, lentil is relatively underdeveloped, in comparison to other Fabaceae species, with limited available data. There is hence a significant need to enhance such resources in order to identify novel genes and alleles for molecular breeding to increase crop productivity and quality. Results: Tissue-specific cDNA samples from six distinct lentil genotypes were sequenced using Roche 454 GS-FLX Titanium technology, generating c. 1.38 × 10^6 expressed sequence tags (ESTs). De novo assembly generated a total of 15,354 contigs and 68,715 singletons. The complete unigene set was sequence-analysed against genome drafts of the model legume species Medicago truncatula and Arabidopsis thaliana to identify 12,639, and 7,476 unique matches, respectively. When compared to the genome of Glycine max, a total of 20,419 unique hits were observed corresponding to c. 31% of the known gene space. A total of 25,592 lentil unigenes were subsequently annotated from GenBank. Simple sequence repeat (SSR)-containing ESTs were identified from consensus sequences and a total of 2,393 primer pairs were designed. A subset of 192 EST-SSR markers was screened for validation across a panel of 12 cultivated lentil genotypes and one wild relative species. A total of 166 primer pairs obtained successful amplification, of which 47.5% detected genetic polymorphism. Conclusions: A substantial collection of ESTs has been developed from sequence analysis of lentil genotypes using second-generation technology, permitting unigene definition across a broad range of functional categories. As well as providing resources for functional genomics studies, the unigene set has permitted significant enhancement of the number of publicly-available molecular genetic markers as tools for improvement of this species.

  6. Harnessing NGS and Big Data Optimally: Comparison of miRNA Prediction from Assembled versus Non-assembled Sequencing Data--The Case of the Grass Aegilops tauschii Complex Genome.

    Science.gov (United States)

    Budak, Hikmet; Kantar, Melda

    2015-07-01

    MicroRNAs (miRNAs) are small, endogenous, non-coding RNA molecules that regulate gene expression at the post-transcriptional level. As high-throughput next generation sequencing (NGS) and Big Data rapidly accumulate for various species, efforts for in silico identification of miRNAs intensify. Surprisingly, the effect of the input genomics sequence on the robustness of miRNA prediction was not evaluated in detail to date. In the present study, we performed a homology-based miRNA and isomiRNA prediction of the 5D chromosome of bread wheat progenitor, Aegilops tauschii, using two distinct sequence data sets as input: (1) raw sequence reads obtained from 454-GS FLX Titanium sequencing platform and (2) an assembly constructed from these reads. We also compared this method with a number of available plant sequence datasets. We report here the identification of 62 and 22 miRNAs from raw reads and the assembly, respectively, of which 16 were predicted with high confidence from both datasets. While raw reads promoted sensitivity with the high number of miRNAs predicted, 55% (12 out of 22) of the assembly-based predictions were supported by previous observations, bringing specificity forward compared to the read-based predictions, of which only 37% were supported. Importantly, raw reads could identify several repeat-related miRNAs that could not be detected with the assembly. However, raw reads could not capture 6 miRNAs, for which the stem-loops could only be covered by the relatively longer sequences from the assembly. In summary, the comparison of miRNA datasets obtained by these two strategies revealed that utilization of raw reads, as well as assemblies for in silico prediction, have distinct advantages and disadvantages. Consideration of these important nuances can benefit future miRNA identification efforts in the current age of NGS and Big Data driven life sciences innovation.

  7. Genotyping of major histocompatibility complex Class II DRB gene in Rohilkhandi goats by polymerase chain reaction-restriction fragment length polymorphism and DNA sequencing

    Directory of Open Access Journals (Sweden)

    Kush Shrivastava

    2015-10-01

    Aim: To study the major histocompatibility complex (MHC) Class II DRB1 gene polymorphism in Rohilkhandi goat using polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) and nucleotide sequencing techniques. Materials and Methods: DNA was isolated from 127 Rohilkhandi goats maintained at sheep and goat farm, Indian Veterinary Research Institute, Izatnagar, Bareilly. A 284 bp fragment of exon 2 of DRB1 gene was amplified and digested using BsaI and TaqI restriction enzymes. Population genetic parameters were calculated using Popgene v 1.32 and SAS 9.0. The genotypes were then sequenced using Sanger dideoxy chain termination method and were compared with related breeds/species using MEGA 6.0 and Megalign (DNASTAR) software. Results: TaqI locus showed three and BsaI locus showed two genotypes. Both the loci were found to be in Hardy–Weinberg equilibrium (HWE); however, population genetic parameters suggest that heterozygosity is still maintained in the population at both loci. Percent diversity and divergence matrix, as well as phylogenetic analysis revealed that the MHC Class II DRB1 gene of Rohilkhandi goats was found to be in close cluster with Garole and Scottish blackface sheep breeds as compared to other goat breeds included in the sequence comparison. Conclusion: The PCR-RFLP patterns showed the population to be in HWE and the absence of one genotype at one locus (BsaI). Both loci showed an excess of one or the other homozygote genotype; however, the effective number of alleles showed that allelic diversity is present in the population. Sequence comparison of the DRB1 gene of Rohilkhandi goat with other sheep and goat breeds placed the Rohilkhandi goat as divergent from Jamanupari and Angora goats.
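
    A Hardy-Weinberg equilibrium check of the kind reported in this abstract is commonly done with a chi-square goodness-of-fit test on genotype counts at a biallelic locus. The sketch below is a textbook version, not the Popgene or SAS procedure the authors ran; the genotype counts are invented (chosen only to sum to the 127 animals mentioned), and the function name `hwe_chi_square` is hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (frequency of allele A, chi-square statistic) for a
    biallelic locus, comparing observed genotype counts with
    Hardy-Weinberg expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2


# Invented genotype counts for 127 animals; not data from the study.
p, chi2 = hwe_chi_square(40, 60, 27)
# With one degree of freedom, chi2 below 3.84 means no significant
# departure from equilibrium at the 0.05 level.
print(round(p, 3), round(chi2, 3), chi2 < 3.84)
```

    The same function accepts a zero count for one genotype, as would apply to a locus where only two of the three genotypes are observed.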

  8. Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale

    DEFF Research Database (Denmark)

    Liu, Siyang; Huang, Shujia; Rao, Junhua

    2015-01-01

    present a novel approach implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies up to nucleotide resolution. Application of AsmVar to several human de novo genome......) as well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We...... assemblies captures a wide spectrum of structural variants and novel sequences present in the human population in high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction...

  9. Population genetic analysis of shotgun assemblies of genomic sequences from multiple individuals

    DEFF Research Database (Denmark)

    Hellmann, Ines; Mang, Yuan; Gu, Zhiping

    2008-01-01

    We introduce a simple, broadly applicable method for obtaining estimates of nucleotide diversity from genomic shotgun sequencing data. The method takes into account the special nature of these data: random sampling of genomic segments from one or more individuals and a relatively high error rate...... for individual reads. Applying this method to data from the Celera human genome sequencing and SNP discovery project, we obtain estimates of nucleotide diversity in windows spanning the human genome and show that the diversity to divergence ratio is reduced in regions of low recombination. Furthermore, we show...

  10. The chaperonin-60 universal target is a barcode for bacteria that enables de novo assembly of metagenomic sequence data.

    Science.gov (United States)

    Links, Matthew G; Dumonceaux, Tim J; Hemmingsen, Sean M; Hill, Janet E

    2012-01-01

    Barcoding with molecular sequences is widely used to catalogue eukaryotic biodiversity. Studies investigating the community dynamics of microbes have relied heavily on gene-centric metagenomic profiling using two genes (16S rRNA and cpn60) to identify and track Bacteria. While there have been criteria formalized for barcoding of eukaryotes, these criteria have not been used to evaluate gene targets for other domains of life. Using the framework of the International Barcode of Life we evaluated DNA barcodes for Bacteria. Candidates from the 16S rRNA gene and the protein coding cpn60 gene were evaluated. Within complete bacterial genomes in the public domain representing 983 species from 21 phyla, the largest difference between median pairwise inter- and intra-specific distances ("barcode gap") was found from cpn60. Distribution of sequence diversity along the ∼555 bp cpn60 target region was remarkably uniform. The barcode gap of the cpn60 universal target facilitated the faithful de novo assembly of full-length operational taxonomic units from pyrosequencing data from a synthetic microbial community. Analysis supported the recognition of both 16S rRNA and cpn60 as DNA barcodes for Bacteria. The cpn60 universal target was found to have a much larger barcode gap than 16S rRNA suggesting cpn60 as a preferred barcode for Bacteria. A large barcode gap for cpn60 provided a robust target for species-level characterization of data. The assembly of consensus sequences for barcodes was shown to be a reliable method for the identification and tracking of novel microbes in metagenomic studies.
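
    The "barcode gap" defined in this abstract, the difference between median pairwise inter-specific and intra-specific distances, can be sketched as follows. The distance measure used here (proportion of differing sites between aligned sequences), the toy sequences, and the names `p_distance` and `barcode_gap` are assumptions for illustration; the abstract does not say which distance the authors used.

```python
from itertools import combinations
from statistics import median


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(seqs_by_species):
    """Median inter-specific distance minus median intra-specific distance."""
    labelled = [(sp, s) for sp, seqs in seqs_by_species.items() for s in seqs]
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(labelled, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return median(inter) - median(intra)


# Invented 8-nt "barcodes" for two hypothetical species.
toy = {
    "species_A": ["ACGTACGT", "ACGTACGA"],
    "species_B": ["TCGAACCT", "TCGAACCA"],
}
gap = barcode_gap(toy)
print(gap)
```

    A larger gap means sequences from different species are more cleanly separated from within-species variation, which is the property the abstract reports for cpn60 relative to 16S rRNA.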

  11. Genome Sequence, Assembly and Characterization of Two Metschnikowia fructicola Strains Used as Biocontrol Agents of Postharvest Diseases

    Directory of Open Access Journals (Sweden)

    Edoardo Piombo

    2018-04-01

    The yeast Metschnikowia fructicola was reported as an efficient biological control agent of postharvest diseases of fruits and vegetables, and it is the basis of the commercial formulated product “Shemer.” Several mechanisms of action by which M. fructicola inhibits postharvest pathogens were suggested including iron-binding compounds, induction of defense signaling genes, production of fungal cell wall degrading enzymes and relatively high amounts of superoxide anions. We assembled the whole genome sequence of two strains of M. fructicola using PacBio and Illumina shotgun sequencing technologies. Using PacBio, a high-quality draft genome consisting of 93 contigs, with an estimated genome size of approximately 26 Mb, was obtained. Comparative analysis of M. fructicola proteins with the other three available closely related genomes revealed a shared core of homologous proteins coded by 5,776 genes. Comparing the genomes of the two M. fructicola strains using a SNP calling approach resulted in the identification of 564,302 homologous SNPs with 2,004 predicted high impact mutations. The size of the genome is exceptionally high when compared with those of available closely related organisms, and the high rate of homology among M. fructicola genes points toward a recent whole-genome duplication event as the cause of this large genome. Based on the assembled genome, sequences were annotated with a gene description and gene ontology (GO term) and clustered in functional groups. Analysis of CAZymes family genes revealed 1,145 putative genes, and transcriptomic analysis of CAZyme expression levels in M. fructicola during its interaction with either grapefruit peel tissue or Penicillium digitatum revealed a high level of CAZyme gene expression when the yeast was placed in wounded fruit tissue.

  12. Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

    Science.gov (United States)

    Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

    2010-05-07

    Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.

  13. De novo assembly and next-generation sequencing to analyse full-length gene variants from codon-barcoded libraries.

    Science.gov (United States)

    Cho, Namjin; Hwang, Byungjin; Yoon, Jung-ki; Park, Sangun; Lee, Joongoo; Seo, Han Na; Lee, Jeewon; Huh, Sunghoon; Chung, Jinsoo; Bang, Duhee

    2015-09-21

    Interpreting epistatic interactions is crucial for understanding evolutionary dynamics of complex genetic systems and unveiling structure and function of genetic pathways. Although high resolution mapping of en masse variant libraries enables molecular biologists to address genotype-phenotype relationships, long-read sequencing technology remains indispensable to assess functional relationships between mutations that lie far apart. Here, we introduce JigsawSeq for multiplexed sequence identification of pooled gene variant libraries by combining a codon-based molecular barcoding strategy and de novo assembly of short-read data. We first validated JigsawSeq on small sub-pools and observed high precision and recall at various experimental settings. With extensive simulations, we then applied JigsawSeq to large-scale gene variant libraries to show that our method can be reliably scaled using next-generation sequencing. JigsawSeq may serve as a rapid screening tool for functional genomics and offer the opportunity to explore evolutionary trajectories of protein variants.

  14. IG and TR single chain fragment variable (scFv) sequence analysis: a new advanced functionality of IMGT/V-QUEST and IMGT/HighV-QUEST.

    Science.gov (United States)

    Giudicelli, Véronique; Duroux, Patrice; Kossida, Sofia; Lefranc, Marie-Paule

    2017-06-26

    IMGT®, the international ImMunoGeneTics information system® ( http://www.imgt.org ), was created in 1989 in Montpellier, France (CNRS and Montpellier University) to manage the huge and complex diversity of the antigen receptors, and is at the origin of immunoinformatics, a science at the interface between immunogenetics and bioinformatics. Immunoglobulins (IG) or antibodies and T cell receptors (TR) are managed and described in the IMGT® databases and tools at the level of receptor, chain and domain. The analysis of the IG and TR variable (V) domain rearranged nucleotide sequences is performed by IMGT/V-QUEST (online since 1997, 50 sequences per batch) and, for next generation sequencing (NGS), by IMGT/HighV-QUEST, the high throughput version of IMGT/V-QUEST (portal begun in 2010, 500,000 sequences per batch). In vitro combinatorial libraries of engineered antibody single chain Fragment variable (scFv) which mimic the in vivo natural diversity of the immune adaptive responses are extensively screened for the discovery of novel antigen binding specificities. However, the analysis of NGS full length scFv (~850 bp) represents a challenge as they contain two V domains connected by a linker and there is no tool for the analysis of two V domains in a single chain. The functionality "Analysis of single chain Fragment variable (scFv)" has been implemented in IMGT/V-QUEST and, for NGS, in IMGT/HighV-QUEST for the analysis of the two V domains of IG and TR scFv. It proceeds in five steps: search for a first closest V-REGION, full characterization of the first V-(D)-J-REGION, then search for a second V-REGION and full characterization of the second V-(D)-J-REGION, and finally linker delimitation. For each sequence or NGS read, positions of the 5'V-DOMAIN, linker and 3'V-DOMAIN in the scFv are provided in the 'V-orientated' sense. Each V-DOMAIN is fully characterized (gene identification, sequence description, junction analysis, characterization of mutations and amino

  15. TAGUCHI METHOD FOR THREE-STAGE ASSEMBLY FLOW SHOP SCHEDULING PROBLEM WITH BLOCKING AND SEQUENCE-DEPENDENT SET UP TIMES

    Directory of Open Access Journals (Sweden)

    AREF MALEKI-DARONKOLAEI

    2013-10-01

    Full Text Available This article considers a three-stage assembly flowshop scheduling problem minimizing the weighted sum of mean completion time and makespan with sequence-dependent setup times at the first stage and blocking times between each stage. To tackle such an NP-hard problem, two meta-heuristic algorithms are presented. The novelty of our approach is to develop a variable neighborhood search (VNS) algorithm and a well-known simulated annealing (SA) algorithm for the problem. Furthermore, to enhance the performance of SA, its parameters are optimized using the Taguchi method, whereas VNS has only one parameter, which was set without Taguchi. The computational results show that the proposed VNS is better than SA in mean and standard deviation for all problem sizes, but SA outperforms VNS in CPU time.
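
    As an illustration of the simulated annealing idea mentioned in this record, the following is a minimal sketch only. It assumes a much simpler model than the article's three-stage problem: a single machine with sequence-dependent setup times and no blocking. The function names, the swap neighborhood, and the default parameters (t0, cooling, steps) are hypothetical choices, not the authors' Taguchi-tuned settings.

```python
import math
import random


def schedule_cost(order, proc, setup, weight=0.5):
    """Weighted sum of mean completion time and makespan (single machine)."""
    t = 0.0
    completions = []
    prev = None
    for job in order:
        if prev is not None:
            t += setup[prev][job]  # sequence-dependent setup before this job
        t += proc[job]
        completions.append(t)
        prev = job
    mean_completion = sum(completions) / len(completions)
    makespan = completions[-1]
    return weight * mean_completion + (1 - weight) * makespan


def simulated_annealing(proc, setup, t0=10.0, cooling=0.95, steps=2000, seed=0):
    """Search job permutations by random swaps with a cooling temperature."""
    rng = random.Random(seed)
    current = list(range(len(proc)))
    current_cost = schedule_cost(current, proc, setup)
    best, best_cost = current[:], current_cost
    temp = t0
    for _ in range(steps):
        i, j = rng.sample(range(len(current)), 2)
        cand = current[:]
        cand[i], cand[j] = cand[j], cand[i]
        cand_cost = schedule_cost(cand, proc, setup)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        temp = max(temp * cooling, 1e-9)
    return best, best_cost


if __name__ == "__main__":
    # Toy data invented for this example.
    proc = [3.0, 2.0, 4.0, 1.0]
    setup = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
    print(simulated_annealing(proc, setup))
```

    A parameter study such as the Taguchi design described in the abstract would vary values like t0 and cooling over a small orthogonal set of runs and compare the resulting costs.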

  16. Rational Design of High-Number dsDNA Fragments Based on Thermodynamics for the Construction of Full-Length Genes in a Single Reaction.

    Science.gov (United States)

    Birla, Bhagyashree S; Chou, Hui-Hsien

    2015-01-01

    Gene synthesis is frequently used in modern molecular biology research either to create novel genes or to obtain natural genes when the synthesis approach is more flexible and reliable than cloning. DNA chemical synthesis has limits on both its length and yield, thus full-length genes have to be hierarchically constructed from synthesized DNA fragments. Gibson Assembly and its derivatives are the simplest methods to assemble multiple double-stranded DNA fragments. Currently, up to 12 dsDNA fragments can be assembled at once with Gibson Assembly according to its vendor. In practice, the number of dsDNA fragments that can be assembled in a single reaction are much lower. We have developed a rational design method for gene construction that allows high-number dsDNA fragments to be assembled into full-length genes in a single reaction. Using this new design method and a modified version of the Gibson Assembly protocol, we have assembled 3 different genes from up to 45 dsDNA fragments at once. Our design method uses the thermodynamic analysis software Picky that identifies all unique junctions in a gene where consecutive DNA fragments are specifically made to connect to each other. Our novel method is generally applicable to most gene sequences, and can improve both the efficiency and cost of gene assembly.
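
    To make the idea of fragments that connect through specific junctions concrete, here is a small sketch under strong assumptions: fragments are cut at fixed lengths with a fixed overlap, and junction specificity is approximated by requiring that no two overlap sequences are identical. The abstract states that the authors use thermodynamic analysis with the Picky software; nothing below reproduces that analysis, and all function names are hypothetical.

```python
def split_with_overlaps(gene, fragment_len, overlap):
    """Cut a sequence into fragments where neighbors share `overlap` bases."""
    if not 0 < overlap < fragment_len:
        raise ValueError("overlap must be positive and shorter than fragment_len")
    step = fragment_len - overlap
    fragments = []
    start = 0
    while True:
        end = min(start + fragment_len, len(gene))
        fragments.append(gene[start:end])
        if end == len(gene):
            break
        start += step
    return fragments


def junctions_unique(fragments, overlap):
    """Crude specificity check: every junction overlap sequence is distinct."""
    junctions = [frag[:overlap] for frag in fragments[1:]]
    return len(junctions) == len(set(junctions))


def reassemble(fragments, overlap):
    """Join fragments in order, verifying each shared overlap."""
    gene = fragments[0]
    for frag in fragments[1:]:
        if gene[-overlap:] != frag[:overlap]:
            raise ValueError("adjacent fragments do not share the expected overlap")
        gene += frag[overlap:]
    return gene


if __name__ == "__main__":
    # Toy sequence invented for this example.
    gene = "ATGGCTAGCTAGGATCCGATTACAGGCTTAA"
    frags = split_with_overlaps(gene, fragment_len=12, overlap=4)
    print(frags, junctions_unique(frags, 4), reassemble(frags, 4) == gene)
```

    A real design would replace the exact-match uniqueness test with a thermodynamic score, since near-identical junctions can also cross-anneal.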

  17. Rational Design of High-Number dsDNA Fragments Based on Thermodynamics for the Construction of Full-Length Genes in a Single Reaction.

    Directory of Open Access Journals (Sweden)

    Bhagyashree S Birla

    Full Text Available Gene synthesis is frequently used in modern molecular biology research either to create novel genes or to obtain natural genes when the synthesis approach is more flexible and reliable than cloning. DNA chemical synthesis has limits on both its length and yield, thus full-length genes have to be hierarchically constructed from synthesized DNA fragments. Gibson Assembly and its derivatives are the simplest methods to assemble multiple double-stranded DNA fragments. Currently, up to 12 dsDNA fragments can be assembled at once with Gibson Assembly according to its vendor. In practice, the number of dsDNA fragments that can be assembled in a single reaction are much lower. We have developed a rational design method for gene construction that allows high-number dsDNA fragments to be assembled into full-length genes in a single reaction. Using this new design method and a modified version of the Gibson Assembly protocol, we have assembled 3 different genes from up to 45 dsDNA fragments at once. Our design method uses the thermodynamic analysis software Picky that identifies all unique junctions in a gene where consecutive DNA fragments are specifically made to connect to each other. Our novel method is generally applicable to most gene sequences, and can improve both the efficiency and cost of gene assembly.

  18. De novo assembly, gene annotation, and marker discovery in stored-product pest Liposcelis entomophila (Enderlein) using transcriptome sequences.

    Directory of Open Access Journals (Sweden)

    Dan-Dan Wei

    Full Text Available BACKGROUND: As a major stored-product pest insect, Liposcelis entomophila has developed high levels of resistance to various insecticides in grain storage systems. However, the molecular mechanisms underlying resistance and environmental stress have not been characterized. To date, there is a lack of genomic information for this species. Therefore, studies aimed at profiling the L. entomophila transcriptome would provide a better understanding of the biological functions at the molecular level. METHODOLOGY/PRINCIPAL FINDINGS: We applied Illumina sequencing technology to sequence the transcriptome of L. entomophila. A total of 54,406,328 clean reads were obtained and de novo assembled into 54,220 unigenes, with an average length of 571 bp. Through a similarity search, 33,404 (61.61%) unigenes were matched to known proteins in the NCBI non-redundant (Nr) protein database. These unigenes were further functionally annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of genes potentially involved in insecticide resistance were manually curated, including 68 putative cytochrome P450 genes, 37 putative glutathione S-transferase (GST) genes, 19 putative carboxyl/cholinesterase (CCE) genes, and 126 other transcripts containing target site sequences or encoding detoxification genes representing eight types of resistance enzymes. Furthermore, to gain insight into the molecular basis of the L. entomophila response to thermal stresses, 25 heat shock protein (Hsp) genes were identified. In addition, 1,100 SSRs and 57,757 SNPs were detected and 231 pairs of SSR primers were designed for investigating genetic diversity in the future. CONCLUSIONS/SIGNIFICANCE: We developed a comprehensive transcriptomic database for L. entomophila. These sequences and putative molecular markers would further promote our understanding of the molecular mechanisms underlying

  19. Draft sequencing and assembly of the genome of the world's largest fish, the whale shark: Rhincodon typus Smith 1828.

    Science.gov (United States)

    Read, Timothy D; Petit, Robert A; Joseph, Sandeep J; Alam, Md Tauqeer; Weil, M Ryan; Ahmad, Maida; Bhimani, Ravila; Vuong, Jocelyn S; Haase, Chad P; Webb, D Harry; Tan, Milton; Dove, Alistair D M

    2017-07-14

    The whale shark (Rhincodon typus) has by far the largest body size of any elasmobranch (shark or ray) species. Therefore, it is also the largest extant species of the paraphyletic assemblage commonly referred to as fishes. As both a phenotypic extreme and a member of the group Chondrichthyes - the sister group to the remaining gnathostomes, which includes all tetrapods and therefore also humans - its genome is of substantial comparative interest. Whale sharks are also listed as an endangered species on the International Union for Conservation of Nature's Red List of threatened species and are of growing popularity as both a target of ecotourism and as a charismatic conservation ambassador for the pelagic ecosystem. A genome map for this species would aid in defining effective conservation units and understanding global population structure. We characterised the nuclear genome of the whale shark using next generation sequencing (454, Illumina) and de novo assembly and annotation methods, based on material collected from the Georgia Aquarium. The data set consisted of 878,654,233 reads, which yielded a draft assembly of 1,213,200 contigs and 997,976 scaffolds. The estimated genome size was 3.44Gb. As expected, the proteome of the whale shark was most closely related to the only other complete genome of a cartilaginous fish, the holocephalan elephant shark. The whale shark contained a novel Toll-like-receptor (TLR) protein with sequence similarity to both the TLR4 and TLR13 proteins of mammals and TLR21 of teleosts. The data are publicly available on GenBank, FigShare, and from the NCBI Short Read Archive under accession number SRP044374. This represents the first shotgun elasmobranch genome and will aid studies of molecular systematics, biogeography, genetic differentiation, and conservation genetics in this and other shark species, as well as providing comparative data for studies of evolutionary biology and immunology across the jawed vertebrate lineages.

  20. Sequence-Dependent Self-Assembly and Structural Diversity of Islet Amyloid Polypeptide-Derived β-Sheet Fibrils

    International Nuclear Information System (INIS)

    Wang, Shih-Ting; Lin, Yiyang; Spencer, Ryan K.; Thomas, Michael R.; Nguyen, Andy I.

    2017-01-01

    Determining the structural origins of amyloid fibrillation is essential for understanding both the pathology of amyloidosis and the rational design of inhibitors to prevent or reverse amyloid formation. In this work, the decisive roles of peptide structures on amyloid self-assembly and morphological diversity were investigated by the design of eight amyloidogenic peptides derived from islet amyloid polypeptide. Among the segments, two distinct morphologies were highlighted in the form of twisted and planar (untwisted) ribbons with varied diameters, thicknesses, and lengths. In particular, transformation of amyloid fibrils from twisted ribbons into untwisted structures was triggered by substitution of the C-terminal serine with threonine, where the side chain methyl group was responsible for the distinct morphological change. This effect was confirmed following serine substitution with alanine and valine and was ascribed to the restriction of intersheet torsional strain through the increased hydrophobic interactions and hydrogen bonding. We also studied the variation of fibril morphology (i.e., association and helicity) and peptide aggregation propensity by increasing the hydrophobicity of the peptide side group, capping the N-terminus, and extending sequence length. Lastly, we anticipate that our insights into sequence-dependent fibrillation and morphological diversity will shed light on the structural interpretation of amyloidogenesis and development of structure-specific imaging agents and aggregation inhibitors.

  1. Next-Generation Sequencing of Genomic DNA Fragments Bound to a Transcription Factor in Vitro Reveals Its Regulatory Potential

    Directory of Open Access Journals (Sweden)

    Yukio Kurihara

    2014-12-01

    Full Text Available Several transcription factors (TFs) coordinate to regulate expression of specific genes at the transcriptional level. In Arabidopsis thaliana it is estimated that approximately 10% of all genes encode TFs or TF-like proteins. It is important to identify target genes that are directly regulated by TFs in order to understand the complete picture of a plant’s transcriptome profile. Here, we investigate the role of the LONG HYPOCOTYL5 (HY5) transcription factor that acts as a regulator of photomorphogenesis. We used an in vitro genomic DNA binding assay coupled with immunoprecipitation and next-generation sequencing (gDB-seq) instead of the in vivo chromatin immunoprecipitation (ChIP)-based methods. The results demonstrate that the HY5-binding motif predicted here was similar to the motif reported previously and that in vitro HY5-binding loci largely overlapped with the HY5-targeted candidate genes identified in previous ChIP-chip analysis. By combining these results with microarray analysis, we identified hundreds of HY5-binding genes that were differentially expressed in hy5. We also observed delayed induction of some transcripts of HY5-binding genes in hy5 mutants in response to blue-light exposure after dark treatment. Thus, an in vitro gDNA-binding assay coupled with sequencing is a convenient and powerful method to bridge the gap between identifying TF binding potential and establishing function.

  2. The carbohydrate-binding module (CBM)-like sequence is crucial for rice CWA1/BC1 function in proper assembly of secondary cell wall materials.

    Science.gov (United States)

    Sato, Kanna; Ito, Sachiko; Fujii, Takeo; Suzuki, Ryu; Takenouchi, Sachi; Nakaba, Satoshi; Funada, Ryo; Sano, Yuzou; Kajita, Shinya; Kitano, Hidemi; Katayama, Yoshihiro

    2010-11-01

    We recently reported that the cwa1 mutation disturbed the deposition and assembly of secondary cell wall materials in the cortical fiber of rice internodes. Genetic analysis revealed that cwa1 is allelic to bc1, which encodes a glycosylphosphatidylinositol (GPI)-anchored COBRA-like protein with the highest homology to Arabidopsis COBRA-like 4 (COBL4) and maize Brittle Stalk 2 (Bk2). Our results suggested that CWA1/BC1 plays a role in assembling secondary cell wall materials at appropriate sites, enabling synthesis of highly ordered secondary cell wall structure with solid and flexible internodes in rice. The N-terminal amino acid sequence of CWA1/BC1, as well as its orthologs (COBL4, Bk2) and other BC1-like proteins in rice, shows weak similarity to a family II carbohydrate-binding module (CBM2) of several bacterial cellulases. To investigate the importance of the CBM-like sequence of CWA1/BC1 in the assembly of secondary cell wall materials, Trp residues in the CBM-like sequence, which are important for carbohydrate binding, were substituted with Val residues and the mutated gene was introduced into the cwa1 mutant. CWA1/BC1 with the mutated sequence did not complement the abnormal secondary cell walls seen in the cwa1 mutant, indicating that the CBM-like sequence is essential for the proper function of CWA1/BC1, including assembly of secondary cell wall materials.

  3. Characterization of Erwinia amylovora strains from different host plants using repetitive-sequences PCR analysis, and restriction fragment length polymorphism and short-sequence DNA repeats of plasmid pEA29.

    Science.gov (United States)

    Barionovi, D; Giorgi, S; Stoeger, A R; Ruppitsch, W; Scortichini, M

    2006-05-01

    The three main aims of the study were the assessment of the genetic relationship between a deviating Erwinia amylovora strain isolated from Amelanchier sp. (Maloideae) grown in Canada and other strains from Maloideae and Rosoideae, the investigation of the variability of the PstI fragment of the pEA29 plasmid using restriction fragment length polymorphism (RFLP) analysis and the determination of the number of short-sequence DNA repeats (SSR) by DNA sequence analysis in representative strains. Ninety-three strains obtained from 12 plant genera and different geographical locations were examined by repetitive-sequences PCR using Enterobacterial Repetitive Intergenic Consensus, BOX and Repetitive Extragenic Palindromic primer sets. Upon the unweighted pair group method with arithmetic mean analysis, a deviating strain from Amelanchier sp. was analysed using amplified ribosomal DNA restriction analysis (ARDRA) analysis and the sequencing of the 16S rDNA gene. This strain showed 99% similarity to other E. amylovora strains in the 16S gene and the same banding pattern with ARDRA. The RFLP analysis of pEA29 plasmid using MspI and Sau3A restriction enzymes showed a higher variability than that previously observed and no clear-cut grouping of the strains was possible. The number of SSR units reiterated two to 12 times. The strains obtained from pear orchards showing for the first time symptoms of fire blight had a low number of SSR units. The strains from Maloideae exhibit a wider genetic variability than previously thought. The RFLP analysis of a fragment of the pEA29 plasmid would not seem a reliable method for typing E. amylovora strains. A low number of SSR units was observed with first epidemics of fire blight. The current detection techniques are mainly based on the genetic similarities observed within the strains from the cultivated tree-fruit crops. For a more reliable detection of the fire blight pathogen also in wild and ornamentals Rosaceous plants the genetic

  4. DNA-PK dependent targeting of DNA-ends to a protein complex assembled on matrix attachment region DNA sequences

    International Nuclear Information System (INIS)

    Mauldin, S.K.; Getts, R.C.; Perez, M.L.; DiRienzo, S.; Stamato, T.D.

    2003-01-01

    Full text: We find that nuclear protein extracts from mammalian cells contain an activity that allows DNA ends to associate with circular pUC18 plasmid DNA. This activity requires the catalytic subunit of DNA-PK (DNA-PKcs) and Ku since it was not observed in mutants lacking Ku or DNA-PKcs but was observed when purified Ku/DNA-PKcs was added to these mutant extracts. Competition experiments between pUC18 and pUC18 plasmids containing various nuclear matrix attachment region (MAR) sequences suggest that DNA ends preferentially associate with plasmids containing MAR DNA sequences. At a 1:5 mass ratio of MAR to pUC18, approximately equal amounts of DNA end binding to the two plasmids were observed, while at a 1:1 ratio no pUC18 end-binding was observed. Calculation of relative binding activities indicates that DNA-end binding activities to MAR sequences was 7 to 21 fold higher than pUC18. Western analysis of proteins bound to pUC18 and MAR plasmids indicates that XRCC4, DNA ligase IV, scaffold attachment factor A, topoisomerase II, and poly(ADP-ribose) polymerase preferentially associate with the MAR plasmid in the absence or presence of DNA ends. In contrast, Ku and DNA-PKcs were found on the MAR plasmid only in the presence of DNA ends. After electroporation of a 32P-labeled DNA probe into human cells and cell fractionation, 87% of the total intercellular radioactivity remained in nuclei after a 0.5M NaCl extraction suggesting the probe was strongly bound in the nucleus. The above observations raise the possibility that DNA-PK targets DNA-ends to a repair and/or DNA damage signaling complex which is assembled on MAR sites in the nucleus

  5. Paleoproterozoic (ca. 1.8 Ga) arc magmatism in the Lützow-Holm Complex, East Antarctica: Implications for crustal growth and terrane assembly in erstwhile Gondwana fragments

    Science.gov (United States)

    Takahashi, Kazuki; Tsunogae, Toshiaki; Santosh, M.; Takamura, Yusuke; Tsutsumi, Yukiyasu

    2018-05-01

    lithological data from the region, suggest that the LHC can be divided into three units: Neoarchean (ca. 2.5 Ga) unit in the southern LHC (Shirase Orthogneiss or "Shirase microcontinent"), Neoproterozoic (ca. 1.0 Ga) unit in the northern LHC, and supracrustal unit in the central LHC with fragments of Paleoproterozoic (ca. 1.8 Ga) and minor Neoarchean (ca. 2.5 Ga) and Neoproterozoic (ca. 1.0 Ga) magmatic arcs. The 1.8 Ga arc magmatism inferred in this study has also been reported from adjacent Gondwana fragments such as the Highland Complex in Sri Lanka, and the Trivandrum and Nagercoil Blocks in southern India. Although the ca. 1.8 Ga arc-magmatic event is coeval in these regions, the Paleoproterozoic supracrustal unit in the central LHC may not be contiguous with those in the Highland Complex of Sri Lanka because recent studies have shown that the Vijayan Complex in Sri Lanka and the ca. 1.0 Ga northern LHC possibly were part of a single crustal unit (northern Lützow-Holm-Vijayan Complex) within the Kalahari Block. The supracrustal unit possibly marks part of a discrete suture formed by the collision of the ca. 2.5 Ga southern LHC (Shirase microcontinent) and the ca. 1.0 Ga northern Lützow-Holm-Vijayan Complex during the latest Neoproterozoic-Cambrian Gondwana amalgamation, which might be coeval with the collision of the Vijayan and Wanni Complexes and the formation of the Highland Complex in Sri Lanka. Our study provides new insights on crustal growth and terrane assembly in the ancient continental blocks of Gondwana.

  6. Prediction of Scylla olivacea (Crustacea; Brachyura) peptide hormones using publicly accessible transcriptome shotgun assembly (TSA) sequences.

    Science.gov (United States)

    Christie, Andrew E

    2016-05-01

    The aquaculture of crabs from the genus Scylla is of increasing economic importance for many Southeast Asian countries. Expansion of Scylla farming has led to increased efforts to understand the physiology and behavior of these crabs, and as such, there are growing molecular resources for them. Here, publicly accessible Scylla olivacea transcriptomic data were mined for putative peptide-encoding transcripts; the proteins deduced from the identified sequences were then used to predict the structures of mature peptide hormones. Forty-nine pre/preprohormone-encoding transcripts were identified, allowing for the prediction of 187 distinct mature peptides. The identified peptides included isoforms of adipokinetic hormone-corazonin-like peptide, allatostatin A, allatostatin B, allatostatin C, bursicon β, CCHamide, corazonin, crustacean cardioactive peptide, crustacean hyperglycemic hormone/molt-inhibiting hormone, diuretic hormone 31, eclosion hormone, FMRFamide-like peptide, HIGSLYRamide, insulin-like peptide, intocin, leucokinin, myosuppressin, neuroparsin, neuropeptide F, orcokinin, pigment dispersing hormone, pyrokinin, red pigment concentrating hormone, RYamide, short neuropeptide F, SIFamide and tachykinin-related peptide, all well-known neuropeptide families. Surprisingly, the tissue used to generate the transcriptome mined here is reported to be testis. Whether or not the testis samples had neural contamination is unknown. However, if the peptides are truly produced by this reproductive organ, it could have far reaching consequences for the study of crustacean endocrinology, particularly in the area of reproductive control. Regardless, this peptidome is the largest thus far predicted for any brachyuran (true crab) species, and will serve as a foundation for future studies of peptidergic control in members of the commercially important genus Scylla.

  7. Construction of an SNP-based high-density linkage map for flax (Linum usitatissimum L.) using specific length amplified fragment sequencing (SLAF-seq) technology.

    Science.gov (United States)

    Yi, Liuxi; Gao, Fengyun; Siqin, Bateer; Zhou, Yu; Li, Qiang; Zhao, Xiaoqing; Jia, Xiaoyun; Zhang, Hui

    2017-01-01

    Flax is an important crop for oil and fiber; however, no high-density genetic maps have been reported for this species. Specific length amplified fragment sequencing (SLAF-seq) is a high-resolution strategy for large scale de novo discovery and genotyping of single nucleotide polymorphisms. In this study, SLAF-seq was employed to develop SNP markers in an F2 population to construct a high-density genetic map for flax. In total, 196.29 million paired-end reads were obtained. The average sequencing depth was 25.08 in the male parent, 32.17 in the female parent, and 9.64 in each F2 progeny. In total, 389,288 polymorphic SLAFs were detected, from which 260,380 polymorphic SNPs were developed. After filtering, 4,638 SNPs were found suitable for genetic map construction. The final genetic map included 4,145 SNP markers on 15 linkage groups and was 2,632.94 cM in length, with an average distance of 0.64 cM between adjacent markers. To our knowledge, this map is the densest SNP-based genetic map for flax. The SNP markers and genetic map reported here will serve as a foundation for the fine mapping of quantitative trait loci (QTLs), map-based gene cloning and marker assisted selection (MAS) for flax.
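
    The reported average marker distance can be checked from the other figures in this abstract. The sketch below assumes two common conventions, because the abstract does not say which one the authors used: dividing map length by the number of markers, or by the number of intervals (markers minus linkage groups). The function name is hypothetical.

```python
def average_marker_spacing(map_length_cm, n_markers, n_linkage_groups=0):
    """Average cM between adjacent markers.

    With n_linkage_groups=0 this is simply length / markers; otherwise the
    divisor is the number of intervals, since each linkage group with m
    markers has m - 1 gaps between adjacent markers.
    """
    intervals = n_markers - n_linkage_groups
    if intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return map_length_cm / intervals


if __name__ == "__main__":
    # Figures taken from the abstract: 2,632.94 cM, 4,145 SNPs, 15 linkage groups.
    per_marker = average_marker_spacing(2632.94, 4145)
    per_interval = average_marker_spacing(2632.94, 4145, n_linkage_groups=15)
    print(round(per_marker, 2), round(per_interval, 2))
```

    Both conventions round to the 0.64 cM stated in the abstract, so the reported figure is consistent either way.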

  8. Identification of two invasive Cacopsylla chinensis (Hemiptera: Psyllidae) lineages based on two mitochondrial sequences and restriction fragment length polymorphism of cytochrome oxidase I amplicon.

    Science.gov (United States)

    Lee, Hsien-Chung; Yang, Man-Miao; Yeh, Wen-Bin

    2008-08-01

    The occurrence of pear decline, a disease found in some pear (Pyrus spp.) orchards of Taiwan in recent years, is accompanied by an outbreak of Cacopsylla chinensis (Yang & Li). Two major morphological forms (summer and winter forms) with a variety of intermediate body color and two phylogenetic lineages of this psyllid have been described. The work herein used sequences of mitochondrial cytochrome oxidase I (COI) and 16S rDNA regions to delineate the genetic differentiation of this color-variable insect and to elucidate their relationship. Sequence divergence and phylogenetic analysis have shown that C. chinensis individuals could be divided into two lineages with 3.3 and 2.3% divergence of COI and 16S rDNA, respectively. All specimens from China were found to belong to lineage I. Restriction fragment length polymorphism analysis of COI with restriction enzymes AcuI, AseI, BccI, and FokI on 263 specimens of six populations from Taiwan produced two digestion patterns, which are in agreement with the two lineages described above. Both patterns could be found in each population, with most individuals belonging to lineage I and 5-21% of the individuals belonging to lineage II. Because these two lineages included summer as well as winter morphological forms, the lineage differentiation is apparently not related to morphological characters of this psyllid. Because the invasive records are not in favor of a sympatric differentiation, this psyllid is more likely introduced as different populations from countries in temperate regions.
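
    The principle behind distinguishing lineages by restriction fragment length polymorphism can be sketched in a few lines. This is an illustration only: the sequences are invented toy strings, and the enzyme shown is EcoRI (recognition site GAATTC, cutting after the G), which is not one of the enzymes used in the study. The function name is hypothetical.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting `seq` at every occurrence of `site`.

    `cut_offset` is the position within the recognition site where the
    top strand is cut (1 for EcoRI, G^AATTC).
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Two toy amplicons that differ by a single base inside the site.
    lineage_a = "ACGT" * 5 + "GAATTC" + "ACGT" * 5
    lineage_b = "ACGT" * 5 + "GAGTTC" + "ACGT" * 5
    print(sorted(digest(lineage_a, "GAATTC", 1), reverse=True))
    print(sorted(digest(lineage_b, "GAATTC", 1), reverse=True))
```

    The first amplicon is cut into two bands and the second stays intact, which is the kind of digestion-pattern difference that separates the two lineages on a gel.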

  9. Genome-Wide Single-Nucleotide Polymorphisms Discovery and High-Density Genetic Map Construction in Cauliflower Using Specific-Locus Amplified Fragment Sequencing

    Science.gov (United States)

    Zhao, Zhenqing; Gu, Honghui; Sheng, Xiaoguang; Yu, Huifang; Wang, Jiansheng; Huang, Long; Wang, Dan

    2016-01-01

    Molecular markers and genetic maps play an important role in plant genomics and breeding studies. Cauliflower is an important and distinctive vegetable; however, very few molecular resources have been reported for this species. In this study, a novel, specific-locus amplified fragment (SLAF) sequencing strategy was employed for large-scale single nucleotide polymorphism (SNP) discovery and high-density genetic map construction in a double-haploid, segregating population of cauliflower. A total of 12.47 Gb raw data containing 77.92 M pair-end reads were obtained after processing and 6815 polymorphic SLAFs between the two parents were detected. The average sequencing depths reached 52.66-fold for the female parent and 49.35-fold for the male parent. Subsequently, these polymorphic SLAFs were used to genotype the population and further filtered based on several criteria to construct a genetic linkage map of cauliflower. Finally, 1776 high-quality SLAF markers, including 2741 SNPs, constituted the linkage map with average data integrity of 95.68%. The final map spanned a total genetic length of 890.01 cM with an average marker interval of 0.50 cM, and covered 364.9 Mb of the reference genome. The markers and genetic map developed in this study could provide an important foundation not only for comparative genomics studies within Brassica oleracea species but also for quantitative trait loci identification and molecular breeding of cauliflower. PMID:27047515

  10. Optimization of de novo transcriptome assembly from high-throughput short read sequencing data improves functional annotation for non-model organisms

    Directory of Open Access Journals (Sweden)

    Haznedaroglu Berat Z

    2012-07-01

    Full Text Available Abstract Background: The k-mer hash length is a key factor affecting the output of de novo transcriptome assembly packages using de Bruijn graph algorithms. Assemblies constructed with varying single k-mer choices might result in the loss of unique contiguous sequences (contigs) and relevant biological information. A common solution to this problem is the clustering of single k-mer assemblies. Even though annotation is one of the primary goals of a transcriptome assembly, the success of assembly strategies does not consider the impact of k-mer selection on the annotation output. This study provides an in-depth k-mer selection analysis that is focused on the degree of functional annotation achieved for a non-model organism where no reference genome information is available. Individual k-mers and clustered assemblies (CA) were considered using three representative software packages. Pair-wise comparison analyses (between individual k-mers and CAs) were produced to reveal missing Kyoto Encyclopedia of Genes and Genomes (KEGG) ortholog identifiers (KOIs), and to determine a strategy that maximizes the recovery of biological information in a de novo transcriptome assembly. Results: Analyses of single k-mer assemblies resulted in the generation of various quantities of contigs and functional annotations within the selection window of k-mers (k-19 to k-63). For each k-mer in this window, generated assemblies contained certain unique contigs and KOIs that were not present in the other k-mer assemblies. Producing a non-redundant CA of k-mers 19 to 63 resulted in a more complete functional annotation than any single k-mer assembly. However, a fraction of unique annotations remained (~0.19 to 0.27% of total KOIs) in the assemblies of individual k-mers (k-19 to k-63) that were not present in the non-redundant CA. A workflow to recover these unique annotations is presented.
Conclusions This study demonstrated that different k-mer choices result in various quantities
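To make the role of k concrete, the following minimal sketch shows how the set of k-mers extracted from one read changes with k. It is not the workflow of the study: the read is a made-up toy sequence and the helper name kmers is hypothetical.

```python
from collections import Counter

def kmers(sequence, k):
    """Return every overlapping substring of length k, in read order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

read = "ACGTACGTGA"  # toy read, not real data

for k in (3, 5, 7):
    counts = Counter(kmers(read, k))
    # A read of length L yields L - k + 1 k-mers; repeats collapse
    # into a single de Bruijn graph node at small k.
    print(k, len(counts), "distinct of", len(read) - k + 1)
```

Small k merges repeated substrings into shared nodes, while large k keeps them apart but yields fewer k-mers per read, which is one reason assemblies built at different k values recover different contigs.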

  11. Characterization of primary biogenic aerosol particles in urban, rural, and high-alpine air by DNA sequence and restriction fragment analysis of ribosomal RNA genes

    Directory of Open Access Journals (Sweden)

    V. R. Després

    2007-12-01

    Full Text Available This study explores the applicability of DNA analyses for the characterization of primary biogenic aerosol (PBA) particles in the atmosphere. Samples of fine particulate matter (PM2.5) and total suspended particulates (TSP) have been collected on different types of filter materials at urban, rural, and high-alpine locations along an altitude transect in the south of Germany (Munich, Hohenpeissenberg, Mt. Zugspitze).

    From filter segments loaded with about one milligram of air particulate matter, DNA could be extracted and DNA sequences could be determined for bacteria, fungi, plants and animals. Sequence analyses were used to determine the identity of biological organisms, and terminal restriction fragment length polymorphism analyses (T-RFLP) were applied to estimate diversities and relative abundances of bacteria. Investigations of blank and background samples showed that filter materials have to be decontaminated prior to use, and that the sampling and handling procedures have to be carefully controlled to avoid artifacts in the analyses.

    Mass fractions of DNA in PM2.5 were found to be around 0.05% in urban, rural, and high-alpine aerosols. The average concentration of DNA determined for urban air was on the order of ~7 ng m^-3, indicating that human adults may inhale about one microgram of DNA per day (corresponding to ~10^8 haploid bacterial genomes or ~10^5 haploid human genomes, respectively).

    Most of the bacterial sequences found in PM2.5 were from Proteobacteria (42) and some from Actinobacteria (10) and Firmicutes (1). The fungal sequences were characteristic for Ascomycota (3) and Basidiomycota (1), which are known to actively discharge spores into the atmosphere. The plant sequences could be attributed to green plants (2) and moss spores (2), while animal DNA was found only for one unicellular eukaryote (protist).
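The genome-equivalent figures in this abstract (on the order of 10^8 bacterial and 10^5 human haploid genomes per microgram of DNA) can be reproduced as an order-of-magnitude estimate. The sketch below rests on assumptions that are not in the abstract: an average mass of 650 g/mol per base pair, a 5 Mb bacterial genome and a 3.1 Gb haploid human genome. The function names are hypothetical.

```python
AVOGADRO = 6.022e23      # molecules per mole
BP_MOLAR_MASS = 650.0    # g/mol per base pair (assumed average)

def genome_mass_g(genome_size_bp):
    """Mass in grams of one double-stranded genome copy."""
    return genome_size_bp * BP_MOLAR_MASS / AVOGADRO

def genome_equivalents(dna_mass_g, genome_size_bp):
    """Number of genome copies contained in a given DNA mass."""
    return dna_mass_g / genome_mass_g(genome_size_bp)

inhaled_dna_g = 1e-6  # about one microgram per day, from the abstract

bacterial = genome_equivalents(inhaled_dna_g, 5e6)    # assumed 5 Mb genome
human = genome_equivalents(inhaled_dna_g, 3.1e9)      # assumed 3.1 Gb genome

print(f"bacterial: {bacterial:.1e}, human: {human:.1e}")
```

Changing the assumed genome sizes within realistic ranges shifts the counts by a small factor but leaves the orders of magnitude unchanged.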

  12. A sweetpotato gene index established by de novo assembly of pyrosequencing and Sanger sequences and mining for gene-based microsatellite markers

    Directory of Open Access Journals (Sweden)

    Solis Julio

    2010-10-01

    Full Text Available Abstract Background: Sweetpotato (Ipomoea batatas (L.) Lam.), a hexaploid outcrossing crop, is an important staple and food security crop in developing countries in Africa and Asia. The availability of genomic resources for sweetpotato is in striking contrast to its importance for human nutrition. Previously existing sequence data were restricted to around 22,000 expressed sequence tag (EST) sequences and ~ 1,500 GenBank sequences. We have used 454 pyrosequencing to augment the available gene sequence information to enhance functional genomics and marker design for this plant species. Results: Two quarter 454 pyrosequencing runs used two normalized cDNA collections from stems and leaves from drought-stressed sweetpotato clone Tanzania and yielded 524,209 reads, which were assembled together with 22,094 publicly available expressed sequence tags into 31,685 sets of overlapping DNA segments and 34,733 unassembled sequences. Blastx comparisons with the UniRef100 database allowed annotation of 23,957 contigs and 15,342 singletons resulting in 24,657 putatively unique genes. Further, 27,119 sequences had no match to protein sequences of the UniRef100 database. On the basis of this gene index, we have identified 1,661 gene-based microsatellite sequences, of which 223 were selected for testing and 195 were successfully amplified in a test panel of 6 hexaploid (I. batatas) and 2 diploid (I. trifida) accessions. Conclusions: The sweetpotato gene index is a useful source for functionally annotated sweetpotato gene sequences that contains three times more gene sequence information for sweetpotato than previous EST assemblies. A searchable version of the gene index, including a blastn function, is available at http://www.cipotato.org/sweetpotato_gene_index.

  13. Detection and Resolution of Cryptosporidium Species and Species Mixtures by Genus-Specific Nested PCR-Restriction Fragment Length Polymorphism Analysis, Direct Sequencing, and Cloning

    Science.gov (United States)

    Ruecker, Norma J.; Hoffman, Rebecca M.; Chalmers, Rachel M.; Neumann, Norman F.

    2011-01-01

    Molecular methods incorporating nested PCR-restriction fragment length polymorphism (RFLP) analysis of the 18S rRNA gene of Cryptosporidium species were validated to assess performance based on limit of detection (LoD) and for detecting and resolving mixtures of species and genotypes within a single sample. The 95% LoD was determined for seven species (Cryptosporidium hominis, C. parvum, C. felis, C. meleagridis, C. ubiquitum, C. muris, and C. andersoni) and ranged from 7 to 11 plasmid template copies with overlapping 95% confidence limits. The LoD values for genomic DNA from oocysts on microscope slides were 7 and 10 template copies for C. andersoni and C. parvum, respectively. The repetitive nested PCR-RFLP slide protocol had an LoD of 4 oocysts per slide. When templates of two species were mixed in equal ratios in the nested PCR-RFLP reaction mixture, there was no amplification bias toward one species over another. At high ratios of template mixtures (>1:10), there was a reduction or loss of detection of the less abundant species by RFLP analysis, most likely due to heteroduplex formation in the later cycles of the PCR. Replicate nested PCR was successful at resolving many mixtures of Cryptosporidium at template concentrations near or below the LoD. The cloning of nested PCR products resulted in 17% of the cloned sequences being recombinants of the two original templates. Limiting-dilution nested PCR followed by the sequencing of PCR products resulted in no sequence anomalies, suggesting that this method is an effective and accurate way to study the species diversity of Cryptosporidium, particularly for environmental water samples, in which mixtures of parasites are common. PMID:21498746

  14. Bacteriophage Assembly

    Directory of Open Access Journals (Sweden)

    Anastasia A. Aksyuk

    2011-02-01

    Full Text Available Bacteriophages have been a model system to study assembly processes for over half a century. Formation of infectious phage particles involves specific protein-protein and protein-nucleic acid interactions, as well as large conformational changes of assembly precursors. The sequence and molecular mechanisms of phage assembly have been elucidated by a variety of methods. Differences and similarities of assembly processes in several different groups of bacteriophages are discussed in this review. The general principles of phage assembly are applicable to many macromolecular complexes.

  15. De novo sequencing, assembly, and analysis of Iris lactea var. chinensis roots' transcriptome in response to salt stress.

    Science.gov (United States)

    Gu, Chunsun; Xu, Sheng; Wang, Zhiquan; Liu, Liangqin; Zhang, Yongxia; Deng, Yanming; Huang, Suzhen

    2018-04-01

    As a halophyte, Iris lactea var. chinensis (I. lactea var. chinensis) is widely distributed and has good drought and heavy metal resistance. Moreover, it is an excellent ornamental plant. I. lactea var. chinensis has extensive application prospects owing to the global impacts of salinization. To better understand its molecular mechanism involved in salt resistance, the de novo sequencing, assembly, and analysis of I. lactea var. chinensis roots' transcriptome in response to salt-stress conditions was performed. On average, 74.17% of the clean reads were mapped to unigenes. A total of 121,093 unigenes were constructed and 56,398 (46.57%) were annotated. Among these, 13,522 differentially expressed genes (DEGs) were identified between salt-treated and control samples. Compared to the transcriptional level of control, 7037 DEGs were up-regulated and 6539 down-regulated. In addition, 129 up-regulated and 1609 down-regulated genes were simultaneously detected in all three pairwise comparisons between control and salt-stressed libraries. At least 247 and 250 DEGs encoding transcription factors and transporter proteins were identified. Meanwhile, 130 DEGs regarding reactive oxygen species (ROS) scavenging system were also summarized. Based on real-time quantitative RT-PCR, we verified the changes in the expression patterns of 10 unigenes. Our study identified potential salt-responsive candidate genes and increased the understanding of halophyte responses to salinity stress. Copyright © 2018 Elsevier Masson SAS. All rights reserved.

  16. Mining candidate genes associated with powdery mildew resistance in cucumber via super-BSA by specific length amplified fragment (SLAF) sequencing.

    Science.gov (United States)

    Zhang, Peng; Zhu, Yuqiang; Wang, Lili; Chen, Liping; Zhou, Shengjun

    2015-12-14

    Powdery mildew (PM) is the most common fungal disease of cucumber and other cucurbit crops. Breeding PM-resistant materials is an effective way to defend against this disease, and recent developments in genetics and genomics have made clear that studying resistance genes is essential for breeding highly PM-resistant plants. With the ever-increasing throughput of next-generation sequencing (NGS), specific length amplified fragment sequencing (SLAF-seq), a high-resolution strategy for large-scale de novo SNP discovery, is gradually being applied to functional gene mining. Here we combined bulked segregant analysis (BSA) with SLAF-seq to identify candidate genes associated with PM resistance in cucumber. A segregating population comprising 251 F2 individuals was developed using H136 (female parent) as the susceptible parent and BK2 (male parent) as the resistance donor. After the PMR test, total genomic DNA was prepared from each plant. Systemic genomic analysis of the GC content, repeat sequences, etc. was carried out with the prediction software SLAF_Predict to establish conditions that ensure the uniformity and density of the molecular markers. After samples were gel purified, SLAFs were generated at Biomarker Technologies Corporation in Beijing. Based on the SLAF tags and the PMR test results, the hot regions were annotated. A total of 73,100 high-quality SLAF tags with an average depth of 99.11× were sequenced. Among these, 5,355 polymorphic tags were identified with a polymorphism rate of 7.34 %, including 7.09 % SNPs and other polymorphism types. Finally, 140 associated SLAFs were identified, and two main hot regions were detected on chromosomes 1 and 6, which contained five genes involved in defense response, toxin metabolism, cell stress response, and injury response in cucumber. Associated markers identified by super-BSA in this study could not only speed up the study of the PMR genes, but also provide a feasible solution for breeding the

  17. Comparative analysis of human cytomegalovirus a-sequence in multiple clinical isolates by using polymerase chain reaction and restriction fragment length polymorphism assays.

    Science.gov (United States)

    Zaia, J A; Gallez-Hawkins, G; Churchill, M A; Morton-Blackshere, A; Pande, H; Adler, S P; Schmidt, G M; Forman, S J

    1990-01-01

    The human cytomegalovirus (HCMV) a-sequence (a-seq) is located in the joining region between the long (L) and short (S) unique sequences of the virus (L-S junction), and this hypervariable junction has been used to differentiate HCMV strains. The purpose of this study was to investigate whether there are differences among strains of human cytomegalovirus which could be characterized by polymerase chain reaction (PCR) amplification of the a-seq of HCMV DNA and to compare a PCR method of strain differentiation with conventional restriction fragment length polymorphism (RFLP) methodology by using HCMV junction probes. Laboratory strains of HCMV and viral isolates from individuals with HCMV infection were characterized by using both RFLPs and PCR. The PCR assay amplified regions in the major immediate-early gene (IE-1), the 64/65-kDa matrix phosphoprotein (pp65), and the a-seq of the L-S junction region. HCMV laboratory strains Towne, AD169, and Davis were distinguishable, in terms of size of the amplified product, when analyzed by PCR with primers specific for the a-seq but were indistinguishable by using PCR targeted to IE-1 and pp65 sequences. When this technique was applied to a characterization of isolates from individuals with HCMV infection, selected isolates could be readily distinguished. In addition, when the a-seq PCR product was analyzed with restriction enzyme digestion for the presence of specific sequences, these DNA differences were confirmed. PCR analysis across the variable a-seq of HCMV demonstrated differences among strains which were confirmed by RFLP in 38 of 40 isolates analyzed. The most informative restriction enzyme sites in the a-seq for distinguishing HCMV isolates were those of MnlI and BssHII. This indicates that the a-seq of HCMV is heterogeneous among wild strains, and PCR of the a-seq of HCMV is a practical way to characterize differences in strains of HCMV. PMID:1980680

  18. Genetic diversity of nifH gene sequences in Paenibacillus azotofixans strains and soil samples analyzed by denaturing gradient gel electrophoresis of PCR-amplified gene fragments

    NARCIS (Netherlands)

    Rosado, A.S.; Duarte, G.F.; Seldin, L.; Elsas, van J.D.

    1998-01-01

    The diversity of dinitrogenase reductase gene (nifH) fragments in Paenibacillus azotofixans strains was investigated by using molecular methods. The partial nifH gene sequences of eight P. azotofixans strains, as well as one strain each of the close relatives Paenibacillus durum, Paenibacillus

  19. Critical Features of Fragment Libraries for Protein Structure Prediction.

    Science.gov (United States)

    Trevizani, Raphael; Custódio, Fábio Lima; Dos Santos, Karina Baptista; Dardenne, Laurent Emmanuel

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. Particularly, we analyze the effect of using secondary structure prediction guiding fragments selection, different fragments sizes and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than the longer. Libraries composed of multiple fragment lengths generate even better structures, where longer fragments show to be more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction.

  20. Mapping of a Novel Race Specific Resistance Gene to Phytophthora Root Rot of Pepper (Capsicum annuum) Using Bulked Segregant Analysis Combined with Specific Length Amplified Fragment Sequencing Strategy.

    Science.gov (United States)

    Xu, Xiaomei; Chao, Juan; Cheng, Xueli; Wang, Rui; Sun, Baojuan; Wang, Hengming; Luo, Shaobo; Xu, Xiaowan; Wu, Tingquan; Li, Ying

    2016-01-01

    Phytophthora root rot caused by Phytophthora capsici (P. capsici) is a serious limitation to pepper production in Southern China, with high temperature and humidity. Mapping PRR resistance genes can provide linked DNA markers for breeding PRR resistant varieties by molecular marker-assisted selection (MAS). Two BC1 populations and an F2 population derived from a cross between P. capsici-resistant accession, Criollo de Morelos 334 (CM334) and P. capsici-susceptible accession, New Mexico Capsicum Accession 10399 (NMCA10399) were used to investigate the genetic characteristics of PRR resistance. PRR resistance to isolate Byl4 (race 3) was controlled by a single dominant gene, PhR10, that was mapped to an interval of 16.39Mb at the end of the long arm of chromosome 10. Integration of bulked segregant analysis (BSA) and Specific Length Amplified Fragment sequencing (SLAF-seq) provided an efficient genetic mapping strategy. Ten polymorphic Simple Sequence Repeat (SSR) markers were found within this region and used to screen the genotypes of 636 BC1 plants, delimiting PhR10 to a 2.57 Mb interval between markers P52-11-21 (1.5 cM away) and P52-11-41 (1.1 cM). A total of 163 genes were annotated within this region and 31 were predicted to be associated with disease resistance. PhR10 is a novel race specific gene for PRR, and this paper describes linked SSR markers suitable for marker-assisted selection of PRR resistant varieties, also laying a foundation for cloning the resistance gene.

  1. Mapping of a Novel Race Specific Resistance Gene to Phytophthora Root Rot of Pepper (Capsicum annuum) Using Bulked Segregant Analysis Combined with Specific Length Amplified Fragment Sequencing Strategy.

    Directory of Open Access Journals (Sweden)

    Xiaomei Xu

    Full Text Available Phytophthora root rot caused by Phytophthora capsici (P. capsici) is a serious limitation to pepper production in Southern China, with high temperature and humidity. Mapping PRR resistance genes can provide linked DNA markers for breeding PRR resistant varieties by molecular marker-assisted selection (MAS). Two BC1 populations and an F2 population derived from a cross between P. capsici-resistant accession, Criollo de Morelos 334 (CM334) and P. capsici-susceptible accession, New Mexico Capsicum Accession 10399 (NMCA10399) were used to investigate the genetic characteristics of PRR resistance. PRR resistance to isolate Byl4 (race 3) was controlled by a single dominant gene, PhR10, that was mapped to an interval of 16.39 Mb at the end of the long arm of chromosome 10. Integration of bulked segregant analysis (BSA) and Specific Length Amplified Fragment sequencing (SLAF-seq) provided an efficient genetic mapping strategy. Ten polymorphic Simple Sequence Repeat (SSR) markers were found within this region and used to screen the genotypes of 636 BC1 plants, delimiting PhR10 to a 2.57 Mb interval between markers P52-11-21 (1.5 cM away) and P52-11-41 (1.1 cM). A total of 163 genes were annotated within this region and 31 were predicted to be associated with disease resistance. PhR10 is a novel race specific gene for PRR, and this paper describes linked SSR markers suitable for marker-assisted selection of PRR resistant varieties, also laying a foundation for cloning the resistance gene.

  2. Single-molecule sequencing and Hi-C-based proximity-guided assembly of amaranth (Amaranthus hypochondriacus) chromosomes provide insights into genome evolution

    KAUST Repository

    Lightfoot, D. J.; Jarvis, David Erwin; Ramaraj, T.; Lee, R.; Jellen, E. N.; Maughan, P. J.

    2017-01-01

    Background: Amaranth (Amaranthus hypochondriacus) was a food staple among the ancient civilizations of Central and South America that has recently received increased attention due to the high nutritional value of the seeds, with the potential to help alleviate malnutrition and food security concerns, particularly in arid and semiarid regions of the developing world. Here, we present a reference-quality assembly of the amaranth genome which will assist the agronomic development of the species. Results: Utilizing single-molecule, real-time sequencing (Pacific Biosciences) and chromatin interaction mapping (Hi-C) to close assembly gaps and scaffold contigs, respectively, we improved our previously reported Illumina-based assembly to produce a chromosome-scale assembly with a scaffold N50 of 24.4 Mb. The 16 largest scaffolds contain 98% of the assembly and likely represent the haploid chromosomes (n = 16). To demonstrate the accuracy and utility of this approach, we produced physical and genetic maps and identified candidate genes for the betalain pigmentation pathway. The chromosome-scale assembly facilitated a genome-wide syntenic comparison of amaranth with other Amaranthaceae species, revealing chromosome loss and fusion events in amaranth that explain the reduction from the ancestral haploid chromosome number (n = 18) for a tetraploid member of the Amaranthaceae. ... as major evolutionary events in the 2n = 32 amaranths and clearly establish the homoeologous relationship among most of the subgenome chromosomes, which will facilitate future investigations of intragenomic changes that occurred post polyploidization.
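The scaffold N50 quoted in this abstract is a length-weighted median: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that standard definition follows; the scaffold lengths are invented toy values, not amaranth data, and the function name is hypothetical.

```python
def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if running * 2 >= total:
            return length
    raise ValueError("n50 is undefined for an empty list of lengths")

toy_scaffolds = [80, 70, 50, 40, 30, 20]  # arbitrary units, toy values
print(n50(toy_scaffolds))
```

Because N50 depends only on the length distribution, it measures contiguity and says nothing about correctness, which is why the abstract also reports physical and genetic maps as independent checks.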

  3. Label-free and reagentless electrochemical detection of PCR fragments using self-assembled quinone derivative monolayer: Application to Mycobacterium tuberculosis

    DEFF Research Database (Denmark)

    Zhang, Q D; March, G; Noel, V

    2012-01-01

    We report a signal-on, label-free and reagentless electrochemical DNA biosensor, based on a mixed self-assembled monolayer of thiolated hydroxynaphthoquinone and thiolated oligonucleotide. Electrochemical changes resulting from hybridization were evidenced with oligonucleotide targets (as models...

  4. Use of inter-simple sequence repeats and amplified fragment length polymorphisms to analyze genetic relationships among small grain-infecting species of Ustilago.

    Science.gov (United States)

    Menzies, J G; Bakkeren, G; Matheson, F; Procunier, J D; Woods, S

    2003-02-01

    In the smut fungi, few features are available for use as taxonomic criteria (spore size, shape, morphology, germination type, and host range). DNA-based molecular techniques are useful in expanding the traits considered in determining relationships among these fungi. We examined the phylogenetic relationships among seven species of Ustilago (U. avenae, U. bullata, U. hordei, U. kolleri, U. nigra, U. nuda, and U. tritici) using inter-simple sequence repeats (ISSRs) and amplified fragment length polymorphisms (AFLPs) to compare their DNA profiles. Fifty-four isolates of different Ustilago spp. were analyzed using ISSR primers, and 16 isolates of Ustilago were studied using AFLP primers. The variability among isolates within species was low for all species except U. bullata. The isolates of U. bullata, U. nuda, and U. tritici were well separated and our data support their speciation. U. avenae and U. kolleri isolates did not separate from each other and there was little variability between these species. U. hordei and U. nigra isolates also showed little variability between species, but the isolates from each species grouped together. Our data suggest that U. avenae and U. kolleri are monophyletic and should be considered one species, as should U. hordei and U. nigra.

  5. The 0.3-kb fragment containing the R-U5-5'leader sequence of Friend murine leukemia virus influences the level of protein expression from spliced mRNA.

    Science.gov (United States)

    Choo, Yeng Cheng; Seki, Yohei; Machinaga, Akihito; Ogita, Nobuo; Takase-Yoden, Sayaka

    2013-04-19

    A neuropathogenic variant of Friend murine leukemia virus (Fr-MLV) clone A8 induces spongiform neurodegeneration when infected into neonatal rats. Studies with chimeras constructed from the A8 virus and the non-neuropathogenic Fr-MLV clone 57 identified a 0.3-kb KpnI-AatII fragment containing a R-U5-5'leader sequence as an important determinant for inducing spongiosis, in addition to the env gene of A8 as the primary determinant. This 0.3-kb fragment contains a 17-nucleotide difference between the A8 and 57 sequences. We previously showed that the 0.3-kb fragment influences expression levels of Env protein in both cultured cells and rat brain, but the corresponding molecular mechanisms are not well understood. Studies with expression vectors constructed from the full-length proviral genome of Fr-MLV that incorporated the luciferase (luc) gene instead of the env gene found that the vector containing the A8-0.3-kb fragment yielded a larger amount of spliced luc-mRNA and showed higher expression of luciferase when compared to the vector containing the 57-0.3-kb fragment. The amount of total transcripts from the vectors, the poly (A) tail length of their mRNAs, and the nuclear-cytoplasm distribution of luc-mRNA in transfected cells were also evaluated. The 0.3-kb fragment did not influence transcription efficiency, mRNA polyadenylation or nuclear export of luc-mRNA. Mutational analyses were carried out to determine the importance of nucleotides that differ between the A8 and 57 sequences within the 0.3-kb fragment. In particular, seven nucleotides upstream of the 5'splice site (5'ss) were found to be important in regulating the level of protein expression from spliced messages. Interestingly, these nucleotides reside within the stem-loop structure that has been speculated to limit the recognition of 5'ss. The 0.3-kb fragment containing the R-U5-5'leader sequence of Fr-MLV influences the level of protein expression from the spliced-mRNA by regulating the splicing

  6. Detection of [O III] at z ∼ 3: A Galaxy Above the Main Sequence, Rapidly Assembling Its Stellar Mass

    Science.gov (United States)

    Vishwas, Amit; Ferkinhoff, Carl; Nikola, Thomas; Parshley, Stephen C.; Schoenwald, Justin P.; Stacey, Gordon J.; Higdon, Sarah J. U.; Higdon, James L.; Weiss, Axel; Güsten, Rolf; Menten, Karl M.

    2018-04-01

    We detect bright emission in the far-infrared (far-IR) fine structure [O III] 88 μm line from a strong lensing candidate galaxy, H-ATLAS J113526.3-014605, hereafter G12v2.43, at z = 3.127, using the second-generation Redshift (z) and Early Universe Spectrometer (ZEUS-2) at the Atacama Pathfinder Experiment Telescope (APEX). This is only the fifth detection of this far-IR line from a submillimeter galaxy at the epoch of galaxy assembly. The observed [O III] luminosity of 7.1 × 10^9 (10/μ) L⊙ likely arises from H II regions around massive stars, and the amount of Lyman continuum photons required to support the ionization indicates the presence of (1.2–5.2) × 10^6 (10/μ) equivalent O5.5 or higher stars, where μ would be the lensing magnification factor. The observed line luminosity also requires a minimum mass of ∼2 × 10^8 (10/μ) M⊙ in ionized gas, that is 0.33% of the estimated total molecular gas mass of 6 × 10^10 (10/μ) M⊙. We compile multi-band photometry tracing rest-frame ultraviolet to millimeter continuum emission to further constrain the properties of this dusty high-redshift, star-forming galaxy. Via SED modeling we find G12v2.43 is forming stars at a rate of 916 (10/μ) M⊙ yr^-1 and already has a stellar mass of 8 × 10^10 (10/μ) M⊙. We also constrain the age of the current starburst to be ≤ 5 Myr, making G12v2.43 a gas-rich galaxy lying above the star-forming main sequence at z ∼ 3, undergoing a growth spurt, and it could be on the main sequence within the derived gas depletion timescale of ∼66 Myr.
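Two of the derived quantities in this abstract follow directly from the quoted masses and star formation rate. The sketch below assumes the usual definition of the gas depletion timescale as molecular gas mass divided by star formation rate; every quantity carries the same (10/μ) lensing factor, so it cancels in both ratios. Variable names are illustrative.

```python
# Values quoted in the abstract, each in units of (10/mu).
molecular_gas_mass_msun = 6e10   # total molecular gas mass, M_sun
ionized_gas_mass_msun = 2e8      # minimum ionized gas mass, M_sun
sfr_msun_per_yr = 916.0          # star formation rate, M_sun / yr

# Assumed definition: t_dep = M_gas / SFR (the lensing factor cancels).
depletion_time_myr = molecular_gas_mass_msun / sfr_msun_per_yr / 1e6

# Ionized gas as a percentage of the molecular gas mass.
ionized_percent = 100.0 * ionized_gas_mass_msun / molecular_gas_mass_msun

print(f"t_dep = {depletion_time_myr:.1f} Myr, ionized = {ionized_percent:.2f}%")
```

Both results agree with the rounded values given in the abstract (about 66 Myr and 0.33%), and neither depends on the unknown magnification μ.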

  7. Prediction of the neuropeptidomes of members of the Astacidea (Crustacea, Decapoda) using publicly accessible transcriptome shotgun assembly (TSA) sequence data.

    Science.gov (United States)

    Christie, Andrew E; Chi, Megan

    2015-12-01

    The decapod infraorder Astacidea is comprised of clawed lobsters and freshwater crayfish. Due to their economic importance and their use as models for investigating neurochemical signaling, much work has focused on elucidating their neurochemistry, particularly their peptidergic systems. Interestingly, no astacidean has been the subject of large-scale peptidomic analysis via in silico transcriptome mining, this despite growing transcriptomic resources for members of this taxon. Here, the publicly accessible astacidean transcriptome shotgun assembly data were mined for putative peptide-encoding transcripts; these sequences were used to predict the structures of mature neuropeptides. One hundred seventy-six distinct peptides were predicted for Procambarus clarkii, including isoforms of adipokinetic hormone-corazonin-like peptide (ACP), allatostatin A (AST-A), allatostatin B, allatostatin C (AST-C), bursicon α, bursicon β, CCHamide, crustacean hyperglycemic hormone (CHH)/ion transport peptide (ITP), diuretic hormone 31 (DH31), eclosion hormone (EH), FMRFamide-like peptide, GSEFLamide, intocin, leucokinin, neuroparsin, neuropeptide F, pigment dispersing hormone, pyrokinin, RYamide, short neuropeptide F (sNPF), SIFamide, sulfakinin and tachykinin-related peptide (TRP). Forty-six distinct peptides, including isoforms of AST-A, AST-C, bursicon α, CCHamide, CHH/ITP, DH31, EH, intocin, myosuppressin, neuroparsin, red pigment concentrating hormone, sNPF and TRP, were predicted for Pontastacus leptodactylus, with a bursicon β and a neuroparsin predicted for Cherax quadricarinatus. The identification of ACP is the first from a decapod, while the predictions of CCHamide, EH, GSEFLamide, intocin, neuroparsin and RYamide are firsts for the Astacidea. Collectively, these data greatly expand the catalog of known astacidean neuropeptides and provide a foundation for functional studies of peptidergic signaling in members of this decapod infraorder. Copyright © 2015 Elsevier Inc

  8. Next generation sequencing (NGS) technologies and applications

    Energy Technology Data Exchange (ETDEWEB)

    Vuyisich, Momchilo [Los Alamos National Laboratory]

    2012-09-11

    NGS technology overview: (1) NGS library preparation - Nucleic acids extraction, Sample quality control, RNA conversion to cDNA, Addition of sequencing adapters, Quality control of library; (2) Sequencing - Clonal amplification of library fragments (except PacBio), Sequencing by synthesis, Data output (reads and quality); and (3) Data analysis - Read mapping, Genome assembly, Gene expression, Operon structure, sRNA discovery, and Epigenetic analyses.

  9. Jet fragmentation

    International Nuclear Information System (INIS)

    Saxon, D.H.

    1985-10-01

    The paper reviews studies on jet fragmentation. The subject is discussed under the topic headings: fragmentation models, charged particle multiplicity, Bose-Einstein correlations, identified hadrons in jets, heavy quark fragmentation, baryon production, gluon and quark jets compared, the string effect, and two successful models. (U.K.)

  10. Fragment capture device

    Science.gov (United States)

    Payne, Lloyd R.; Cole, David L.

    2010-03-30

    A fragment capture device for use in explosive containment. The device comprises an assembly of at least two rows of bars positioned to eliminate line-of-sight trajectories between the generation point of fragments and a surrounding containment vessel or asset. The device comprises an array of at least two rows of bars, wherein each row is staggered with respect to the adjacent row, and wherein a lateral dimension of each bar and a relative position of each bar in combination provides blockage of a straight-line passage of a solid fragment through the adjacent rows of bars, wherein a generation point of the solid fragment is located within a cavity at least partially enclosed by the array of bars.

  11. Application of diazene-directed fragment assembly to the total synthesis and stereochemical assignment of (+)-desmethyl-meso-chimonanthine and related heterodimeric alkaloids

    OpenAIRE

    Lathrop, Stephen; Movassaghi, Mohammad

    2013-01-01

    We describe the first application of our methodology for heterodimerization via diazene fragmentation towards the total synthesis of (−)-calycanthidine, meso-chimonanthine, and (+)-desmethyl-meso-chimonanthine. Our syntheses of these alkaloids feature an improved route to C3a-aminocyclotryptamines, an enhanced method for sulfamide synthesis and oxidation, in addition to a late-stage diversification leading to the first enantioselective total synthesis of (+)-desmethyl-meso-chimonanthine and i...

  12. Effect of amino acid sequence and pH on nanofiber formation of self-assembling peptides EAK16-II and EAK16-IV.

    Science.gov (United States)

    Hong, Yooseong; Legge, Raymond L; Zhang, S; Chen, P

    2003-01-01

    Atomic force microscopy (AFM) and axisymmetric drop shape analysis-profile (ADSA-P) were used to investigate the mechanism of self-assembly of peptides. The peptides chosen consisted of 16 alternating hydrophobic and hydrophilic amino acids, where the hydrophilic residues possess alternating negative and positive charges. Two types of peptides, AEAEAKAKAEAEAKAK (EAK16-II) and AEAEAEAEAKAKAKAK (EAK16-IV), were investigated in terms of nanostructure formation through self-assembly. The experimental results, which focused on the effects of the amino acid sequence and pH, show that the nanostructures formed by the peptides are dependent on the amino acid sequence and the pH of the solution. For pH conditions around neutrality, one of the peptides used in this study, EAK16-IV, forms globular assemblies and has lower surface tension at air-water interfaces than another peptide, EAK16-II, which forms fibrillar assemblies at the same pH. When the pH is lowered below 6.5 or raised above 7.5, there is a transition from globular to fibrillar structures for EAK16-IV, but EAK16-II does not show any structural transition. Surface tension measurements using ADSA-P showed different surface activities of peptides at air-water interfaces. EAK16-II does not show a significant difference in surface tension for the pH range between 4 and 9. However, EAK16-IV shows a noticeable decrease in surface tension at pH around neutrality, indicating that the formation of globular assemblies is related to the molecular hydrophobicity.

  13. Getting complete genomes from complex samples using nanopore sequencing

    DEFF Research Database (Denmark)

    Kirkegaard, Rasmus Hansen; Karst, Søren Michael; Albertsen, Mads

    Background: Short read DNA sequencing and metagenomic binning workflows have made it possible to extract bacterial genome bins from environmental microbial samples containing hundreds to thousands of different species. However, these genome bins often do not represent complete genomes......, as they are mostly fragmented, incomplete and often contaminated with foreign DNA. These `draft genomes` have limited lasting value to the scientific community, as gene synteny is broken and there is some uncertainty about what is missing [1]. The genetic material most often missed is important multi......-copy and/or conserved marker genes such as the 16S rRNA gene, as sequence micro-heterogeneity prevents assembly of these genes in the de novo assembly. However, long read sequencing technologies are emerging, promising an end to fragmented genome assemblies [2]. Experimental design: We extracted DNA from a full...

  14. Partial nucleotide sequences, and routine typing by polymerase chain reaction-restriction fragment length polymorphism, of the brown trout (Salmo trutta) lactate dehydrogenase, LDH-C1*90 and *100 alleles.

    Science.gov (United States)

    McMeel, O M; Hoey, E M; Ferguson, A

    2001-01-01

    The cDNA nucleotide sequences of the lactate dehydrogenase alleles LDH-C1*90 and *100 of brown trout (Salmo trutta) were found to differ at position 308 where an A is present in the *100 allele but a G is present in the *90 allele. This base substitution results in an amino acid change from aspartic acid at position 82 in the LDH-C1 100 allozyme to a glycine in the 90 allozyme. Since aspartic acid has a net negative charge whilst glycine is uncharged, this is consistent with the electrophoretic observation that the LDH-C1 100 allozyme has a more anodal mobility relative to the LDH-C1 90 allozyme. Based on alignment of the cDNA sequence with the mouse genomic sequence, a local primer set was designed, incorporating the variable position, and was found to give very good amplification with brown trout genomic DNA. Sequencing of this fragment confirmed the difference in both homozygous and heterozygous individuals. Digestion of the polymerase chain reaction products with BslI, a restriction enzyme specific for the site difference, gave one, two and three fragments for the two homozygotes and the heterozygote, respectively, following electrophoretic separation. This provides a DNA-based means of routine screening of the highly informative LDH-C1* polymorphism in brown trout population genetic studies. Primer sets presented could be used to sequence cDNA of other LDH* genes of brown trout and other species.
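
    The banding logic described above (one, two and three fragments for the two homozygotes and the heterozygote) can be illustrated with a small in silico digest. This is only a sketch under stated assumptions: the recognition pattern used for BslI (CCNNNNNNNGG, cut after the seventh base) should be checked against the enzyme supplier's documentation, and the two allele sequences are short invented placeholders, not the brown trout amplicon.

```python
import re

# Assumed BslI recognition pattern: CC, seven arbitrary bases, GG.
# Lookahead lets overlapping sites be found as well.
SITE = re.compile(r"(?=CC[ACGT]{7}GG)")
CUT_OFFSET = 7  # assumed cut position, counted from the start of the site


def digest(amplicon):
    """Return the fragment lengths left after cutting at every site."""
    cuts = [m.start() + CUT_OFFSET for m in SITE.finditer(amplicon)]
    bounds = [0] + cuts + [len(amplicon)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def band_sizes(allele_a, allele_b):
    """Distinct fragment lengths expected on a gel for a diploid genotype."""
    return sorted(set(digest(allele_a)) | set(digest(allele_b)), reverse=True)


# Hypothetical alleles that differ by a single A/G substitution.
CUT_ALLELE = "TTTTTT" + "CCATATATAGG" + "TTTTTTTTTTTT"    # site present
UNCUT_ALLELE = "TTTTTT" + "CCATATATAAG" + "TTTTTTTTTTTT"  # site destroyed

for name, genotype in [
    ("homozygote, site absent", (UNCUT_ALLELE, UNCUT_ALLELE)),
    ("homozygote, site present", (CUT_ALLELE, CUT_ALLELE)),
    ("heterozygote", (CUT_ALLELE, UNCUT_ALLELE)),
]:
    print(name, band_sizes(*genotype))
```

    Fragments of equal length co-migrate on a gel, which is why the sketch reports distinct lengths rather than a raw fragment count.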

  15. Nuclear fragmentation

    International Nuclear Information System (INIS)

    Chung, K.C.

    1989-01-01

    An introduction to nuclear fragmentation, with emphasis on percolation ideas, is presented. The main theoretical models are discussed and, as an application, the uniform expansion approximation is presented and the statistical multifragmentation model is used to calculate the fragment energy spectra. (L.C.)

  16. Detection of a Usp-like gene in Calotropis procera plant from the de novo assembled genome contigs of the high-throughput sequencing dataset

    KAUST Repository

    Shokry, Ahmed M.

    2014-02-01

    The wild plant species Calotropis procera (C. procera) has many potential applications and beneficial uses in medicine, industry and the ornamental field. It also represents an excellent source of genes for drought and salt tolerance. Genes encoding proteins that contain the conserved universal stress protein (USP) domain are known to provide organisms like bacteria, archaea, fungi, protozoa and plants with the ability to respond to a plethora of environmental stresses. However, information on the possible occurrence of Usp in C. procera is not available. In this study, we uncovered and characterized a one-class A Usp-like (UspA-like, NCBI accession No. KC954274) gene in this medicinal plant from the de novo assembled genome contigs of the high-throughput sequencing dataset. A number of GenBank accessions for Usp sequences were blasted with the recovered de novo assembled contigs. Homology modelling of the deduced amino acids (NCBI accession No. AGT02387) was further carried out using Swiss-Model, accessible via the EXPASY. Superimposition of C. procera USPA-like full sequence model on Thermus thermophilus USP UniProt protein (PDB accession No. Q5SJV7) was constructed using RasMol and Deep-View programs. The functional domains of the novel USPA-like amino acids sequence were identified from the NCBI conserved domain database (CDD), which provides insights into sequence structure/function relationships, as well as domain models imported from a number of external source databases (Pfam, SMART, COG, PRK, TIGRFAM). © 2014 Académie des sciences.

  17. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    Energy Technology Data Exchange (ETDEWEB)

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric; Abernathy, Jason; Waldbieser, Geoff; Lindquist, Erika; Richardson, Paul; Lucas, Susan; Wang, Mei; Li, Ping; Thimmapuram, Jyothi; Liu, Lei; Vullaganti, Deepika; Kucuktas, Huseyin; Murdock, Christopher; Small, Brian C; Wilson, Melanie; Liu, Hong; Jiang, Yanliang; Lee, Yoona; Chen, Fei; Lu, Jianguo; Wang, Wenqi; Xu, Peng; Somridhivej, Benjaporn; Baoprasertkul, Puttharat; Quilang, Jonas; Sha, Zhenxia; Bao, Baolong; Wang, Yaping; Wang, Qun; Takano, Tomokazu; Nandi, Samiran; Liu, Shikai; Wong, Lilian; Kaltenboeck, Ludmilla; Quiniou, Sylvie; Bengten, Eva; Miller, Norman; Trant, John; Rokhsar, Daniel; Liu, Zhanjiang

    2010-03-23

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35 percent of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.
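
    The high-quality SNP criterion quoted above (at least four sequences in the contig, with the minor allele present in at least two of them) can be written as a simple filter. The sketch below is an illustration, not the consortium's pipeline: it assumes the filter is applied to one alignment column at a time, uses the column depth as a stand-in for the number of sequences in the contig, and the function name and input format are hypothetical.

```python
from collections import Counter


def is_high_quality_snp(column_bases, min_depth=4, min_minor=2):
    """Return True if one alignment column of a contig passes the filter.

    column_bases is a string holding the base that each EST contributes
    at this position; gaps and ambiguous bases are ignored.
    """
    counts = Counter(b for b in column_bases.upper() if b in "ACGT")
    # Too few sequences, or only one allele observed: not a usable SNP.
    if sum(counts.values()) < min_depth or len(counts) < 2:
        return False
    # The minor allele is taken as the second most frequent base.
    minor_count = counts.most_common(2)[1][1]
    return minor_count >= min_minor


for column in ["AAGG", "AAAG", "AGG", "AAAA"]:
    print(column, is_high_quality_snp(column))
```

    Requiring the minor allele in two or more sequences is a cheap guard against single-read sequencing errors being reported as polymorphisms.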

  18. Extensive error in the number of genes inferred from draft genome assemblies.

    Directory of Open Access Journals (Sweden)

    James F Denton

    2014-12-01

    Current sequencing methods produce large amounts of data, but genome assemblies based on these data are often woefully incomplete. These incomplete and error-filled assemblies result in many annotation errors, especially in the number of genes present in a genome. In this paper we investigate the magnitude of the problem, both in terms of total gene number and the number of copies of genes in specific families. To do this, we compare multiple draft assemblies against higher-quality versions of the same genomes, using several new assemblies of the chicken genome based on both traditional and next-generation sequencing technologies, as well as published draft assemblies of chimpanzee. We find that upwards of 40% of all gene families are inferred to have the wrong number of genes in draft assemblies, and that these incorrect assemblies both add and subtract genes. Using simulated genome assemblies of Drosophila melanogaster, we find that the major cause of increased gene numbers in draft genomes is the fragmentation of genes onto multiple individual contigs. Finally, we demonstrate the usefulness of RNA-Seq in improving the gene annotation of draft assemblies, largely by connecting genes that have been fragmented in the assembly process.

  19. Gene prediction in metagenomic fragments: A large scale machine learning approach

    Directory of Open Access Journals (Sweden)

    Morgenstern Burkhard

    2008-04-01

    Background: Metagenomics is an approach to the characterization of microbial genomes via the direct isolation of genomic sequences from the environment without prior cultivation. The amount of metagenomic sequence data is growing fast while computational methods for metagenome analysis are still in their infancy. In contrast to genomic sequences of single species, which can usually be assembled and analyzed by many available methods, a large proportion of metagenome data remains as unassembled anonymous sequencing reads. One of the aims of all metagenomic sequencing projects is the identification of novel genes. The short length of the fragments (Sanger sequencing, for example, yields fragments of 700 bp on average) and the unknown phylogenetic origin of most fragments require approaches to gene prediction that are different from the currently available methods for genomes of single species. In particular, the large size of metagenomic samples requires fast and accurate methods with small numbers of false positive predictions. Results: We introduce a novel gene prediction algorithm for metagenomic fragments based on a two-stage machine learning approach. In the first stage, we use linear discriminants for monocodon usage, dicodon usage and translation initiation sites to extract features from DNA sequences. In the second stage, an artificial neural network combines these features with open reading frame length and fragment GC-content to compute the probability that this open reading frame encodes a protein. This probability is used for the classification and scoring of gene candidates. With large scale training, our method provides fast single fragment predictions with good sensitivity and specificity on artificially fragmented genomic DNA. Additionally, this method is able to predict translation initiation sites accurately and distinguishes complete from incomplete genes with high reliability. Conclusion: Large scale machine learning methods are well-suited for gene

  20. Integration of mate pair sequences to improve shotgun assemblies of flow-sorted chromosome arms of hexaploid wheat

    Czech Academy of Sciences Publication Activity Database

    Belova, T.; Zhan, B.J.; Wright, J.; Caccamo, M.; Asp, T.; Šimková, Hana; Kent, M.; Bendixen, C.; Panitz, F.; Lien, S.; Doležel, Jaroslav; Olsen, O.A.; Sandve, S.R.

    2013-01-01

    Vol. 14, APR 4 2013 (2013) ISSN 1471-2164 R&D Projects: GA ČR(CZ) GAP501/12/2554 Grant - others:GA MŠk(CZ) ED0007/01/01 Program:ED Institutional research plan: CEZ:AV0Z50380511 Keywords : Scaffold * Assembly * Wheat Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 4.041, year: 2013

  1. Generation of human Fab antibody libraries: PCR amplification and assembly of light- and heavy-chain coding sequences.

    Science.gov (United States)

    Andris-Widhopf, Jennifer; Steinberger, Peter; Fuller, Roberta; Rader, Christoph; Barbas, Carlos F

    2011-09-01

    The development of therapeutic antibodies for use in the treatment of human diseases has long been a goal for many researchers in the antibody field. One way to obtain these antibodies is through phage-display libraries constructed from human lymphocytes. This protocol describes the construction of human Fab (fragment antigen binding) antibody libraries. In this method, the individual rearranged heavy- and light-chain variable regions are amplified separately and are linked through a series of overlap polymerase chain reaction (PCR) steps to give the final Fab products that are used for cloning.

  2. Generation of human scFv antibody libraries: PCR amplification and assembly of light- and heavy-chain coding sequences.

    Science.gov (United States)

    Andris-Widhopf, Jennifer; Steinberger, Peter; Fuller, Roberta; Rader, Christoph; Barbas, Carlos F

    2011-09-01

    The development of therapeutic antibodies for use in the treatment of human diseases has long been a goal for many researchers in the antibody field. One way to obtain these antibodies is through phage-display libraries constructed from human lymphocytes. This protocol describes the construction of human scFv (single chain antibody fragment) libraries using a short linker (GGSSRSS) or a long linker (GGSSRSSSSGGGGSGGGG). In this method, the individual rearranged heavy- and light-chain variable regions are amplified separately and are linked through a series of overlap polymerase chain reaction (PCR) steps to give the final scFv products that are used for cloning.

  3. Assembly of the Lactuca sativa, L. cv. Tizian draft genome sequence reveals differences within major resistance complex 1 as compared to the cv. Salinas reference genome.

    Science.gov (United States)

    Verwaaijen, Bart; Wibberg, Daniel; Nelkner, Johanna; Gordin, Miriam; Rupp, Oliver; Winkler, Anika; Bremges, Andreas; Blom, Jochen; Grosch, Rita; Pühler, Alfred; Schlüter, Andreas

    2018-02-10

    Lettuce (Lactuca sativa, L.) is an important annual plant of the family Asteraceae (Compositae). The commercial lettuce cultivar Tizian has been used in various scientific studies investigating the interaction of the plant with phytopathogens or biological control agents. Here, we present the de novo draft genome sequencing and gene prediction for this specific cultivar derived from transcriptome sequence data. The assembled scaffolds amount to a size of 2.22 Gb. Based on RNAseq data, 31,112 transcript isoforms were identified. Functional predictions for these transcripts were determined within the GenDBE annotation platform. Comparison with the cv. Salinas reference genome revealed a high degree of sequence similarity on genome and transcriptome levels, with an average amino acid identity of 99%. Furthermore, it was observed that two large regions are either missing or are highly divergent within the cv. Tizian genome compared to cv. Salinas. One of these regions covers the major resistance complex 1 region of cv. Salinas. The cv. Tizian draft genome sequence provides a valuable resource for future functional and transcriptome analyses focused on this lettuce cultivar. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Characterization of Mycoplasma hyosynoviae strains by amplified fragment length polymorphism analysis, pulsed-field gel electrophoresis and 16S ribosomal DNA sequencing

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Friis, N.F.; Ahrens, Peter

    2002-01-01

    , were investigated by analysis of amplified fragment length polymorphisms of the Bgl II and Mfe I restriction sites and by pulsed-field gel electrophoresis of a Bss HII digest of chromosomal DNA. Both methods allowed unambiguous differentiation of the analysed strains and showed similar discriminatory...

  5. Transcriptome sequencing of different narrow-leafed lupin tissue types provides a comprehensive uni-gene assembly and extensive gene-based molecular markers

    Science.gov (United States)

    Kamphuis, Lars G; Hane, James K; Nelson, Matthew N; Gao, Lingling; Atkins, Craig A; Singh, Karam B

    2015-01-01

    Narrow-leafed lupin (NLL; Lupinus angustifolius L.) is an important grain legume crop that is valuable for sustainable farming and is becoming recognized as a human health food. NLL breeding is directed at improving grain production, disease resistance, drought tolerance and health benefits. However, genetic and genomic studies have been hindered by a lack of extensive genomic resources for the species. Here, the generation, de novo assembly and annotation of transcriptome datasets derived from five different NLL tissue types of the reference accession cv. Tanjil are described. The Tanjil transcriptome was compared to transcriptomes of an early domesticated cv. Unicrop, a wild accession P27255, as well as accession 83A:476, together being the founding parents of two recombinant inbred line (RIL) populations. In silico predictions for transcriptome-derived gene-based length and SNP polymorphic markers were conducted and corroborated using a survey assembly sequence for NLL cv. Tanjil. This yielded extensive indel and SNP polymorphic markers for the two RIL populations. A total of 335 transcriptome-derived markers and 66 BAC-end sequence-derived markers were evaluated, and 275 polymorphic markers were selected to genotype the reference NLL 83A:476 × P27255 RIL population. This significantly improved the completeness, marker density and quality of the reference NLL genetic map. PMID:25060816

  6. De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies.

    Science.gov (United States)

    Karamitros, Timokratis; Harrison, Ian; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

    2016-01-01

    Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal.
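
    The contiguity statistics mentioned above, N(G)50 and N(G)75, can be computed from a list of contig lengths. The following sketch uses the usual definitions as an assumption (the abstract does not spell them out): N50-style values measure against the total assembly size, NG50-style values against a known genome size. The function name and the contig lengths are hypothetical.

```python
def nx(contig_lengths, fraction=0.5, genome_size=None):
    """Length L such that contigs of length >= L cover `fraction` of the target.

    The target is the total assembly size (N50-style) unless genome_size
    is given, in which case it is the genome size (NG50-style).
    Returns 0 if the contigs never reach the target.
    """
    total = genome_size if genome_size is not None else sum(contig_lengths)
    target = fraction * total
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return 0


# Hypothetical contig lengths (bp) and genome size, for illustration only.
contigs = [80, 70, 50, 40, 30, 20]
print("N50 :", nx(contigs, 0.5))
print("N75 :", nx(contigs, 0.75))
print("NG50:", nx(contigs, 0.5, genome_size=400))
print("NG75:", nx(contigs, 0.75, genome_size=400))
```

    Measuring against the genome size makes assemblies of different total length comparable, which matters when one assembler leaves more of the genome unassembled than another.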

  7. Sequencing, De Novo Assembly, and Annotation of the Transcriptome of the Endangered Freshwater Pearl Bivalve, Cristaria plicata, Provides Novel Insights into Functional Genes and Marker Discovery.

    Directory of Open Access Journals (Sweden)

    Bharat Bhusan Patnaik

    The freshwater mussel Cristaria plicata (Bivalvia: Eulamellibranchia: Unionidae) is an economically important species in molluscan aquaculture due to its use in pearl farming. The species has been listed as endangered in South Korea due to the loss of natural habitats caused by anthropogenic activities. The decreasing population and a lack of genomic information on the species are concerning for environmentalists and conservationists. In this study, we conducted a de novo transcriptome sequencing and annotation analysis of C. plicata using Illumina HiSeq 2500 next-generation sequencing (NGS) technology, the Trinity assembler, and bioinformatics databases to prepare a sustainable resource for the identification of candidate genes involved in immunity, defense, and reproduction. The C. plicata transcriptome analysis included a total of 286,152,584 raw reads and 281,322,837 clean reads. The de novo assembly identified a total of 453,931 contigs and 374,794 non-redundant unigenes with average lengths of 731.2 and 737.1 bp, respectively. Furthermore, 100% coverage of C. plicata mitochondrial genes within two unigenes supported the quality of the assembler. In total, 84,274 unigenes showed homology to entries in at least one database, and 23,246 unigenes were allocated to one or more Gene Ontology (GO) terms. The most prominent GO biological process, cellular component, and molecular function categories (level 2) were cellular process, membrane, and binding, respectively. A total of 4,776 unigenes were mapped to 123 biological pathways in the KEGG database. Based on the GO terms and KEGG annotation, the unigenes were suggested to be involved in immunity, stress responses, sex-determination, and reproduction. A total of 17,251 cDNA simple sequence repeats (cSSRs) were identified from 61,141 unigenes (size of >1 kb), with the most abundant being dinucleotide repeats. This dataset represents the first transcriptome analysis of the endangered mollusc, C. plicata

  8. Controlled fragmentation

    International Nuclear Information System (INIS)

    Arnold, Werner

    2002-01-01

    Contrary to natural fragmentation, controlled fragmentation offers the possibility to adapt fragment parameters like size and mass to the performance requirements in a very flexible way. Known mechanisms, like grooves inside the casing, weaken the structure. This is, however, excluded for applications with high accelerations during launch or piercing requirements, for example on a semi-armor-piercing penetrator. Another method to achieve controlled fragmentation with an additional grid layer is presented, with which the required grooves are produced 'just in time' inside the casing during detonation of the high explosive. The process of generating the grooves aided by the grid layer was studied using the hydrocode HULL with respect to varying grid designs and material combinations. Subsequent to this, a large range of these theoretically investigated combinations was contemplated in substantial experimental tests. With an optimised grid design and a suitable material selection, the controlled fragment admits a very flexible adaptation to the set requirements. Additional advantages like the increase of perforation performance or incendiary amplification can be realized with the grid layer.

  9. Accurate phylogenetic classification of DNA fragments based on sequence composition

    Energy Technology Data Exchange (ETDEWEB)

    McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis; Hugenholtz, Philip; Rigoutsos, Isidore

    2006-05-01

    Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.

  10. BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes

    Czech Academy of Sciences Publication Activity Database

    Staňková, Helena; Hastie, A.; Chan, S.; Vrána, Jan; Tulpová, Zuzana; Kubaláková, Marie; Visendi, P.; Hayashi, S.; Luo, M.; Batley, J.; Edwards, D.; Doležel, Jaroslav; Šimková, Hana

    2016-01-01

    Vol. 14, No. 7 (2016), pp. 1523-1531 ISSN 1467-7644 R&D Projects: GA ČR(CZ) GAP501/12/2554; GA MŠk(CZ) LO1204 Institutional support: RVO:61389030 Keywords : optical mapping * wheat * sequencing Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 7.443, year: 2016

  11. Chameleon fragmentation

    Energy Technology Data Exchange (ETDEWEB)

    Brax, Philippe [Institut de Physique Théorique, CEA, IPhT, CNRS, URA 2306, F-91191Gif/Yvette Cedex (France); Upadhye, Amol, E-mail: philippe.brax@cea.fr, E-mail: aupadhye@anl.gov [Institute for the Early Universe, Ewha University, International Education, Building #601, 11-1, Daehyun-Dong Seodaemun-Gu, Seoul 120-750 (Korea, Republic of)

    2014-02-01

    A scalar field dark energy candidate could couple to ordinary matter and photons, enabling its detection in laboratory experiments. Here we study the quantum properties of the chameleon field, one such dark energy candidate, in an "afterglow" experiment designed to produce, trap, and detect chameleon particles. In particular, we investigate the possible fragmentation of a beam of chameleon particles into multiple particle states due to the highly non-linear interaction terms in the chameleon Lagrangian. Fragmentation could weaken the constraints of an afterglow experiment by reducing the energy of the regenerated photons, but this energy reduction also provides a unique signature which could be detected by a properly-designed experiment. We show that constraints from the CHASE experiment are essentially unaffected by fragmentation for φ^4 and 1/φ potentials, but are weakened for steeper potentials, and we discuss possible future afterglow experiments.

  12. Chameleon fragmentation

    International Nuclear Information System (INIS)

    Brax, Philippe; Upadhye, Amol

    2014-01-01

    A scalar field dark energy candidate could couple to ordinary matter and photons, enabling its detection in laboratory experiments. Here we study the quantum properties of the chameleon field, one such dark energy candidate, in an "afterglow" experiment designed to produce, trap, and detect chameleon particles. In particular, we investigate the possible fragmentation of a beam of chameleon particles into multiple particle states due to the highly non-linear interaction terms in the chameleon Lagrangian. Fragmentation could weaken the constraints of an afterglow experiment by reducing the energy of the regenerated photons, but this energy reduction also provides a unique signature which could be detected by a properly-designed experiment. We show that constraints from the CHASE experiment are essentially unaffected by fragmentation for φ^4 and 1/φ potentials, but are weakened for steeper potentials, and we discuss possible future afterglow experiments.

  13. De Novo Assembly of Complete Chloroplast Genomes from Non-model Species Based on a K-mer Frequency-Based Selection of Chloroplast Reads from Total DNA Sequences

    Directory of Open Access Journals (Sweden)

    Shairul Izan

    2017-08-01

    Whole Genome Shotgun (WGS) sequences of plant species often contain an abundance of reads that are derived from the chloroplast genome. Up to now these reads have generally been identified and assembled into chloroplast genomes based on homology to chloroplasts from related species. This re-sequencing approach may select against structural differences between the genomes especially in non-model species for which no close relatives have been sequenced before. The alternative approach is to de novo assemble the chloroplast genome from total genomic DNA sequences. In this study, we used k-mer frequency tables to identify and extract the chloroplast reads from the WGS reads and assemble these using a highly integrated and automated custom pipeline. Our strategy includes steps aimed at optimizing assemblies and filling gaps which are left due to coverage variation in the WGS dataset. We have successfully de novo assembled three complete chloroplast genomes from plant species with a range of nuclear genome sizes to demonstrate the universality of our approach: Solanum lycopersicum (0.9 Gb), Aegilops tauschii (4 Gb) and Paphiopedilum henryanum (25 Gb). We also highlight the need to optimize the choice of k and the amount of data used. This new and cost-effective method for de novo short read assembly will facilitate the study of complete chloroplast genomes with more accurate analyses and inferences, especially in non-model plant genomes.
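
    The k-mer frequency idea described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the use of the median k-mer count per read, and the threshold are assumptions made for illustration, and reverse-complement (canonical) k-mers are ignored for brevity. The intuition is that chloroplast DNA is present in many copies per cell, so reads derived from it consist of k-mers that occur far more often than k-mers from the nuclear genome.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers(reads, k):
    """Build a k-mer frequency table over all reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def select_high_copy_reads(reads, k, min_median_count):
    """Keep reads whose median k-mer count reaches the threshold.

    Assumption: high-copy (organelle-like) reads are those made of
    frequent k-mers; the threshold is a hypothetical tuning parameter.
    """
    counts = count_kmers(reads, k)
    selected = []
    for read in reads:
        freqs = [counts[km] for km in kmers(read, k)]
        if freqs and median(freqs) >= min_median_count:
            selected.append(read)
    return selected


# Toy data: one sequence seen five times, two sequences seen once.
high_copy = "ACGTACGGTCA"
reads = [high_copy] * 5 + ["TTTTGGGGCCC", "CATCATTAGAG"]
print(len(select_high_copy_reads(reads, k=5, min_median_count=3)))  # prints 5
```

    In practice both k and the amount of input data would need tuning, as the abstract itself points out.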

  14. De novo transcriptome sequence assembly and identification of AP2/ERF transcription factor related to abiotic stress in parsley (Petroselinum crispum).

    Directory of Open Access Journals (Sweden)

    Meng-Yao Li

    Parsley is an important biennial Apiaceae species that is widely cultivated as herb, spice, and vegetable. Previous studies on parsley principally focused on its physiological and biochemical properties, including phenolic compound and volatile oil contents. However, little is known about the molecular and genetic properties of parsley. In this study, 23,686,707 high-quality reads were obtained and assembled into 81,852 transcripts and 50,161 unigenes for the first time. Functional annotation showed that 30,516 unigenes had sequence similarity to known genes. In addition, 3,244 putative simple sequence repeats were detected in curly parsley. Finally, 1,569 of the identified unigenes belonged to 58 transcription factor families. Various abiotic stresses have a strong detrimental effect on the yield and quality of parsley. AP2/ERF transcription factors have important functions in plant development, hormonal regulation, and abiotic response. A total of 88 putative AP2/ERF factors were identified from the transcriptome sequence of parsley. Seven AP2/ERF transcription factors were selected in this study to analyze the expression profiles of parsley under different abiotic stresses. Our data provide a potentially valuable resource that can be used for intensive parsley research.

  15. De novo transcriptome sequence assembly and identification of AP2/ERF transcription factor related to abiotic stress in parsley (Petroselinum crispum).

    Science.gov (United States)

    Li, Meng-Yao; Tan, Hua-Wei; Wang, Feng; Jiang, Qian; Xu, Zhi-Sheng; Tian, Chang; Xiong, Ai-Sheng

    2014-01-01

    Parsley is an important biennial Apiaceae species that is widely cultivated as herb, spice, and vegetable. Previous studies on parsley principally focused on its physiological and biochemical properties, including phenolic compound and volatile oil contents. However, little is known about the molecular and genetic properties of parsley. In this study, 23,686,707 high-quality reads were obtained and assembled into 81,852 transcripts and 50,161 unigenes for the first time. Functional annotation showed that 30,516 unigenes had sequence similarity to known genes. In addition, 3,244 putative simple sequence repeats were detected in curly parsley. Finally, 1,569 of the identified unigenes belonged to 58 transcription factor families. Various abiotic stresses have a strong detrimental effect on the yield and quality of parsley. AP2/ERF transcription factors have important functions in plant development, hormonal regulation, and abiotic response. A total of 88 putative AP2/ERF factors were identified from the transcriptome sequence of parsley. Seven AP2/ERF transcription factors were selected in this study to analyze the expression profiles of parsley under different abiotic stresses. Our data provide a potentially valuable resource that can be used for intensive parsley research.

  16. SWAP-Assembler 2: Optimization of De Novo Genome Assembler at Large Scale

    Energy Technology Data Exchange (ETDEWEB)

    Meng, Jintao; Seo, Sangmin; Balaji, Pavan; Wei, Yanjie; Wang, Bingqiang; Feng, Shengzhong

    2016-08-16

    In this paper, we analyze and optimize the most time-consuming steps of the SWAP-Assembler, a parallel genome assembler, so that it can scale to a large number of cores for huge genomes with the size of sequencing data ranging from terabytes to petabytes. According to the performance analysis results, the most time-consuming steps are input parallelization, k-mer graph construction, and graph simplification (edge merging). For the input parallelization, the input data is divided into virtual fragments with nearly equal size, and the start position and end position of each fragment are automatically separated at the beginning of the reads. In k-mer graph construction, in order to improve the communication efficiency, the message size is kept constant between any two processes by proportionally increasing the number of nucleotides to the number of processes in the input parallelization step for each round. The memory usage is also decreased because only a small part of the input data is processed in each round. With graph simplification, the communication protocol reduces the number of communication loops from four to two loops and decreases the idle communication time. The optimized assembler is denoted as SWAP-Assembler 2 (SWAP2). In our experiments using a 1000 Genomes project dataset of 4 terabytes (the largest dataset ever used for assembling) on the supercomputer Mira, the results show that SWAP2 scales to 131,072 cores with an efficiency of 40%. We also compared our work with both the HipMer assembler and the SWAP-Assembler. On the Yanhuang dataset of 300 gigabytes, SWAP2 shows a 3X speedup and 4X better scalability compared with the HipMer assembler and is 45 times faster than the SWAP-Assembler. The SWAP2 software is available at https://sourceforge.net/projects/swapassembler.
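
    The input parallelization step described above (nearly equal virtual fragments whose boundaries fall at the beginning of a read) can be sketched as follows. This is an illustrative Python sketch, not SWAP2 code: the function name is hypothetical, and it assumes FASTA input, where a record start is a '>' at the beginning of a line. Each tentative cut at an equal byte offset is moved forward to the next record start, so no read is split between two processes.

```python
def fragment_boundaries(data, n_fragments):
    """Split FASTA bytes into n_fragments (start, end) byte ranges.

    Assumption: every record begins with '>' at the start of a line.
    Cuts are first placed at equal byte offsets and then advanced to
    the next record start, so fragment sizes are only nearly equal.
    """
    size = len(data)
    cuts = [0]
    for i in range(1, n_fragments):
        guess = i * size // n_fragments
        # Search from guess - 1 so a record starting exactly at guess is found.
        pos = data.find(b"\n>", max(guess - 1, 0))
        cut = size if pos == -1 else pos + 1
        cuts.append(max(cut, cuts[-1]))
    cuts.append(size)
    return list(zip(cuts[:-1], cuts[1:]))


# Four 9-byte records, 36 bytes in total.
data = b">r1\nACGT\n>r2\nGGCC\n>r3\nTTAA\n>r4\nCCGG\n"
print(fragment_boundaries(data, 3))  # prints [(0, 18), (18, 27), (27, 36)]
```

    Because the ranges are computed from offsets alone, each process could seek to its own range without a coordinator reading the whole file; FASTQ input would need a stricter test for record starts, since '@' can also appear in quality lines.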

  17. Bespoke Fragments

    DEFF Research Database (Denmark)

    Kruse Aagaard, Anders

    2017-01-01

    The PhD project Bespoke Fragments is investigating the space emerging in the exploration of the relationship between digital drawing and fabrication, and the field of materials and their properties and capacities. Through a series of different experiments, the project situates itself in a shuttli...

  18. Rock fragmentation

    Energy Technology Data Exchange (ETDEWEB)

    Brown, W.S.; Green, S.J.; Hakala, W.W.; Hustrulid, W.A.; Maurer, W.C. (eds.)

    1976-01-01

    Experts in rock mechanics, mining, excavation, drilling, tunneling and use of underground space met to discuss the relative merits of a wide variety of rock fragmentation schemes. Information is presented on novel rock fracturing techniques; tunneling using electron beams, thermocorer, electric spark drills, water jets, and diamond drills; and rock fracturing research needs for mining and underground construction. (LCL)

  19. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  20. Immobilization of the enzyme β-lactamase by self-assembly on thin films of a poly(phenyleneethynylene) sequenced with flexible segments containing sulfur atoms

    International Nuclear Information System (INIS)

    Vazquez, Erika; Aguilar, Abdieel Esquivel; Moggio, Ivana; Arias, Eduardo; Romero, Jorge; Barrientos, Hector; Torres, Jose Roman; Luz Reyes Vega, Maria de la

    2007-01-01

    A novel poly(phenyleneethynylene) sequenced in the main conjugated chain with flexible groups containing sulfur atoms has been synthesized by Heck-Sonogashira coupling reaction. Layer-by-layer films of the polymer have been prepared with a linear growth in thickness up to four layers as evidenced by UV-Vis spectroscopy and profilometry. On the top of these multilayers, the enzyme β-lactamase was deposited by self-assembly. The enzymatic activity was measured by a modified spectrophotometric standard assay method for penicillin G, ampicillin and amoxicillin. A higher and faster activity was obtained for penicillin G and thus a preliminary study of the biosensor response by fluorescence was carried out for this antibiotic, revealing a decrease in the polymer fluorescence as a function of the penicillin G concentration.

  1. Immobilization of the enzyme β-lactamase by self-assembly on thin films of a poly(phenyleneethynylene) sequenced with flexible segments containing sulfur atoms

    Energy Technology Data Exchange (ETDEWEB)

    Vazquez, Erika [Centro de Investigacion en Quimica Aplicada (CIQA), Blvd. Enrique Reyna 140, 25253, Saltillo (Mexico); Aguilar, Abdieel Esquivel [Centro de Investigacion en Quimica Aplicada (CIQA), Blvd. Enrique Reyna 140, 25253, Saltillo (Mexico); Facultad de Ciencias Quimicas, Universidad Autonoma de Coahuila, Blvd. V. Carranza and Ing. J. Cardenas, 25000 Saltillo (Mexico); Moggio, Ivana [Centro de Investigacion en Quimica Aplicada (CIQA), Blvd. Enrique Reyna 140, 25253, Saltillo (Mexico)]. E-mail: imoggio@ciqa.mx; Arias, Eduardo [Centro de Investigacion en Quimica Aplicada (CIQA), Blvd. Enrique Reyna 140, 25253, Saltillo (Mexico); Romero, Jorge [Centro de Investigacion en Quimica Aplicada (CIQA), Blvd. Enrique Reyna 140, 25253, Saltillo (Mexico); Barrientos, Hector [Centro de Investigacion en Quimica Aplicada (CIQA), Blvd. Enrique Reyna 140, 25253, Saltillo (Mexico); Torres, Jose Roman [Centro de Investigacion en Quimica Aplicada (CIQA), Blvd. Enrique Reyna 140, 25253, Saltillo (Mexico); Luz Reyes Vega, Maria de la [Facultad de Ciencias Quimicas, Universidad Autonoma de Coahuila, Blvd. V. Carranza and Ing. J. Cardenas, 25000 Saltillo (Mexico)

    2007-05-16

    A novel poly(phenyleneethynylene) sequenced in the main conjugated chain with flexible groups containing sulfur atoms has been synthesized by Heck-Sonogashira coupling reaction. Layer-by-layer films of the polymer have been prepared with a linear growth in thickness up to four layers as evidenced by UV-Vis spectroscopy and profilometry. On the top of these multilayers, the enzyme β-lactamase was deposited by self-assembly. The enzymatic activity was measured by a modified spectrophotometric standard assay method for penicillin G, ampicillin and amoxicillin. A higher and faster activity was obtained for penicillin G and thus a preliminary study of the biosensor response by fluorescence was carried out for this antibiotic, revealing a decrease in the polymer fluorescence as a function of the penicillin G concentration.

  2. De novo assembly and characterization of the transcriptome of seagrass Zostera marina using Illumina paired-end sequencing.

    Directory of Open Access Journals (Sweden)

    Fanna Kong

    BACKGROUND: The seagrass Zostera marina is a monocotyledonous angiosperm belonging to a polyphyletic group of plants that can live submerged in marine habitats. Zostera marina L. is one of the most common seagrasses and is considered a cornerstone of marine plant molecular ecology research and comparative studies. However, the mechanisms underlying its adaptation to the marine environment still remain poorly understood due to limited transcriptomic and genomic data. PRINCIPAL FINDINGS: Here we explored the transcriptome of Z. marina leaves under different environmental conditions using Illumina paired-end sequencing. Approximately 55 million sequencing reads were obtained, representing 58,457 transcripts that correspond to 24,216 unigenes. A total of 14,389 (59.41%) unigenes were annotated by blast searches against the NCBI non-redundant protein database. 45.18% and 46.91% of the unigenes had significant similarity with proteins in the Swiss-Prot database and Pfam database, respectively. Among these, 13,897 unigenes were assigned to 57 Gene Ontology (GO) terms and 4,745 unigenes were identified and mapped to 233 pathways via functional annotation against the Kyoto Encyclopedia of Genes and Genomes pathway database (KEGG). We compared the orthologous gene family of the Z. marina transcriptome to Oryza sativa and Pyropia yezoensis and 11,667 orthologous gene families are specific to Z. marina. Furthermore, we identified the photoreceptors sensing red/far-red light and blue light. Also, we identified a large number of genes that are involved in ion transporters and channels including Na+ efflux, K+ uptake, Cl- channels, and H+ pumping. CONCLUSIONS: Our study contains an extensive sequencing and gene-annotation analysis of Z. marina. This information represents a genetic resource for the discovery of genes related to light sensing and salt tolerance in this species. Our transcriptome can be further utilized in future studies on molecular adaptation to

  3. De novo transcriptome sequence assembly from coconut leaves and seeds with a focus on factors involved in RNA-directed DNA methylation.

    Science.gov (United States)

    Huang, Ya-Yi; Lee, Chueh-Pai; Fu, Jason L; Chang, Bill Chia-Han; Matzke, Antonius J M; Matzke, Marjori

    2014-09-04

    Coconut palm (Cocos nucifera) is a symbol of the tropics and a source of numerous edible and nonedible products of economic value. Despite its nutritional and industrial significance, coconut remains under-represented in public repositories for genomic and transcriptomic data. We report de novo transcript assembly from RNA-seq data and analysis of gene expression in seed tissues (embryo and endosperm) and leaves of a dwarf coconut variety. Assembly of 10 GB sequencing data for each tissue resulted in 58,211 total unigenes in embryo, 61,152 in endosperm, and 33,446 in leaf. Within each unigene pool, 24,857 could be annotated in embryo, 29,731 could be annotated in endosperm, and 26,064 could be annotated in leaf. A KEGG analysis identified 138, 138, and 139 pathways, respectively, in transcriptomes of embryo, endosperm, and leaf tissues. Given the extraordinarily large size of coconut seeds and the importance of small RNA-mediated epigenetic regulation during seed development in model plants, we used homology searches to identify putative homologs of factors required for RNA-directed DNA methylation in coconut. The findings suggest that RNA-directed DNA methylation is important during coconut seed development, particularly in maturing endosperm. This dataset will expand the genomics resources available for coconut and provide a foundation for more detailed analyses that may assist molecular breeding strategies aimed at improving this major tropical crop. Copyright © 2014 Huang et al.

  4. Architectural fragments

    DEFF Research Database (Denmark)

    Bang, Jacob Sebastian

    2018-01-01

    I have created a large collection of plaster models: a collection of Obstructions, errors and opportunities that may develop into architecture. The models are fragments of different complex shapes as well as more simple circular models with different profiling and diameters. In this context I have....... I try to invent the ways of drawing the models - that decode and unfold them into architectural fragments - into future buildings or constructions in the landscape. [1] Luigi Moretti: Italian architect, 1907 - 1973 [2] Man Ray: American artist, 1890 - 1976. In 2015, I saw the wonderful exhibition...... "Man Ray - Human Equations" at the Glyptotek in Copenhagen, organized by the Philips Collection in Washington D.C. and the Israel Museum in Jerusalem (in 2013). See also: "Man Ray - Human Equations" catalogue published by Hatje Cantz Verlag, Germany, 2014....

  5. Radiation hybrid maps of the D-genome of Aegilops tauschii and their application in sequence assembly of large and complex plant genomes.

    Science.gov (United States)

    Kumar, Ajay; Seetan, Raed; Mergoum, Mohamed; Tiwari, Vijay K; Iqbal, Muhammad J; Wang, Yi; Al-Azzam, Omar; Šimková, Hana; Luo, Ming-Cheng; Dvorak, Jan; Gu, Yong Q; Denton, Anne; Kilian, Andrzej; Lazo, Gerard R; Kianian, Shahryar F

    2015-10-16

    The large and complex genome of bread wheat (Triticum aestivum L., ~17 Gb) requires high resolution genome maps with saturated marker scaffolds to anchor and orient BAC contigs/sequence scaffolds for whole genome assembly. Radiation hybrid (RH) mapping has proven to be an excellent tool for the development of such maps for it offers much higher and more uniform marker resolution across the length of the chromosome compared to genetic mapping and does not require marker polymorphism per se, as it is based on presence (retention) vs. absence (deletion) marker assay. In this study, a 178 line RH panel was genotyped with SSRs and DArT markers to develop the first high resolution RH maps of the entire D-genome of Ae. tauschii accession AL8/78. To confirm map order accuracy, the AL8/78-RH maps were compared with: 1) a DArT consensus genetic map constructed using more than 100 bi-parental populations, 2) a RH map of the D-genome of reference hexaploid wheat 'Chinese Spring', and 3) two SNP-based genetic maps, one with anchored D-genome BAC contigs and another with anchored D-genome sequence scaffolds. Using marker sequences, the RH maps were also anchored with a BAC contig based physical map and draft sequence of the D-genome of Ae. tauschii. A total of 609 markers were mapped to 503 unique positions on the seven D-genome chromosomes, with a total map length of 14,706.7 cR. The average distance between any two marker loci was 29.2 cR which corresponds to 2.1 cM or 9.8 Mb. The average mapping resolution across the D-genome was estimated to be 0.34 Mb (Mb/cR) or 0.07 cM (cM/cR). The RH maps showed almost perfect agreement with several published maps with regard to chromosome assignments of markers. The mean rank correlations between the position of markers on AL8/78 maps and the four published maps, ranged from 0.75 to 0.92, suggesting a good agreement in marker order. With 609 mapped markers, a total of 2481 deletions for the whole D-genome were detected with an average
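
    The map statistics quoted in this abstract can be cross-checked with a few lines of Python. The sketch below only reuses the reported figures; the assumption that the average spacing is the total map length divided by the number of unique positions is an inference (the abstract does not state the formula), although it does reproduce the reported 29.2 cR.

```python
# Figures taken from the abstract.
total_map_length_cR = 14706.7
unique_positions = 503
avg_spacing_mb = 9.8   # Mb corresponding to the average spacing
avg_spacing_cm = 2.1   # cM corresponding to the average spacing

# Assumption: average spacing = total length / number of unique positions.
avg_spacing_cR = total_map_length_cR / unique_positions

# Resolution expressed per centiRay, using the reported 29.2 cR spacing.
mb_per_cR = avg_spacing_mb / 29.2
cm_per_cR = avg_spacing_cm / 29.2

print(round(avg_spacing_cR, 1), round(mb_per_cR, 2), round(cm_per_cR, 2))
# prints 29.2 0.34 0.07
```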

  6. Comparing Memory-Efficient Genome Assemblers on Stand-Alone and Cloud Infrastructures

    KAUST Repository

    Kleftogiannis, Dimitrios A.

    2013-09-27

    A fundamental problem in bioinformatics is genome assembly. Next-generation sequencing (NGS) technologies produce large volumes of fragmented genome reads, which require large amounts of memory to assemble the complete genome efficiently. With recent improvements in DNA sequencing technologies, it is expected that the memory footprint required for the assembly process will increase dramatically and will emerge as a limiting factor in processing widely available NGS-generated reads. In this report, we compare current memory-efficient techniques for genome assembly with respect to quality, memory consumption and execution time. Our experiments prove that it is possible to generate draft assemblies of reasonable quality on conventional multi-purpose computers with very limited available memory by choosing suitable assembly methods. Our study reveals the minimum memory requirements for different assembly programs even when data volume exceeds memory capacity by orders of magnitude. By combining existing methodologies, we propose two general assembly strategies that can improve short-read assembly approaches and result in reduction of the memory footprint. Finally, we discuss the possibility of utilizing cloud infrastructures for genome assembly and we comment on some findings regarding suitable computational resources for assembly.

  7. Comparing memory-efficient genome assemblers on stand-alone and cloud infrastructures.

    Science.gov (United States)

    Kleftogiannis, Dimitrios; Kalnis, Panos; Bajic, Vladimir B

    2013-01-01

    A fundamental problem in bioinformatics is genome assembly. Next-generation sequencing (NGS) technologies produce large volumes of fragmented genome reads, which require large amounts of memory to assemble the complete genome efficiently. With recent improvements in DNA sequencing technologies, it is expected that the memory footprint required for the assembly process will increase dramatically and will emerge as a limiting factor in processing widely available NGS-generated reads. In this report, we compare current memory-efficient techniques for genome assembly with respect to quality, memory consumption and execution time. Our experiments prove that it is possible to generate draft assemblies of reasonable quality on conventional multi-purpose computers with very limited available memory by choosing suitable assembly methods. Our study reveals the minimum memory requirements for different assembly programs even when data volume exceeds memory capacity by orders of magnitude. By combining existing methodologies, we propose two general assembly strategies that can improve short-read assembly approaches and result in reduction of the memory footprint. Finally, we discuss the possibility of utilizing cloud infrastructures for genome assembly and we comment on some findings regarding suitable computational resources for assembly.

  8. CAPRRESI: Chimera Assembly by Plasmid Recovery and Restriction Enzyme Site Insertion.

    Science.gov (United States)

    Santillán, Orlando; Ramírez-Romero, Miguel A; Dávila, Guillermo

    2017-06-25

    Here, we present chimera assembly by plasmid recovery and restriction enzyme site insertion (CAPRRESI). CAPRRESI benefits from many strengths of the original plasmid recovery method and introduces restriction enzyme digestion to ease DNA ligation reactions (required for chimera assembly). For this protocol, users clone wildtype genes into the same plasmid (pUC18 or pUC19). After the in silico selection of amino acid sequence regions where chimeras should be assembled, users obtain all the synonym DNA sequences that encode them. Ad hoc Perl scripts enable users to determine all synonym DNA sequences. After this step, another Perl script searches for restriction enzyme sites on all synonym DNA sequences. This in silico analysis is also performed using the ampicillin resistance gene (ampR) found on pUC18/19 plasmids. Users design oligonucleotides inside synonym regions to disrupt wildtype and ampR genes by PCR. After obtaining and purifying complementary DNA fragments, restriction enzyme digestion is accomplished. Chimera assembly is achieved by ligating appropriate complementary DNA fragments. pUC18/19 vectors are selected for CAPRRESI because they offer technical advantages, such as small size (2,686 base pairs), high copy number, advantageous sequencing reaction features, and commercial availability. The usage of restriction enzymes for chimera assembly eliminates the need for DNA polymerases yielding blunt-ended products. CAPRRESI is a fast and low-cost method for fusing protein-coding genes.
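
    The two in silico steps described above (listing all synonymous DNA sequences for an amino acid stretch, then searching them for restriction enzyme sites) can be sketched in Python. The original protocol uses ad hoc Perl scripts, which are not reproduced here; the function names and the small enzyme list below are assumptions chosen for illustration, and only the standard genetic code is considered.

```python
from itertools import product

# Standard genetic code in the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# Reverse table: amino acid -> list of codons that encode it.
SYNONYMS = {}
for codon, aa in zip(CODONS, AMINO):
    SYNONYMS.setdefault(aa, []).append(codon)

# A small illustrative subset of recognition sites (not the protocol's list).
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def synonymous_sequences(peptide):
    """Return every DNA sequence that encodes the peptide."""
    choices = [SYNONYMS[aa] for aa in peptide]
    return ["".join(codons) for codons in product(*choices)]


def find_sites(peptide, enzymes=ENZYMES):
    """List (dna, enzyme, position) for each site found in a synonym."""
    hits = []
    for dna in synonymous_sequences(peptide):
        for name, site in enzymes.items():
            pos = dna.find(site)
            if pos != -1:
                hits.append((dna, name, pos))
    return hits


print(synonymous_sequences("EF"))  # prints ['GAATTT', 'GAATTC', 'GAGTTT', 'GAGTTC']
print(find_sites("EF"))            # prints [('GAATTC', 'EcoRI', 0)]
```

    The number of synonymous sequences grows exponentially with peptide length, so exhaustive enumeration like this is only practical for short regions.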

  9. Sequencing, de novo assembly and characterization of the spotted scat Scatophagus argus (Linnaeus 1766) transcriptome for discovery of reproduction related genes and SSRs

    Science.gov (United States)

    Yang, Wei; Chen, Huapu; Cui, Xuefan; Zhang, Kewei; Jiang, Dongneng; Deng, Siping; Zhu, Chunhua; Li, Guangli

    2017-09-01

    Spotted scat (Scatophagus argus) is an economically important farmed fish, particularly in East and Southeast Asia. Because there has been little research on reproductive development and regulation in this species, the lack of a mature artificial reproduction technology remains a barrier for the sustainable development of the aquaculture industry. More genetic and genomic background knowledge is urgently needed for an in-depth understanding of the molecular mechanism of reproductive process and identification of functional genes related to sexual differentiation, gonad maturation and gametogenesis. For these reasons, we performed transcriptomic analysis on spotted scat using a multiple tissue sample mixing strategy. The Illumina RNA sequencing generated 118 510 486 raw reads. After trimming, de novo assembly was performed and yielded 99 888 unigenes with an average length of 905.75 bp. A total of 45 015 unigenes were successfully annotated to the Nr, Swiss-Prot, KOG and KEGG databases. Additionally, 23 783 and 27 183 annotated unigenes were assigned to 56 Gene Ontology (GO) functional groups and 228 KEGG pathways, respectively. Subsequently, 2 474 transcripts associated with reproduction were selected using GO term and KEGG pathway assignments, and a number of reproduction-related genes involved in sex differentiation, gonad development and gametogenesis were identified. Furthermore, 22 279 simple sequence repeat (SSR) loci were discovered and characterized. The comprehensive transcript dataset described here greatly increases the genetic information available for spotted scat and contributes valuable sequence resources for functional gene mining and analysis. Candidate transcripts involved in reproduction would make good starting points for future studies on reproductive mechanisms, and the putative sex differentiation-related genes will be helpful for sex-determining gene identification and sex-specific marker isolation. Lastly, the SSRs can serve as marker

  10. De novo genome assembly and annotation of Australia's largest freshwater fish, the Murray cod (Maccullochella peelii), from Illumina and Nanopore sequencing read.

    Science.gov (United States)

    Austin, Christopher M; Tan, Mun Hua; Harrisson, Katherine A; Lee, Yin Peng; Croft, Laurence J; Sunnucks, Paul; Pavlova, Alexandra; Gan, Han Ming

    2017-08-01

    One of the most iconic Australian fish is the Murray cod, Maccullochella peelii (Mitchell 1838), a freshwater species that can grow to ∼1.8 metres in length and live to age ≥48 years. The Murray cod is of conservation concern as a result of strong population contractions, but it is also popular for recreational fishing and is of growing aquaculture interest. In this study, we report the whole genome sequence of the Murray cod to support ongoing population genetics, conservation, and management research, as well as to better understand the evolutionary ecology and history of the species. A draft Murray cod genome of 633 Mbp (N50 = 109 974 bp; BUSCO and CEGMA completeness of 94.2% and 91.9%, respectively) with an estimated 148 Mbp of putative repetitive sequences was assembled from the combined sequencing data of 2 fish individuals with an identical maternal lineage; 47.2 Gb of Illumina HiSeq data and 804 Mb of Nanopore data were generated from the first individual while 23.2 Gb of Illumina MiSeq data were generated from the second individual. The inclusion of Nanopore reads for scaffolding followed by subsequent gap-closing using Illumina data led to a 29% reduction in the number of scaffolds and a 55% and 54% increase in the scaffold and contig N50, respectively. We also report the first transcriptome of Murray cod that was subsequently used to annotate the Murray cod genome, leading to the identification of 26 539 protein-coding genes. We present the whole genome of the Murray cod and anticipate this will be a catalyst for a range of genetic, genomic, and phylogenetic studies of the Murray cod and more generally other fish species of the Percichthyidae family. © The Authors 2017. Published by Oxford University Press.
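
    The N50 statistic quoted above is the length of the shortest sequence in the smallest set of longest scaffolds (or contigs) that together cover at least half of the assembly. A minimal Python sketch of that common definition follows; the function name is hypothetical, and assembly tools may differ in details such as minimum-length filters.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


print(n50([5, 4, 3, 2, 1]))  # prints 4
```

    Fewer, longer scaffolds raise this value, which is why merging scaffolds with long reads can increase N50 while reducing the scaffold count.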

  11. MIDAS: A Modular DNA Assembly System for Synthetic Biology.

    Science.gov (United States)

    van Dolleweerd, Craig J; Kessans, Sarah A; Van de Bittner, Kyle C; Bustamante, Leyla Y; Bundela, Rudranuj; Scott, Barry; Nicholson, Matthew J; Parker, Emily J

    2018-04-20

    A modular and hierarchical DNA assembly platform for synthetic biology based on Golden Gate (Type IIS restriction enzyme) cloning is described. This enabling technology, termed MIDAS (for Modular Idempotent DNA Assembly System), can be used to precisely assemble multiple DNA fragments in a single reaction using a standardized assembly design. It can be used to build genes from libraries of sequence-verified, reusable parts and to assemble multiple genes in a single vector, with full user control over gene order and orientation, as well as control of the direction of growth (polarity) of the multigene assembly, a feature that allows genes to be nested between other genes or genetic elements. We describe the detailed design and use of MIDAS, exemplified by the reconstruction, in the filamentous fungus Penicillium paxilli, of the metabolic pathway for production of paspaline and paxilline, key intermediates in the biosynthesis of a range of indole diterpenes, a class of secondary metabolites produced by several species of filamentous fungi. MIDAS was used to efficiently assemble a 25.2 kb plasmid from 21 different modules (seven genes, each composed of three basic parts). By using a parts library-based system for construction of complex assemblies, and a unique set of vectors, MIDAS can provide a flexible route to assembling tailored combinations of genes and other genetic elements, thereby supporting synthetic biology applications in a wide range of expression hosts.

  12. Intermediate Fragment

    DEFF Research Database (Denmark)

    Kruse Aagaard, Anders

    2015-01-01

    This text and its connected exhibition are aiming to reflect both on the thoughts, the processes and the outcome of the design and production of the artefact ‘Intermediate Fragment’ and making as a contemporary architectural tool in general. Intermediate Fragment was made for the exhibition ‘Enga...... of realising an exhibition object was conceived, but expanded, refined and concretised through this process. The context of the work shown here is an interest in a tighter, deeper connection between experimentally obtained material knowledge and architectural design....

  13. Fragmentation based

    Directory of Open Access Journals (Sweden)

    Shashank Srivastava

    2014-01-01

    Building on an understanding of mobile agent architecture and its security concerns, this paper proposes a security protocol that addresses security at reduced computational cost. The protocol combines self-decryption, co-operation and obfuscation techniques. To circumvent the risk of malicious code execution in an attacking environment, we propose a fragmentation-based encryption technique. Our encryption technique suits the general mobile agent size and provides strong obfuscation that increases the attacker's challenge, while offering better performance with respect to computational cost than existing AES encryption.

  14. Bespoke Fragments

    DEFF Research Database (Denmark)

    Kruse Aagaard, Anders

    2016-01-01

    , investigating levels of control and uncertainty encountering with these. Through tangible experiments, the project discusses materiality and digitally controlled fabrication tools as direct expansions of the architect's digital drawing and workflow. The project sees this expansion as an opportunity to connect...... architectural designs, tectonics and aesthetics. In this Ph.D. project a series of physical, but conceptual, experiments plays the central role in the knowledge production. The experiments result in materialised architectural fragments and tangible experiences. However, these creations also become the driving...

  15. Divide and conquer: enriching environmental sequencing data.

    Directory of Open Access Journals (Sweden)

    Anne Bergeron

    2007-09-01

    In environmental sequencing projects, a mix of DNA from a whole microbial community is fragmented and sequenced, with one of the possible goals being to reconstruct partial or complete genomes of members of the community. In communities with high diversity of species, a significant proportion of the sequences do not overlap any other fragment in the sample. This problem will arise not only in situations with a relatively even distribution of many species, but also when the community in a particular environment is routinely dominated by the same few species. In the former case, no genomes may be assembled at all, while in the latter case a few dominant species in an environment will always be sequenced at high coverage to the detriment of coverage of the greater number of sparse species. Here we show that, with the same global sequencing effort, separating the species into two or more sub-communities prior to sequencing can yield a much higher proportion of sequences that can be assembled. We first use the Lander-Waterman model to show that, if the expected percentage of singleton sequences is higher than 25%, then, under the uniform distribution hypothesis, splitting the community is always a wise choice. We then construct simulated microbial communities to show that the results hold for highly non-uniform distributions. We also show that, for the distributions considered in the experiments, it is possible to estimate quite accurately the relative diversity of the two sub-communities. Given the fact that several methods exist to split microbial communities based on physical properties such as size, density, surface biochemistry, or optical properties, we strongly suggest that groups involved in environmental sequencing, and expecting high diversity, consider splitting their communities in order to maximize the information content of their sequencing effort.
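
    The singleton argument in this abstract can be illustrated with the standard Lander-Waterman result that a read overlaps no other read with probability exp(-2cσ), where c = NL/G is the coverage and σ = 1 - T/L accounts for the minimum detectable overlap T. The following is a minimal sketch under that assumption only; the function names, the community sizes and the choice to spend the whole effort on one sub-community are hypothetical illustrations, and the sketch does not reproduce the paper's derivation of the 25% threshold.

```python
import math


def singleton_fraction(n_reads, read_len, genome_len, min_overlap=0):
    """Expected fraction of reads that overlap no other read (Lander-Waterman)."""
    coverage = n_reads * read_len / genome_len  # c = N * L / G
    sigma = 1.0 - min_overlap / read_len        # sigma = 1 - T / L
    return math.exp(-2.0 * coverage * sigma)


def community_singleton_fraction(n_reads, read_len, genome_lens, min_overlap=0):
    """Uniform community (assumption): reads spread evenly over all genomes,
    so the community behaves like one genome of the summed length."""
    return singleton_fraction(n_reads, read_len, sum(genome_lens), min_overlap)


# Hypothetical numbers: 100 equally abundant species with 2 Mb genomes,
# sequenced with 100,000 reads of 800 bp.
genomes = [2_000_000] * 100
whole = community_singleton_fraction(100_000, 800, genomes)
half = community_singleton_fraction(100_000, 800, genomes[:50])
print(f"whole community:                {whole:.3f} singleton fraction")
print(f"one sub-community, same effort: {half:.3f} singleton fraction")
```

    Under these assumptions, concentrating the same number of reads on fewer genomes raises the coverage c and therefore lowers the expected singleton fraction, which is the effect the abstract relies on.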

  16. De Novo Assembly of Complete Chloroplast Genomes from Non-model Species Based on a K-mer Frequency-Based Selection of Chloroplast Reads from total DNA Sequences.

    NARCIS (Netherlands)

    Izan, Shairul; Esselink, G.; Visser, R.G.F.; Smulders, M.J.M.; Borm, T.J.A.

    2017-01-01

    Whole Genome Shotgun (WGS) sequences of plant species often contain an abundance of reads that are derived from the chloroplast genome. Up to now these reads have generally been identified and assembled into chloroplast genomes based on homology to chloroplasts from related species. This

  17. The catalytic chain of human complement subcomponent C1r. Purification and N-terminal amino acid sequences of the major cyanogen bromide-cleavage fragments.

    Science.gov (United States)

    Arlaud, G J; Gagnon, J; Porter, R R

    1982-01-01

    1. The a- and b-chains of reduced and alkylated human complement subcomponent C1r were separated by high-pressure gel-permeation chromatography and isolated in good yield and in pure form. 2. CNBr cleavage of C1r b-chain yielded eight major peptides, which were purified by gel filtration and high-pressure reversed-phase chromatography. As determined from the sum of their amino acid compositions, these peptides accounted for a minimum molecular weight of 28 000, close to the value 29 100 calculated from the whole b-chain. 3. N-Terminal sequence determinations of C1r b-chain and its CNBr-cleavage peptides allowed the identification of about two-thirds of the amino acids of C1r b-chain. From our results, and on the basis of homology with other serine proteinases, an alignment of the eight CNBr-cleavage peptides from C1r b-chain is proposed. 4. The residues forming the 'charge-relay' system of the active site of serine proteinases (His-57, Asp-102 and Ser-195 in the chymotrypsinogen numbering) are found in the corresponding regions of C1r b-chain, and the amino acid sequence around these residues has been determined. 5. The N-terminal sequence of C1r b-chain has been extended to residue 60 and reveals that C1r b-chain lacks the 'histidine loop', a disulphide bond that is present in all other known serine proteinases.

  18. Cloning and sequencing of wsp encoding gene fragments reveals a diversity of co-infecting Wolbachia strains in Acromyrmex leafcutter ants

    DEFF Research Database (Denmark)

    van Borm, S.; Wenseleers, T.; Billen, J.

    2003-01-01

    Acromyrmex insinuator hosted two additional infections. The multiple Wolbachia strains may influence the expression of reproductive conflicts in leafcutter ants, but the expected turnover of infections may make the cumulative effects on host ant reproduction complex. The additional Wolbachia infections......By sequencing part of the wsp gene of a series of clones, we detected an unusually high diversity of nine Wolbachia strains in queens of three species of leafcutter ants. Up to four strains co-occurred in a single ant. Most strains occurred in two clusters (InvA and InvB), but the social parasite...

  19. Framing Fragmentation

    DEFF Research Database (Denmark)

    Bundgaard, Charlotte

    2009-01-01

    Contemporary industrialized architecture based on advanced information technology and highly technological production processes, implies a radically different approach to architecture than what we have experienced in the past. Works of architecture composed of prefabricated building components......, contain distinctive architectural traits, not only based on rational repetition, but also supporting composition and montage as dynamic concepts. Prefab architecture is an architecture of fragmentation, individualization and changeability, and this sets up new challenges for the architect. This paper...... tries to develop a strategy for the architect dealing with industrially based architecture; a strategy which exploits architectural potentials in industrial building, which recognizes the rules of mass production and which redefines the architect’s position among the agents of building. If recent...

  20. Design strategies for self-assembly of discrete targets

    International Nuclear Information System (INIS)

    Madge, Jim; Miller, Mark A.

    2015-01-01

    Both biological and artificial self-assembly processes can take place by a range of different schemes, from the successive addition of identical building blocks to hierarchical sequences of intermediates, all the way to the fully addressable limit in which each component is unique. In this paper, we introduce an idealized model of cubic particles with patterned faces that allows self-assembly strategies to be compared and tested. We consider a simple octameric target, starting with the minimal requirements for successful self-assembly and comparing the benefits and limitations of more sophisticated hierarchical and addressable schemes. Simulations are performed using a hybrid dynamical Monte Carlo protocol that allows self-assembling clusters to rearrange internally while still providing Stokes-Einstein-like diffusion of aggregates of different sizes. Our simulations explicitly capture the thermodynamic, dynamic, and steric challenges typically faced by self-assembly processes, including competition between multiple partially completed structures. Self-assembly pathways are extracted from the simulation trajectories by a fully extendable scheme for identifying structural fragments, which are then assembled into history diagrams for successfully completed target structures. For the simple target, a one-component assembly scheme is most efficient and robust overall, but hierarchical and addressable strategies can have an advantage under some conditions if high yield is a priority

  1. Phylogeny reconstruction and hybrid analysis of populus (Salicaceae) based on nucleotide sequences of multiple single-copy nuclear genes and plastid fragments.

    Science.gov (United States)

    Wang, Zhaoshan; Du, Shuhui; Dayanandan, Selvadurai; Wang, Dongsheng; Zeng, Yanfei; Zhang, Jianguo

    2014-01-01

    Populus (Salicaceae) is one of the most economically and ecologically important genera of forest trees. The complex reticulate evolution and lack of highly variable orthologous single-copy DNA markers have posed difficulties in resolving the phylogeny of this genus. Based on a large data set of nuclear and plastid DNA sequences, we reconstructed robust phylogeny of Populus using parsimony, maximum likelihood and Bayesian inference methods. The resulting phylogenetic trees showed better resolution at both inter- and intra-sectional level than previous studies. The results revealed that (1) the plastid-based phylogenetic tree resulted in two main clades, suggesting an early divergence of the maternal progenitors of Populus; (2) three advanced sections (Populus, Aigeiros and Tacamahaca) are of hybrid origin; (3) species of the section Tacamahaca could be divided into two major groups based on plastid and nuclear DNA data, suggesting a polyphyletic nature of the section; and (4) many species proved to be of hybrid origin based on the incongruence between plastid and nuclear DNA trees. Reticulate evolution may have played a significant role in the evolution history of Populus by facilitating rapid adaptive radiations into different environments.

  2. Phylogeny reconstruction and hybrid analysis of populus (Salicaceae) based on nucleotide sequences of multiple single-copy nuclear genes and plastid fragments.

    Directory of Open Access Journals (Sweden)

    Zhaoshan Wang

    Populus (Salicaceae) is one of the most economically and ecologically important genera of forest trees. The complex reticulate evolution and lack of highly variable orthologous single-copy DNA markers have posed difficulties in resolving the phylogeny of this genus. Based on a large data set of nuclear and plastid DNA sequences, we reconstructed robust phylogeny of Populus using parsimony, maximum likelihood and Bayesian inference methods. The resulting phylogenetic trees showed better resolution at both inter- and intra-sectional level than previous studies. The results revealed that (1) the plastid-based phylogenetic tree resulted in two main clades, suggesting an early divergence of the maternal progenitors of Populus; (2) three advanced sections (Populus, Aigeiros and Tacamahaca) are of hybrid origin; (3) species of the section Tacamahaca could be divided into two major groups based on plastid and nuclear DNA data, suggesting a polyphyletic nature of the section; and (4) many species proved to be of hybrid origin based on the incongruence between plastid and nuclear DNA trees. Reticulate evolution may have played a significant role in the evolution history of Populus by facilitating rapid adaptive radiations into different environments.

  3. MultiLocus Sequence Analysis- and Amplified Fragment Length Polymorphism-based characterization of xanthomonads associated with bacterial spot of tomato and pepper and their relatedness to Xanthomonas species.

    Science.gov (United States)

    Hamza, A A; Robene-Soustrade, I; Jouen, E; Lefeuvre, P; Chiroleu, F; Fisher-Le Saux, M; Gagnevin, L; Pruvost, O

    2012-05-01

    MultiLocus Sequence Analysis (MLSA) and Amplified Fragment Length Polymorphism (AFLP) were used to measure the genetic relatedness of a comprehensive collection of xanthomonads pathogenic to solanaceous hosts to Xanthomonas species. The MLSA scheme was based on partial sequences of four housekeeping genes (atpD, dnaK, efp and gyrB). Globally, MLSA data unambiguously identified strains causing bacterial spot of tomato and pepper at the species level and were consistent with AFLP data. Genetic distances derived from both techniques showed a close relatedness of (i) X. euvesicatoria, X. perforans and X. alfalfae and (ii) X. gardneri and X. cynarae. Maximum likelihood tree topologies derived from each gene portion and the concatenated data set for species in the X. campestris 16S rRNA core (i.e. the species cluster comprising all strains causing bacterial spot of tomato and pepper) were not congruent, consistent with the detection of several putative recombination events in our data sets by several recombination search algorithms. One recombinant region in atpD was identified in most strains of X. euvesicatoria including the type strain. Copyright © 2012 Elsevier GmbH. All rights reserved.

  4. Simple, Low-Cost Detection of Candida parapsilosis Complex Isolates and Molecular Fingerprinting of Candida orthopsilosis Strains in Kuwait by ITS Region Sequencing and Amplified Fragment Length Polymorphism Analysis.

    Science.gov (United States)

    Asadzadeh, Mohammad; Ahmad, Suhail; Hagen, Ferry; Meis, Jacques F; Al-Sweih, Noura; Khan, Ziauddin

    2015-01-01

    Candida parapsilosis has now emerged as the second or third most important cause of healthcare-associated Candida infections. Molecular studies have shown that phenotypically identified C. parapsilosis isolates represent a complex of three species, namely, C. parapsilosis, C. orthopsilosis and C. metapsilosis. Lodderomyces elongisporus is another species phenotypically closely related to the C. parapsilosis-complex. The aim of this study was to develop a simple, low cost multiplex (m) PCR assay for species-specific identification of C. parapsilosis complex isolates and to study genetic relatedness of C. orthopsilosis isolates in Kuwait. Species-specific amplicons from C. parapsilosis (171 bp), C. orthopsilosis (109 bp), C. metapsilosis (217 bp) and L. elongisporus (258 bp) were obtained in mPCR. Clinical isolates identified as C. parapsilosis (n = 380) by Vitek2 in Kuwait and an international collection of 27 C. parapsilosis complex and L. elongisporus isolates previously characterized by rDNA sequencing were analyzed to evaluate mPCR. Species-specific PCR and DNA sequencing of internal transcribed spacer (ITS) region of rDNA were performed to validate the results of mPCR. Fingerprinting of 19 clinical C. orthopsilosis isolates (including 4 isolates from a previous study) was performed by amplified fragment length polymorphism (AFLP) analysis. Phenotypically identified C. parapsilosis isolates (n = 380) were identified as C. parapsilosis sensu stricto (n = 361), C. orthopsilosis (n = 15), C. metapsilosis (n = 1) and L. elongisporus (n = 3) by mPCR. The mPCR also accurately detected all epidemiologically unrelated C. parapsilosis complex and L. elongisporus isolates. The 19 C. orthopsilosis isolates obtained from 16 patients were divided into 3 haplotypes based on ITS region sequence data. Seven distinct genotypes were identified among the 19 C. orthopsilosis isolates by AFLP including a dominant genotype (AFLP1) comprising 11 isolates recovered from 10 patients. A

  5. Structures of endothiapepsin-fragment complexes from crystallographic fragment screening using a novel, diverse and affordable 96-compound fragment library.

    Science.gov (United States)

    Huschmann, Franziska U; Linnik, Janina; Sparta, Karine; Ühlein, Monika; Wang, Xiaojie; Metz, Alexander; Schiebel, Johannes; Heine, Andreas; Klebe, Gerhard; Weiss, Manfred S; Mueller, Uwe

    2016-05-01

    Crystallographic screening of the binding of small organic compounds (termed fragments) to proteins is increasingly important for medicinal chemistry-oriented drug discovery. To enable such experiments in a widespread manner, an affordable 96-compound library has been assembled for fragment screening in both academia and industry. The library is selected from already existing protein-ligand structures and is characterized by a broad ligand diversity, including buffer ingredients, carbohydrates, nucleotides, amino acids, peptide-like fragments and various drug-like organic compounds. When applied to the model protease endothiapepsin in a crystallographic screening experiment, a hit rate of nearly 10% was obtained. In comparison to other fragment libraries and considering that no pre-screening was performed, this hit rate is remarkably high. This demonstrates the general suitability of the selected compounds for an initial fragment-screening campaign. The library composition, experimental considerations and time requirements for a complete crystallographic fragment-screening campaign are discussed as well as the nine fully refined obtained endothiapepsin-fragment structures. While most of the fragments bind close to the catalytic centre of endothiapepsin in poses that have been observed previously, two fragments address new sites on the protein surface. ITC measurements show that the fragments bind to endothiapepsin with millimolar affinity.

  6. Structures of endothiapepsin–fragment complexes from crystallographic fragment screening using a novel, diverse and affordable 96-compound fragment library

    Science.gov (United States)

    Huschmann, Franziska U.; Linnik, Janina; Sparta, Karine; Ühlein, Monika; Wang, Xiaojie; Metz, Alexander; Schiebel, Johannes; Heine, Andreas; Klebe, Gerhard; Weiss, Manfred S.; Mueller, Uwe

    2016-01-01

    Crystallographic screening of the binding of small organic compounds (termed fragments) to proteins is increasingly important for medicinal chemistry-oriented drug discovery. To enable such experiments in a widespread manner, an affordable 96-compound library has been assembled for fragment screening in both academia and industry. The library is selected from already existing protein–ligand structures and is characterized by a broad ligand diversity, including buffer ingredients, carbohydrates, nucleotides, amino acids, peptide-like fragments and various drug-like organic compounds. When applied to the model protease endothiapepsin in a crystallographic screening experiment, a hit rate of nearly 10% was obtained. In comparison to other fragment libraries and considering that no pre-screening was performed, this hit rate is remarkably high. This demonstrates the general suitability of the selected compounds for an initial fragment-screening campaign. The library composition, experimental considerations and time requirements for a complete crystallographic fragment-screening campaign are discussed as well as the nine fully refined obtained endothiapepsin–fragment structures. While most of the fragments bind close to the catalytic centre of endothiapepsin in poses that have been observed previously, two fragments address new sites on the protein surface. ITC measurements show that the fragments bind to endothiapepsin with millimolar affinity. PMID:27139825

  7. Mapsembler, targeted and micro assembly of large NGS datasets on a desktop computer

    Directory of Open Access Journals (Sweden)

    Peterlongo Pierre

    2012-03-01

    Background: The analysis of next-generation sequencing data from large genomes is a timely research topic. Sequencers are producing billions of short sequence fragments from newly sequenced organisms. Computational methods for reconstructing whole genomes/transcriptomes (de novo assemblers) are typically employed to process such data. However, these methods require large memory resources and computation time. Many basic biological questions could be answered by targeting specific information in the reads, thus avoiding complete assembly. Results: We present Mapsembler, an iterative micro and targeted assembler which processes large datasets of reads on commodity hardware. Mapsembler checks for the presence of given regions of interest that can be constructed from reads and builds a short assembly around each of them, either as a plain sequence or as a graph, showing contextual structure. We introduce new algorithms to retrieve approximate occurrences of a sequence from reads and construct an extension graph. Among other results presented in this paper, Mapsembler made it possible to retrieve previously described human breast cancer candidate fusion genes, and to detect new ones not previously known. Conclusions: Mapsembler is the first software that enables de novo discovery around a region of interest of repeats, SNPs, exon skipping, gene fusion, as well as other structural events, directly from raw sequencing reads. As indexing is localized, the memory footprint of Mapsembler is negligible. Mapsembler is released under the CeCILL license and can be freely downloaded from http://alcovna.genouest.org/mapsembler/.
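
    The idea of building a short assembly around a region of interest, rather than assembling everything, can be sketched as a greedy right-extension of a starter sequence. This is not Mapsembler's published algorithm, which retrieves approximate occurrences and builds an extension graph; the sketch below assumes exact k-mer matches, extends in one direction only, and simply stops where a graph would be needed. All names (`build_successors`, `extend_right`, `starter`) are hypothetical.

```python
from collections import defaultdict


def build_successors(reads, k):
    """Map each k-mer to the set of bases that follow it in at least one read."""
    succ = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k):
            succ[read[i:i + k]].add(read[i + k])
    return succ


def extend_right(starter, reads, k=5, max_len=1000):
    """Extend `starter` to the right while exactly one next base is supported."""
    succ = build_successors(reads, k)
    seq = starter
    seen = {seq[-k:]}
    while len(seq) < max_len:
        candidates = succ.get(seq[-k:], set())
        if len(candidates) != 1:
            break  # dead end (0 candidates) or branch (>1): stop the plain sequence
        seq += next(iter(candidates))
        if seq[-k:] in seen:
            break  # the walk re-entered a k-mer it already visited (repeat/cycle)
        seen.add(seq[-k:])
    return seq


# Toy data: error-free reads of length 8 tiling a short hypothetical region.
region = "ACGTTGCAGGCTAAGC"
reads = [region[i:i + 8] for i in range(len(region) - 7)]
print(extend_right("ACGTT", reads, k=5))
```

    Only the k-mers reachable from the starter are ever followed, which is why a targeted approach of this kind can avoid the work of a full de novo assembly; a real tool would also index only the relevant reads to keep the memory footprint small.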

  8. Molecular diversity of Leuconostoc mesenteroides and Leuconostoc citreum isolated from traditional French cheeses as revealed by RAPD fingerprinting, 16S rDNA sequencing and 16S rDNA fragment amplification.

    Science.gov (United States)

    Cibik, R; Lepage, E; Talliez, P

    2000-06-01

    For a long time, the identification of the Leuconostoc species has been limited by a lack of accurate biochemical and physiological tests. Here, we use a combination of RAPD, 16S rDNA sequencing, and 16S rDNA fragment amplification with specific primers to classify different leuconostocs at the species and strain level. We analysed the molecular diversity of a collection of 221 strains mainly isolated from traditional French cheeses. The majority of the strains were classified as Leuconostoc mesenteroides (83.7%) or Leuconostoc citreum (14%) using molecular techniques. Despite their presence in French cheeses, the role of L. citreum in traditional technologies has not been determined, probably because of the lack of strain identification criteria. Only one strain each of Leuconostoc lactis and Leuconostoc fallax was identified in this collection, and no Weissella paramesenteroides strain was found. However, dextran negative variants of L. mesenteroides, phenotypically misclassified as W. paramesenteroides, were present. The molecular techniques used did not allow us to separate strains of the three L. mesenteroides subspecies (mesenteroides, dextranicum and cremoris). In accordance with previously published results, our findings suggest that these subspecies may be classified as biovars. The correlation found between phenotypes dextranicum and mesenteroides of L. mesenteroides and cheese technology characteristics suggests that certain strains may be better adapted to particular technological environments.

  9. Detection of a Usp-like gene in Calotropis procera plant from the de novo assembled genome contigs of the high-throughput sequencing dataset

    KAUST Repository

    Shokry, Ahmed M.; Al-Karim, Saleh; Ramadan, Ahmed M Ali; Gadallah, Nour; Al-Attas, Sanaa G.; Sabir, Jamal Sabir M; Hassan, Sabah Mohammed; Madkour, Loutfy H.; Bressan, Ray Anthony; Mahfouz, Magdy M.; Bahieldin, Ahmed M.

    2014-01-01

    acids sequence were identified from the NCBI conserved domain database (CDD) that provide insights into sequence structure/function relationships, as well as domain models imported from a number of external source databases (Pfam, SMART, COG, PRK

  10. Impact of genome assembly status on ChIP-Seq and ChIP-PET data mapping

    Directory of Open Access Journals (Sweden)

    Sachs Laurent

    2009-12-01

    Background: ChIP-Seq and ChIP-PET can potentially be used with any genome for genome wide profiling of protein-DNA interaction sites. Unfortunately, it is probable that most genome assemblies will never reach the quality of the human genome assembly. Therefore, it remains to be determined whether ChIP-Seq and ChIP-PET are practicable with genome sequences other than a few (e.g. human and mouse). Findings: Here, we used in silico simulations to assess the impact of completeness or fragmentation of genome assemblies on ChIP-Seq and ChIP-PET data mapping. Conclusions: Most currently published genome assemblies are suitable for mapping the short sequence tags produced by ChIP-Seq or ChIP-PET.

  11. Assembling large, complex environmental metagenomes

    Energy Technology Data Exchange (ETDEWEB)

    Howe, A. C. [Michigan State Univ., East Lansing, MI (United States). Microbiology and Molecular Genetics, Plant Soil and Microbial Sciences; Jansson, J. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Earth Sciences Division; Malfatti, S. A. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Tringe, S. G. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Tiedje, J. M. [Michigan State Univ., East Lansing, MI (United States). Microbiology and Molecular Genetics, Plant Soil and Microbial Sciences; Brown, C. T. [Michigan State Univ., East Lansing, MI (United States). Microbiology and Molecular Genetics, Computer Science and Engineering

    2012-12-28

    The large volumes of sequencing data required to sample complex environments deeply pose new challenges to sequence analysis approaches. De novo metagenomic assembly effectively reduces the total amount of data to be analyzed but requires significant computational resources. We apply two pre-assembly filtering approaches, digital normalization and partitioning, to make large metagenome assemblies more computationally tractable. Using a human gut mock community dataset, we demonstrate that these methods result in assemblies nearly identical to assemblies from unprocessed data. We then assemble two large soil metagenomes from matched Iowa corn and native prairie soils. The predicted functional content and phylogenetic origin of the assembled contigs indicate significant taxonomic differences despite similar function. The assembly strategies presented are generic and can be extended to any metagenome; full source code is freely available under a BSD license.
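
    Digital normalization, as it is commonly described, discards a read when the median abundance of its k-mers among the reads kept so far has already reached a cutoff, so that redundant coverage is removed before assembly. The following is a minimal sketch of that idea and not the authors' released code: it assumes an exact in-memory `Counter` where a memory-bounded counting structure would normally be used, it ignores reverse complements, and the function names and default parameters are hypothetical.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """All overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def digital_normalize(reads, k=20, cutoff=20):
    """Keep a read only if its median k-mer count so far is below `cutoff`."""
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k and cannot be evaluated
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)  # only kept reads contribute to the counts
    return kept


# Toy data: one sequence seen ten times and one seen once.
reads = ["ACGTTGCA"] * 10 + ["GGGGCCCC"]
print(digital_normalize(reads, k=4, cutoff=3))
```

    Because only kept reads update the counts, a highly covered region stops contributing new reads once it reaches the cutoff, while reads from rare genomes continue to pass the filter.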

  12. High efficiency hydrodynamic DNA fragmentation in a bubbling system

    NARCIS (Netherlands)

    Li, Lanhui; Jin, Mingliang; Sun, Chenglong; Wang, Xiaoxue; Xie, Shuting; Zhou, Guofu; Van Den Berg, Albert; Eijkel, Jan C.T.; Shui, Lingling

    2017-01-01

    DNA fragmentation down to a precise fragment size is important for biomedical applications, disease determination, gene therapy and shotgun sequencing. In this work, a cheap, easy to operate and high efficiency DNA fragmentation method is demonstrated based on hydrodynamic shearing in a bubbling

  13. One-dimensional TRFLP-SSCP is an effective DNA fingerprinting strategy for soil Archaea that is able to simultaneously differentiate broad taxonomic clades based on terminal fragment length polymorphisms and closely related sequences based on single stranded conformation polymorphisms.

    Science.gov (United States)

    Swanson, Colby A; Sliwinski, Marek K

    2013-09-01

    DNA fingerprinting methods provide a means to rapidly compare microbial assemblages from environmental samples without the need to first cultivate species in the laboratory. The profiles generated by these techniques are able to identify statistically significant temporal and spatial patterns, correlations to environmental gradients, and biological variability to estimate the number of replicates for clone libraries or next generation sequencing (NGS) surveys. Here we describe an improved DNA fingerprinting technique that combines terminal restriction fragment length polymorphisms (TRFLP) and single stranded conformation polymorphisms (SSCP) so that both can be used to profile a sample simultaneously rather than requiring two sequential steps as in traditional two-dimensional (2-D) gel electrophoresis. For the purpose of profiling Archaeal 16S rRNA genes from soil, the dynamic range of this combined 1-D TRFLP-SSCP approach was superior to TRFLP and SSCP. 1-D TRFLP-SSCP was able to distinguish broad taxonomic clades with genetic distances greater than 10%, such as Euryarchaeota and the Thaumarchaeal clades g_Ca. Nitrososphaera (formerly 1.1b) and o_NRP-J (formerly 1.1c) better than SSCP. In addition, 1-D TRFLP-SSCP was able to simultaneously distinguish closely related clades within a genus such as s_SCA1145 and s_SCA1170 better than TRFLP. We also tested the utility of 1-D TRFLP-SSCP fingerprinting of environmental assemblages by comparing this method to the generation of a 16S rRNA clone library of soil Archaea from a restored Tallgrass prairie. This study shows 1-D TRFLP-SSCP fingerprinting provides a rapid and phylogenetically informative screen of Archaeal 16S rRNA genes in soil samples. © 2013.

  14. The heterothallic sugarbeet pathogen Cercospora beticola contains exon fragments of both MAT genes that are homogenized by concerted evolution.

    Science.gov (United States)

    Bolton, Melvin D; de Jonge, Ronnie; Inderbitzin, Patrik; Liu, Zhaohui; Birla, Keshav; Van de Peer, Yves; Subbarao, Krishna V; Thomma, Bart P H J; Secor, Gary A

    2014-01-01

    Dothideomycetes is one of the most ecologically diverse and economically important classes of fungi. Sexual reproduction in this group is governed by mating type (MAT) genes at the MAT1 locus. Self-sterile (heterothallic) species contain one of two genes at MAT1 (MAT1-1-1 or MAT1-2-1) and only isolates of opposite mating type are sexually compatible. In contrast, self-fertile (homothallic) species contain both MAT genes at MAT1. Knowledge of the reproductive capacities of plant pathogens is of particular interest because recombining populations tend to be more difficult to manage in agricultural settings. In this study, we sequenced MAT1 in the heterothallic Dothideomycete fungus Cercospora beticola to gain insight into the reproductive capabilities of this important plant pathogen. In addition to the expected MAT gene at MAT1, each isolate contained fragments of both MAT1-1-1 and MAT1-2-1 at ostensibly random loci across the genome. When MAT fragments from each locus were manually assembled, they reconstituted MAT1-1-1 and MAT1-2-1 exons with high identity, suggesting a retroposition event occurred in a homothallic ancestor in which both MAT genes were fused. The genome sequences of related taxa revealed that the MAT gene fragment pattern of Cercospora zeae-maydis was analogous to that of C. beticola. In contrast, the genome of the more distantly related Mycosphaerella graminicola did not contain MAT fragments. Although fragments occurred in syntenic regions of the C. beticola and C. zeae-maydis genomes, each MAT fragment was more closely related to the intact MAT gene of the same species. Taken together, these data suggest MAT genes fragmented after divergence of M. graminicola from the remaining taxa, and concerted evolution functioned to homogenize MAT fragments and MAT genes in each species. Published by Elsevier Inc.

  15. Construction of a 3D-shaped, natural product like fragment library by fragmentation and diversification of natural products.

    Science.gov (United States)

    Prescher, Horst; Koch, Guido; Schuhmann, Tim; Ertl, Peter; Bussenault, Alex; Glick, Meir; Dix, Ina; Petersen, Frank; Lizos, Dimitrios E

    2017-02-01

    A fragment library consisting of 3D-shaped, natural product-like fragments was assembled. Library construction was mainly performed by natural product degradation and natural product diversification reactions and was complemented by the identification of 3D-shaped, natural product like fragments available from commercial sources. In addition, during the course of these studies, novel rearrangements were discovered for Massarigenin C and Cytochalasin E. The obtained fragment library has an excellent 3D-shape and natural product likeness, covering a novel, unexplored and underrepresented chemical space in fragment based drug discovery (FBDD). Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. GAAP: Genome-organization-framework-Assisted Assembly Pipeline for prokaryotic genomes.

    Science.gov (United States)

    Yuan, Lina; Yu, Yang; Zhu, Yanmin; Li, Yulai; Li, Changqing; Li, Rujiao; Ma, Qin; Siu, Gilman Kit-Hang; Yu, Jun; Jiang, Taijiao; Xiao, Jingfa; Kang, Yu

    2017-01-25

    Next-generation sequencing (NGS) technologies have greatly promoted the genomic study of prokaryotes. However, highly fragmented assemblies due to short reads from NGS are still a limiting factor in gaining insights into the genome biology. Reference-assisted tools are promising in genome assembly, but tend to result in false assembly when the assigned reference has extensive rearrangements. Herein, we present GAAP, a genome assembly pipeline for scaffolding based on core-gene-defined Genome Organizational Framework (cGOF) described in our previous study. Instead of assigning references, we use the multiple-reference-derived cGOFs as indexes to assist in order and orientation of the scaffolds and build a skeleton structure, and then use read pairs to extend scaffolds, called local scaffolding, and distinguish between true and chimeric adjacencies in the scaffolds. In our performance tests using both empirical and simulated data of 15 genomes in six species with diverse genome size, complexity, and all three categories of cGOFs, GAAP outcompetes or achieves comparable results when compared to three other reference-assisted programs, AlignGraph, Ragout and MeDuSa. GAAP uses both cGOF and pair-end reads to create assemblies in genomic scale, and performs better than the currently available reference-assisted assembly tools as it recovers more assemblies and makes fewer false locations, especially for species with extensive rearranged genomes. Our method is a promising solution for reconstruction of genome sequence from short reads of NGS.
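    The idea of using a conserved core-gene order as an index for ordering and orienting scaffolds can be illustrated with a small sketch. This is not GAAP's actual algorithm or data model: the function name `order_and_orient`, the gene labels, and the rule of sorting by median framework rank are all assumptions made for illustration, and the read-pair extension step is not modeled.

```python
from statistics import median


def order_and_orient(contigs, framework):
    """Place contigs along a conserved core-gene order (hypothetical sketch).

    contigs:   dict mapping contig name -> list of gene labels in contig order
    framework: list of core-gene labels in their conserved genomic order
    Returns a list of (contig_name, strand) tuples in framework order.
    """
    # Position of every core gene within the framework.
    rank = {gene: i for i, gene in enumerate(framework)}

    placed = []
    for name, genes in contigs.items():
        # Keep only genes that act as anchors in the framework.
        ranks = [rank[g] for g in genes if g in rank]
        if not ranks:
            continue  # no anchor genes, so the contig cannot be placed

        # If anchors run backwards along the contig, flip its strand.
        strand = "-" if len(ranks) >= 2 and ranks[0] > ranks[-1] else "+"

        # The median anchor rank gives a position that tolerates outliers.
        placed.append((median(ranks), name, strand))

    placed.sort()
    return [(name, strand) for _, name, strand in placed]


if __name__ == "__main__":
    framework = ["g1", "g2", "g3", "g4", "g5", "g6"]
    contigs = {
        "c1": ["g5", "g4"],        # anchors in reverse order
        "c2": ["g1", "g2", "g3"],  # anchors in forward order
        "c3": ["x"],               # no core genes
    }
    print(order_and_orient(contigs, framework))
```

    In this toy input, `c2` is placed first on the forward strand, `c1` follows on the reverse strand, and `c3` is left unplaced because it carries no framework gene.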

  17. Metagenome-Assembled Genome Sequences of Acetobacterium sp. Strain MES1 and Desulfovibrio sp. Strain MES5 from a Cathode-Associated Acetogenic Microbial Community.

    Science.gov (United States)

    Ross, Daniel E; Marshall, Christopher W; May, Harold D; Norman, R Sean

    2017-09-07

    Draft genome sequences of Acetobacterium sp. strain MES1 and Desulfovibrio sp. strain MES5 were obtained from the metagenome of a cathode-associated community enriched within a microbial electrosynthesis system (MES). The draft genome sequences provide insight into the functional potential of these microorganisms within an MES and a foundation for future comparative analyses. Copyright © 2017 Ross et al.

  18. A gene-based high-resolution comparative radiation hybrid map as a framework for genome sequence assembly of a bovine chromosome 6 region associated with QTL for growth, body composition, and milk performance traits

    Directory of Open Access Journals (Sweden)

    Laurent Pascal

    2006-03-01

    Background: A number of different quantitative trait loci (QTL) for various phenotypic traits, including milk production, functional, and conformation traits in dairy cattle as well as growth and body composition traits in meat cattle, have been mapped consistently in the middle region of bovine chromosome 6 (BTA6). Dense genetic and physical maps and, ultimately, a fully annotated genome sequence as well as their mutual connections are required to efficiently identify genes and gene variants responsible for genetic variation of phenotypic traits. A comprehensive high-resolution gene-rich map linking densely spaced bovine markers and genes to the annotated human genome sequence is required as a framework to facilitate this approach for the region on BTA6 carrying the QTL. Results: Therefore, we constructed a high-resolution radiation hybrid (RH) map for the QTL-containing chromosomal region of BTA6. This new RH map with a total of 234 loci including 115 genes and ESTs displays a substantial increase in loci density compared to existing physical BTA6 maps. Screening the available bovine genome sequence resources, a total of 73 loci could be assigned to sequence contigs, which were already identified as specific for BTA6. For 43 loci, corresponding sequence contigs, which were not yet placed on the bovine genome assembly, were identified. In addition, the improved potential of this high-resolution RH map for BTA6 with respect to comparative mapping was demonstrated. Mapping a large number of genes on BTA6 and cross-referencing them with map locations in corresponding syntenic multi-species chromosome segments (human, mouse, rat, dog, chicken) achieved a refined accurate alignment of conserved segments and evolutionary breakpoints across the species included. Conclusion: The gene-anchored high-resolution RH map (1 locus/300 kb) for the targeted region of BTA6 presented here will provide a valuable platform to guide high-quality assembling and

  19. Universal elements of fragmentation

    International Nuclear Information System (INIS)

    Yanovsky, V. V.; Tur, A. V.; Kuklina, O. V.

    2010-01-01

    A fragmentation theory is proposed that explains the universal asymptotic behavior of the fragment-size distribution in the large-size range, based on simple physical principles. The basic principles of the theory are the total mass conservation in a fragmentation process and a balance condition for the energy expended in increasing the surface of fragments during their breakup. A flux-based approach is used that makes it possible to supplement the basic principles and develop a minimal theory of fragmentation. Such a supplementary principle is that of decreasing fragment-volume flux with increasing energy expended in fragmentation. It is shown that the behavior of the decreasing flux is directly related to the form of a power-law fragment-size distribution. The minimal theory is used to find universal asymptotic fragment-size distributions and to develop a natural physical classification of fragmentation models. A more general, nonlinear theory of strong fragmentation is also developed. It is demonstrated that solutions to a nonlinear kinetic equation consistent with both basic principles approach a universal asymptotic size distribution. Agreement between the predicted asymptotic fragment-size distributions and experimental observations is discussed.
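    The two ingredients named in the abstract, total mass conservation and a power-law tail in the large-size range, can be written in generic notation. The symbols below are assumptions for illustration and are not taken from the paper: $n(m)$ is the number density of fragments of mass $m$, $M$ the total mass, and $\alpha$ an unspecified exponent.

```latex
% Total mass is conserved during fragmentation:
\int_{0}^{\infty} m \, n(m) \, \mathrm{d}m = M = \text{const}

% Power-law behaviour of the distribution in the large-size range:
n(m) \sim m^{-\alpha}, \qquad m \to \infty

% For a pure power-law tail with no cutoff, the mass integral
% converges at its upper limit only if
\int^{\infty} m^{1-\alpha} \, \mathrm{d}m < \infty
\quad \Longleftrightarrow \quad \alpha > 2
```

    The last line is only a consistency condition on an idealized tail; the exponents actually derived in the paper are not given in the abstract.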

  20. Whole genome sequencing and assembly of Eukaryotic microbes isolated from ISS environmental surface Kirovograd region soil Chernobyl Nuclear Power Plant and Chernobyl Exclusion Zone

    Data.gov (United States)

    National Aeronautics and Space Administration — The whole-genome sequences of eight fungal strains that were selected for exposure to microgravity at the International Space Station are presented here. These...

  1. Re-annotation of the physical map of Glycine max for polyploid-like regions by BAC end sequence driven whole genome shotgun read assembly

    Directory of Open Access Journals (Sweden)

    Shultz Jeffry

    2008-07-01

    Background: Many of the world's most important food crops have either polyploid genomes or homeologous regions derived from segmental shuffling following polyploid formation. The soybean (Glycine max) genome has been shown to be composed of approximately four thousand short interspersed homeologous regions with 1, 2 or 4 copies per haploid genome by RFLP analysis, microsatellite anchors to BACs and by contigs formed from BAC fingerprints. Despite these similar regions, the genome has been sequenced by whole genome shotgun sequence (WGS). Here the aim was to use BAC end sequences (BES) derived from three minimum tile paths (MTP) to examine the extent and homogeneity of polyploid-like regions within contigs and the extent of correlation between the polyploid-like regions inferred from fingerprinting and the polyploid-like sequences inferred from WGS matches. Results: Results show that when sequence divergence was 1–10%, the copy number of homeologous regions could be identified from sequence variation in WGS reads overlapping BES. Homeolog sequence variants (HSVs) were single nucleotide polymorphisms (SNPs; 89%) and single nucleotide indels (SNIs; 10%). Larger indels were rare but present (1%). Simulations that had predicted fingerprints of homeologous regions could be separated when divergence exceeded 2% were shown to be false. We show that a 5–10% sequence divergence is necessary to separate homeologs by fingerprinting. BES compared to WGS traces showed polyploid-like regions with less than 1% sequence divergence exist at 2.3% of the locations assayed. Conclusion: The use of HSVs like SNPs and SNIs to characterize BACs will improve contig building methods. The implications for bioinformatic and functional annotation of polyploid and paleopolyploid genomes show that a combined approach of BAC fingerprint based physical maps, WGS sequence and HSV-based partitioning of BAC clones from homeologous regions to separate contigs will allow reliable de

  2. Assembly of the PLT device

    International Nuclear Information System (INIS)

    Marino, R.

    1975-11-01

    The assembly of the PLT device began in June 1974 with a preassembly of the mechanical structure at a remote site. The preassembly sequence incorporated final fabrication procedures with an initial staging operation. This successful staging/fabrication procedure proved to be an invaluable asset when the final assembly was started in August 1974. The assembly continued with the initial reassembly of the previously tested structural components at the final machine site. Construction was interrupted at several points to allow for toroidal field coil, vacuum vessel, and poloidal coil installation. Two phases of toroidal field coil power tests were included in the assembly sequence prior to, and just after the vacuum vessel insertion

  3. Next-generation transcriptome assembly

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey A.; Wang, Zhong

    2011-09-01

    Transcriptomics studies often rely on partial reference transcriptomes that fail to capture the full catalog of transcripts and their variations. Recent advances in sequencing technologies and assembly algorithms have facilitated the reconstruction of the entire transcriptome by deep RNA sequencing (RNA-seq), even without a reference genome. However, transcriptome assembly from billions of RNA-seq reads, which are often very short, poses a significant informatics challenge. This Review summarizes the recent developments in transcriptome assembly approaches - reference-based, de novo and combined strategies-along with some perspectives on transcriptome assembly in the near future.

  4. THE REST-FRAME OPTICAL LUMINOSITY FUNCTION OF CLUSTER GALAXIES AT z < 0.8 AND THE ASSEMBLY OF THE CLUSTER RED SEQUENCE

    International Nuclear Information System (INIS)

    Rudnick, Gregory; Von der Linden, Anja; De Lucia, Gabriella; White, Simon; Pello, Roser; Aragon-Salamanca, Alfonso; Marchesini, Danilo; Clowe, Douglas; Halliday, Claire; Jablonka, Pascale; Milvang-Jensen, Bo; Poggianti, Bianca; Saglia, Roberto; Simard, Luc; Zaritsky, Dennis

    2009-01-01

    We present the rest-frame optical luminosity function (LF) of red-sequence galaxies in 16 clusters at 0.4 < z < 0.8 drawn from the ESO Distant Cluster Survey (EDisCS). We compare our clusters to an analogous sample from the Sloan Digital Sky Survey (SDSS) and match the EDisCS clusters to their most likely descendants. We measure all LFs down to M ∼ M* + (2.5-3.5). At z < 0.8, the bright end of the LF is consistent with passive evolution but there is a significant buildup of the faint end of the red sequence toward lower redshift. There is a weak dependence of the LF on cluster velocity dispersion for EDisCS but no such dependence for the SDSS clusters. We find tentative evidence that red-sequence galaxies brighter than a threshold magnitude are already in place, and that this threshold evolves to fainter magnitudes toward lower redshifts. We compare the EDisCS LFs with the LF of coeval red-sequence galaxies in the field and find that the bright end of the LFs agree. However, relative to the number of bright red galaxies, the field has more faint red galaxies than clusters at 0.6 < z < 0.8 but fewer at 0.4 < z < 0.6, implying differential evolution. We compare the total light in the EDisCS cluster red sequences to the total red-sequence light in our SDSS cluster sample. Clusters at 0.4 < z < 0.8 must increase their luminosity on the red sequence (and therefore stellar mass in red galaxies) by a factor of 1-3 by z = 0. The necessary processes that add mass to the red sequence in clusters predict local clusters that are overluminous as compared to those observed in the SDSS. The predicted cluster luminosities can be reconciled with observed local cluster luminosities by combining multiple previously known effects.

  5. Single-molecule sequencing and Hi-C-based proximity-guided assembly of amaranth (Amaranthus hypochondriacus) chromosomes provide insights into genome evolution

    KAUST Repository

    Lightfoot, D. J.

    2017-08-29

    Background: Amaranth (Amaranthus hypochondriacus) was a food staple among the ancient civilizations of Central and South America that has recently received increased attention due to the high nutritional value of the seeds, with the potential to help alleviate malnutrition and food security concerns, particularly in arid and semiarid regions of the developing world. Here, we present a reference-quality assembly of the amaranth genome which will assist the agronomic development of the species.

  6. De Novo Transcriptome Sequence Assembly and Identification of AP2/ERF Transcription Factor Related to Abiotic Stress in Parsley (Petroselinum crispum)

    OpenAIRE

    Li, Meng-Yao; Tan, Hua-Wei; Wang, Feng; Jiang, Qian; Xu, Zhi-Sheng; Tian, Chang; Xiong, Ai-Sheng

    2014-01-01

    Parsley is an important biennial Apiaceae species that is widely cultivated as herb, spice, and vegetable. Previous studies on parsley principally focused on its physiological and biochemical properties, including phenolic compound and volatile oil contents. However, little is known about the molecular and genetic properties of parsley. In this study, 23,686,707 high-quality reads were obtained and assembled into 81,852 transcripts and 50,161 unigenes for the first time. Functional annotation...

  7. Using nanopore sequencing to get complete genomes from complex samples

    DEFF Research Database (Denmark)

    Kirkegaard, Rasmus Hansen; Karst, Søren Michael; Nielsen, Per Halkjær

    The advantages of “next generation sequencing” have come at the cost of genome finishing. The dominant sequencing technology provides short reads of 150-300 bp, which has made genome assembly very difficult as the reads do not span important repeat regions. Genomes have thus been added...... to the databases as fragmented assemblies and not as finished contigs that resemble the chromosomes in which the DNA is organised within the cells. This is especially troublesome for genomes derived from complex metagenome sequencing. Databases with incomplete genomes can lead to false conclusions about...... the absence of genes and functional predictions of the organisms. Furthermore, it is common that repetitive elements and marker genes such as the 16S rRNA gene are missing completely from these genome bins. Using nanopore long reads, we demonstrate that it is possible to span these regions and make complete...

  8. Getting complete genomes from complex samples using nanopore sequencing

    DEFF Research Database (Denmark)

    Kirkegaard, Rasmus Hansen; Karst, Søren Michael; Albertsen, Mads

    Short read sequencing and metagenomic binning workflows have made it possible to extract bacterial genome bins from environmental microbial samples containing hundreds to thousands of different species. However, these genome bins often do not represent complete genomes, as they are mostly...... fragmented, incomplete and often contaminated with foreign DNA and with no robust strategies to validate the quality. These `draft genomes` have limited lasting value to the scientific community, as gene synteny is broken and it is uncertain what is missing. The genetic material most often...... missed is important multi-copy and/or conserved marker genes such as the 16S rRNA gene, as sequence micro-heterogeneity prevents assembly of these genes in the de novo assembly. We demonstrate that using nanopore long reads it is now possible to overcome these issues and make complete genomes from...

  9. Biomedical Applications of Self-Assembling Peptides

    NARCIS (Netherlands)

    Radmalekshahi, Mazda; Lempsink, Ludwijn; Amidi, Maryam; Hennink, Wim E.; Mastrobattista, Enrico

    2016-01-01

    Self-assembling peptides have gained increasing attention as versatile molecules to generate diverse supramolecular structures with tunable functionality. Because of the possibility to integrate a wide range of functional domains into self-assembling peptides including cell attachment sequences,

  10. Universality of fragment shapes.

    Science.gov (United States)

    Domokos, Gábor; Kun, Ferenc; Sipos, András Árpád; Szabó, Tímea

    2015-03-16

    The shape of fragments generated by the breakup of solids is central to a wide variety of problems ranging from the geomorphic evolution of boulders to the accumulation of space debris orbiting Earth. Although the statistics of the mass of fragments has been found to show a universal scaling behavior, the comprehensive characterization of fragment shapes has remained a fundamental challenge. We performed a thorough experimental study of the problem, fragmenting various types of materials by slowly proceeding weathering and by rapid breakup due to explosion and hammering. We demonstrate that the shape of fragments obeys an astonishing universality, having the same generic evolution with the fragment size irrespective of material details and loading conditions. There exists a cutoff size below which fragments have an isotropic shape; however, as the size increases, an exponential convergence is obtained to a unique elongated form. We show that a discrete stochastic model of fragmentation reproduces both the size and shape of fragments by tuning only a single parameter, which strengthens the general validity of the scaling laws. The dependence of the probability of the crack plane orientation on the linear extension of fragments proved to be essential for the shape selection mechanism.

  11. Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jacob

    2007-01-01

    public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. RESULTS: Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which...... with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression...

  12. Exploring the tertiary gene pool of bread wheat: sequence assembly and analysis of chromosome 5M(g) of Aegilops geniculata

    Czech Academy of Sciences Publication Activity Database

    Tiwari, V.K.; Wang, S.C.; Danilova, T.; Koo, D.H.; Vrána, Jan; Kubaláková, Marie; Hřibová, Eva; Rawat, N.; Kalia, B.; Singh, N.; Friebe, B.; Doležel, Jaroslav; Akhunov, E.; Poland, J.; Sabir, J.S.M.; Gill, B.S.

    2015-01-01

    Vol. 84, No. 4 (2015), pp. 733-746. ISSN 0960-7412. R&D Projects: GA MŠk(CZ) LO1204; GA ČR GBP501/12/G090. Institutional support: RVO:61389030. Keywords: flow sorting; SNPs; next generation sequencing. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 5.468, year: 2015

  13. Investigation of proposed process sequence for the array automated assembly task. Phase I and II. Final report, October 1, 1977-June 30, 1980

    Energy Technology Data Exchange (ETDEWEB)

    Mardesich, N.; Garcia, A.; Eskenas, K.

    1980-08-01

    A selected process sequence for the low cost fabrication of photovoltaic modules was defined during this contract. Each part of the process sequence was looked at regarding its contribution to the overall dollars per watt cost. During the course of the research done, some of the initially included processes were dropped due to technological deficiencies. The printed dielectric diffusion mask, codiffusion of the n+ and p+ regions, wraparound front contacts and retention of the diffusion oxide for use as an AR coating were all the processes that were removed for this reason. Other process steps were retained to achieve the desired overall cost and efficiency. Square wafers, a polymeric spin-on PX-10 diffusion source, a p+ back surface field and silver front contacts are all processes that have been recommended for use in this program. The printed silver solderable pad for making contact to the aluminum back was replaced by an ultrasonically applied tin-zinc pad. Also, the texturized front surface was dropped as inappropriate for the sheet silicon likely to be available in 1986. Progress has also been made on the process sequence for module fabrication. A shift from bonding with a conformal coating to laminating with ethylene vinyl acetate and a glass superstrate is recommended for further module fabrication. The finalized process sequence is described.

  14. Anomalous nuclear fragments

    International Nuclear Information System (INIS)

    Karmanov, V.A.

    1983-01-01

    Experimental data are presented, the status of the anomalon problem is discussed, and theoretical approaches to it are outlined. Anomalons are exotic objects formed in the fragmentation of target nuclei by beam nuclei at energies of several GeV/nucleon. These nuclear fragments have an anomalously large interaction cross section and, correspondingly, a short free path, considerably shorter than that of the primary nuclei. The experimental data were obtained at accelerators by irradiating nuclear emulsions with ¹⁶O, ⁵⁶Fe and ⁴⁰Ar beams, and propane with ¹²C beams. The data indicate that the fragment free path depends on the distance L from the point of fragment formation. The decrease in the fragment free path is established more reliably than its dependence on L. The question of whether anomalons exist cannot yet be considered resolved. The theoretical models proposed to explain the anomalously large interaction cross sections of nuclear fragments are varied and rather speculative.

  15. Completion of autobuilt protein models using a database of protein fragments

    International Nuclear Information System (INIS)

    Cowtan, Kevin

    2012-01-01

    Two developments in the process of automated protein model building in the Buccaneer software are described: the use of a database of protein fragments in improving the model completeness and the assembly of disconnected chain fragments into complete molecules. Two developments in the process of automated protein model building in the Buccaneer software are presented. A general-purpose library for protein fragments of arbitrary size is described, with a highly optimized search method allowing the use of a larger database than in previous work. The problem of assembling an autobuilt model into complete chains is discussed. This involves the assembly of disconnected chain fragments into complete molecules and the use of the database of protein fragments in improving the model completeness. Assembly of fragments into molecules is a standard step in existing model-building software, but the methods have not received detailed discussion in the literature

  16. Lageos assembly operation plan

    Science.gov (United States)

    Brueger, J.

    1975-01-01

    Guidelines and constraints procedures for LAGEOS assembly, operation, and design performance are given. Special attention was given to thermal, optical, and dynamic analysis and testing. The operation procedures illustrate the interrelation and sequence of tasks in a flow diagram. The diagram also includes quality assurance functions for verification of operation tasks.

  17. Deep sequencing as a method of typing bluetongue virus isolates.

    Science.gov (United States)

    Rao, Pavuluri Panduranga; Reddy, Yella Narasimha; Ganesh, Kapila; Nair, Shreeja G; Niranjan, Vidya; Hegde, Nagendra R

    2013-11-01

    Bluetongue (BT) is an economically important endemic disease of livestock in tropics and subtropics. In addition, its recent spread to temperate regions like North America and Northern Europe is of serious concern. Rapid serotyping and characterization of BT virus (BTV) is an essential step in the identification of origin of the virus and for controlling the disease. Serotyping of BTV is typically performed by serum neutralization, and of late by nucleotide sequencing. This report describes the near complete genome sequencing and typing of two isolates of BTV using Illumina next generation sequencing platform. Two of the BTV RNAs were multiplexed with ten other unknown samples. Viral RNA was isolated and fragmented, reverse transcribed, the cDNA ends were repaired and ligated with a multiplex oligo. The genome library was amplified using primers complementary to the ligated oligo and subjected to single and paired end sequencing. The raw reads were assembled using a de novo method and reference-based assembly was performed based on the contig data. Near complete sequences of all segments of BTV were obtained with more than 20× coverage, and single read sequencing method was sufficient to identify the genotype and serotype of the virus. The two viruses used in this study were typed as BTV-1 and BTV-9E. Copyright © 2013 Elsevier B.V. All rights reserved.
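    The "more than 20× coverage" figure can be related to read counts through the usual definition of mean depth: total sequenced bases divided by the length of the target. The sketch below is illustrative only. The read count and read length are hypothetical (the abstract does not report them), and the genome length of about 19,200 bp for the ten BTV segments combined is an approximate value assumed here.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_length: int) -> float:
    """Mean depth of coverage = total sequenced bases / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return num_reads * read_length / genome_length


def reads_needed(target_coverage: float, read_length: int, genome_length: int) -> int:
    """Smallest number of reads that reaches the target mean coverage."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(target_coverage * genome_length / read_length)


if __name__ == "__main__":
    GENOME_BP = 19_200  # assumed approximate total length of the BTV segments
    READ_BP = 100       # hypothetical read length

    print(f"{mean_coverage(4_000, READ_BP, GENOME_BP):.1f}x")
    print(reads_needed(20, READ_BP, GENOME_BP), "reads for 20x")
```

    Mean depth says nothing about evenness: individual segments or regions can fall well below the average, which is why per-segment coverage is usually checked as well.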

  18. Combining independent de novo assemblies optimizes the coding transcriptome for nonconventional model eukaryotic organisms.

    Science.gov (United States)

    Cerveau, Nicolas; Jackson, Daniel J

    2016-12-09

    Next-generation sequencing (NGS) technologies are arguably the most revolutionary technical development to join the list of tools available to molecular biologists since PCR. For researchers working with nonconventional model organisms one major problem with the currently dominant NGS platform (Illumina) stems from the obligatory fragmentation of nucleic acid material that occurs prior to sequencing during library preparation. This step creates a significant bioinformatic challenge for accurate de novo assembly of novel transcriptome data. This challenge becomes apparent when a variety of modern assembly tools (of which there is no shortage) are applied to the same raw NGS dataset. With the same assembly parameters these tools can generate markedly different assembly outputs. In this study we present an approach that generates an optimized consensus de novo assembly of eukaryotic coding transcriptomes. This approach does not represent a new assembler, rather it combines the outputs of a variety of established assembly packages, and removes redundancy via a series of clustering steps. We test and validate our approach using Illumina datasets from six phylogenetically diverse eukaryotes (three metazoans, two plants and a yeast) and two simulated datasets derived from metazoan reference genome annotations. All of these datasets were assembled using three currently popular assembly packages (CLC, Trinity and IDBA-tran). In addition, we experimentally demonstrate that transcripts unique to one particular assembly package are likely to be bioinformatic artefacts. For all eight datasets our pipeline generates more concise transcriptomes that in fact possess more unique annotatable protein domains than any of the three individual assemblers we employed. Another measure of assembly completeness (using the purpose built BUSCO databases) also confirmed that our approach yields more information. Our approach yields coding transcriptome assemblies that are more likely to be

  19. Fission fragment angular momentum

    International Nuclear Information System (INIS)

    Frenne, D. De

    1991-01-01

    Most of the energy released in fission is converted into translational kinetic energy of the fragments. The remaining excitation energy will be distributed among neutrons and gammas. An important parameter characterizing the scission configuration is the primary angular momentum of the nascent fragments. Neutron emission is not expected to decrease the spin of the fragments by more than one unit of angular momentum and is as such of less importance in the determination of the initial fragment spins. Gamma emission is a suitable tool in studying initial fragment spins because the emission time, number, energy, and multipolarity of the gammas strongly depend on the value of the primary angular momentum. The main conclusions of experiments on gamma emission were that the initial angular momentum of the fragments is large compared to the ground state spin and oriented perpendicular to the fission axis. Most of the recent information concerning initial fragment spin distributions comes from the measurement of isomeric ratios for isomeric pairs produced in fission. Although in nearly every mass chain isomers are known, only a small number are suitable for initial fission fragment spin studies. Yield and half-life considerations strongly limit the number of candidates. This has the advantage that the behavior of a specific isomeric pair can be investigated for a number of fissioning systems at different excitation energies of the fragments and fissioning nuclei. Because most of the recent information on primary angular momenta comes from measurements of isomeric ratios, the global deexcitation process of the fragments and the calculation of the initial fragment spin distribution from measured isomeric ratios are discussed here. The most important results on primary angular momentum determinations are reviewed and some theoretical approaches are given. 45 refs., 7 figs., 2 tabs

  20. Resolving the Complexity of Human Skin Metagenomes Using Single-Molecule Sequencing

    Directory of Open Access Journals (Sweden)

    Yu-Chih Tsai

    2016-02-01

    Deep metagenomic shotgun sequencing has emerged as a powerful tool to interrogate composition and function of complex microbial communities. Computational approaches to assemble genome fragments have been demonstrated to be an effective tool for de novo reconstruction of genomes from these communities. However, the resultant “genomes” are typically fragmented and incomplete due to the limited ability of short-read sequence data to assemble complex or low-coverage regions. Here, we use single-molecule, real-time (SMRT) sequencing to reconstruct a high-quality, closed genome of a previously uncharacterized Corynebacterium simulans and its companion bacteriophage from a skin metagenomic sample. Considerable improvement in assembly quality occurs in hybrid approaches incorporating short-read data, with even relatively small amounts of long-read data being sufficient to improve metagenome reconstruction. Using short-read data to evaluate strain variation of this C. simulans in its skin community at single-nucleotide resolution, we observed a dominant C. simulans strain with moderate allelic heterozygosity throughout the population. We demonstrate the utility of SMRT sequencing and hybrid approaches in metagenome quantitation, reconstruction, and annotation.

  1. Resolving the Complexity of Human Skin Metagenomes Using Single-Molecule Sequencing

    Science.gov (United States)

    Tsai, Yu-Chih; Deming, Clayton; Segre, Julia A.; Kong, Heidi H.; Korlach, Jonas

    2016-01-01

    Deep metagenomic shotgun sequencing has emerged as a powerful tool to interrogate composition and function of complex microbial communities. Computational approaches to assemble genome fragments have been demonstrated to be an effective tool for de novo reconstruction of genomes from these communities. However, the resultant “genomes” are typically fragmented and incomplete due to the limited ability of short-read sequence data to assemble complex or low-coverage regions. Here, we use single-molecule, real-time (SMRT) sequencing to reconstruct a high-quality, closed genome of a previously uncharacterized Corynebacterium simulans and its companion bacteriophage from a skin metagenomic sample. Considerable improvement in assembly quality occurs in hybrid approaches incorporating short-read data, with even relatively small amounts of long-read data being sufficient to improve metagenome reconstruction. Using short-read data to evaluate strain variation of this C. simulans in its skin community at single-nucleotide resolution, we observed a dominant C. simulans strain with moderate allelic heterozygosity throughout the population. We demonstrate the utility of SMRT sequencing and hybrid approaches in metagenome quantitation, reconstruction, and annotation. PMID:26861018

  2. Fuel assemblies

    International Nuclear Information System (INIS)

    Nakatsuka, Masafumi.

    1979-01-01

    Purpose: To prevent scattering of gaseous fission products released from fuel assemblies stored in an FBR type reactor. Constitution: A cap provided with means capable of storing gas is adapted to mount to the assembly handling head, for example by way of threading, in a storage rack of spent fuel assemblies consisting of a bottom plate, a top plate and an assembly support mechanism. By previously eliminating the gas inside of the assembly and the cap in the storage rack, gaseous fission products upon loading, if released from fuel rods during storage, are stored in the cap and do not scatter in the storage rack. (Horiuchi, T.)

  3. Detection of a putative novel adenovirus by PCR amplification, sequencing and phylogenetic characterisation of two gene fragments from formalin-fixed paraffin-embedded tissues of a cat diagnosed with disseminated adenovirus disease.

    Science.gov (United States)

    Lakatos, Béla; Hornyák, Ákos; Demeter, Zoltán; Forgách, Petra; Kennedy, Frances; Rusvai, Miklós

    2017-12-01

    Adenoviral nucleic acid was detected by polymerase chain reaction (PCR) in formalin-fixed paraffin-embedded tissue samples of a cat that had suffered from disseminated adenovirus infection. The identity of the amplified products from the hexon and DNA-dependent DNA polymerase genes was confirmed by DNA sequencing. The sequences were clearly distinguishable from corresponding hexon and polymerase sequences of other mastadenoviruses, including human adenoviruses. These results suggest the possible existence of a distinct feline adenovirus.

  4. Technology for assembling and welding of top and bottom nozzles in fuel assembly

    International Nuclear Information System (INIS)

    Xia Chenglie; Wan Longfu

    1989-10-01

    The construction characteristics, the technology and sequence of assembling and welding, the assembling jig used to prevent deformation, and the acceptance test of the welding technology for top and bottom nozzles are presented.

  5. A base composition analysis of natural patterns for the preprocessing of metagenome sequences.

    Science.gov (United States)

    Bonham-Carter, Oliver; Ali, Hesham; Bastola, Dhundy

    2013-01-01

    On the pretext that sequence reads and contigs often exhibit the same kinds of base usage that is also observed in the sequences from which they are derived, we offer a base composition analysis tool. Our tool uses these natural patterns to determine relatedness across sequence data. We introduce spectrum sets (sets of motifs) which are permutations of bacterial restriction sites and the base composition analysis framework to measure their proportional content in sequence data. We suggest that this framework will increase the efficiency during the pre-processing stages of metagenome sequencing and assembly projects. Our method is able to differentiate organisms and their reads or contigs. The framework shows how to successfully determine the relatedness between these reads or contigs by comparison of base composition. In particular, we show that two types of organismal-sequence data are fundamentally different by analyzing their spectrum set motif proportions (coverage). By the application of one of the four possible spectrum sets, encompassing all known restriction sites, we provide the evidence to claim that each set has a different ability to differentiate sequence data. Furthermore, we show that the spectrum set selection having relevance to one organism, but not to the others of the data set, will greatly improve performance of sequence differentiation even if the fragment size of the read, contig or sequence is not lengthy. We show the proof of concept of our method by its application to ten trials of two or three freshly selected sequence fragments (reads and contigs) for each experiment across the six organisms of our set. Here we describe a novel and computationally effective pre-processing step for metagenome sequencing and assembly tasks. Furthermore, our base composition method has applications in phylogeny where it can be used to infer evolutionary distances between organisms based on the notion that related organisms often have much conserved code.
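
    The idea of measuring the proportional content (coverage) of a motif set in a read or contig can be illustrated with a minimal sketch. The abstract does not define its spectrum sets or its coverage measure precisely, so the function names, the coverage definition (fraction of positions lying inside at least one motif occurrence) and the distance measure below are assumptions for illustration only, not the authors' tool. The example motifs GAATTC and GGATCC are the EcoRI and BamHI recognition sites.

```python
def motif_proportions(sequence, motifs):
    """For each motif, return the fraction of sequence positions that lie
    inside at least one occurrence of that motif (assumed 'coverage')."""
    seq = sequence.upper()
    proportions = {}
    for motif in motifs:
        m = motif.upper()
        covered = [False] * len(seq)
        start = seq.find(m)
        while start != -1:
            for i in range(start, start + len(m)):
                covered[i] = True
            # Step by one base so overlapping occurrences are also found.
            start = seq.find(m, start + 1)
        proportions[motif] = sum(covered) / len(seq) if seq else 0.0
    return proportions


def composition_distance(seq_a, seq_b, motifs):
    """Hypothetical relatedness score: sum of absolute differences between
    the motif proportions of two sequences (0.0 means identical profiles)."""
    pa = motif_proportions(seq_a, motifs)
    pb = motif_proportions(seq_b, motifs)
    return sum(abs(pa[m] - pb[m]) for m in motifs)


if __name__ == "__main__":
    spectrum_set = ["GAATTC", "GGATCC"]  # illustrative motif set only
    read_1 = "GAATTCAAAA"
    read_2 = "TTTTGGATCC"
    print(motif_proportions(read_1, spectrum_set))
    print(composition_distance(read_1, read_2, spectrum_set))
```

    Under these assumptions, two fragments from the same organism would be expected to give a small distance and fragments from different organisms a larger one; how well that holds depends on the motif set chosen, which is the point the abstract makes about spectrum set selection.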

  6. Why barcode? High-throughput multiplex sequencing of mitochondrial genomes for molecular systematics.

    Science.gov (United States)

    Timmermans, M J T N; Dodsworth, S; Culverwell, C L; Bocak, L; Ahrens, D; Littlewood, D T J; Pons, J; Vogler, A P

    2010-11-01

    Mitochondrial genome sequences are important markers for phylogenetics but taxon sampling remains sporadic because of the great effort and cost required to acquire full-length sequences. Here, we demonstrate a simple, cost-effective way to sequence the full complement of protein coding mitochondrial genes from pooled samples using the 454/Roche platform. Multiplexing was achieved without the need for expensive indexing tags ('barcodes'). The method was trialled with a set of long-range polymerase chain reaction (PCR) fragments from 30 species of Coleoptera (beetles) sequenced in a 1/16th sector of a sequencing plate. Long contigs were produced from the pooled sequences with sequencing depths ranging from ∼10 to 100× per contig. Species identity of individual contigs was established via three 'bait' sequences matching disparate parts of the mitochondrial genome obtained by conventional PCR and Sanger sequencing. This proved that assembly of contigs from the sequencing pool was correct. Our study produced sequences for 21 nearly complete and seven partial sets of protein coding mitochondrial genes. Combined with existing sequences for 25 taxa, an improved estimate of basal relationships in Coleoptera was obtained. The procedure could be employed routinely for mitochondrial genome sequencing at the species level, to provide improved species 'barcodes' that currently use the cox1 gene only.

  7. Human Assisted Assembly Processes

    Energy Technology Data Exchange (ETDEWEB)

    CALTON,TERRI L.; PETERS,RALPH R.

    2000-01-01

    Automatic assembly sequencing and visualization tools are valuable in determining the best assembly sequences, but without Human Factors and Figure Models (HFFMs) it is difficult to evaluate or visualize human interaction. In industry, accelerating technological advances and shorter market windows have forced companies to turn to an agile manufacturing paradigm. This trend has promoted computerized automation of product design and manufacturing processes, such as automated assembly planning. However, all automated assembly planning software tools assume that the individual components fly into their assembled configuration, and so generate what appear to be perfectly valid operations that in reality cannot physically be carried out by a human. Similarly, human figure modeling algorithms may indicate that assembly operations are not feasible and consequently force design modifications; however, if they had the capability to quickly generate alternative assembly sequences, they might have identified a feasible solution. To solve this problem, HFFMs must be integrated with automated assembly planning to allow engineers to verify that assembly operations are possible and to see ways to make the designs even better. Factories will very likely put humans and robots together in cooperative environments to meet the demands for customized products, for purposes including robotic and automated assembly. For robots to work harmoniously within an integrated environment with humans, the robots must have cooperative operational skills. For example, in a human-only environment, humans may tolerate collisions with one another if they do not cause much pain. This level of tolerance may or may not apply to robot-human environments. Humans expect that robots will be able to operate and navigate in their environments without collisions or interference. The ability to accomplish this is linked to the sensing capabilities available. Current work in the field of cooperative

  8. A versatile system for USER cloning-based assembly of expression vectors for mammalian cell engineering.

    Directory of Open Access Journals (Sweden)

    Anne Mathilde Lund

    A new versatile mammalian vector system for protein production, cell biology analyses, and cell factory engineering was developed. The vector system applies the ligation-free, uracil-excision-based technique USER cloning to rapidly construct mammalian expression vectors from multiple DNA fragments with maximum flexibility, both for choice of vector backbone and cargo. The vector system includes a set of basic vectors and a toolbox containing a multitude of DNA building blocks including promoters, terminators, selectable marker and reporter genes, and sequences encoding an internal ribosome entry site, cellular localization signals, and epitope and purification tags. Building blocks in the toolbox can be easily combined as they contain defined and tested Flexible Assembly Sequence Tags (FASTs). USER cloning with FASTs allows rapid swaps of gene, promoter or selection marker in existing plasmids and simple construction of vectors encoding proteins fused to fluorescence, purification, localization, or epitope tags. The mammalian expression vector assembly platform currently allows for the assembly of up to seven fragments in a single cloning step with correct directionality and with a cloning efficiency above 90%. The functionality of basic vectors for FAST assembly was tested and validated by transient expression of fluorescent model proteins in CHO, U-2-OS and HEK293 cell lines. In this test, we included many of the most common vector elements for heterologous gene expression in mammalian cells; in addition, the system is fully extendable by other users. The vector system is designed to facilitate high-throughput genome-scale studies of mammalian cells, such as the newly sequenced CHO cell lines, through the ability to rapidly generate high-fidelity assemblies of customizable gene expression vectors.

  9. Developmental and Subcellular Organization of Single-Cell C₄ Photosynthesis in Bienertia sinuspersici Determined by Large-Scale Proteomics and cDNA Assembly from 454 DNA Sequencing.

    Science.gov (United States)

    Offermann, Sascha; Friso, Giulia; Doroshenk, Kelly A; Sun, Qi; Sharpe, Richard M; Okita, Thomas W; Wimmer, Diana; Edwards, Gerald E; van Wijk, Klaas J

    2015-05-01

    Kranz C4 species strictly depend on separation of primary and secondary carbon fixation reactions in different cell types. In contrast, the single-cell C4 (SCC4) species Bienertia sinuspersici utilizes intracellular compartmentation including two physiologically and biochemically different chloroplast types; however, information on identity, localization, and induction of proteins required for this SCC4 system is currently very limited. In this study, we determined the distribution of photosynthesis-related proteins and the induction of the C4 system during development by label-free proteomics of subcellular fractions and leaves of different developmental stages. This was enabled by inferring a protein sequence database from 454 sequencing of Bienertia cDNAs. Large-scale proteome rearrangements were observed as C4 photosynthesis developed during leaf maturation. The proteomes of the two chloroplasts are different with differential accumulation of linear and cyclic electron transport components, primary and secondary carbon fixation reactions, and a triose-phosphate shuttle that is shared between the two chloroplast types. This differential protein distribution pattern suggests the presence of a mRNA or protein-sorting mechanism for nuclear-encoded, chloroplast-targeted proteins in SCC4 species. The combined information was used to provide a comprehensive model for NAD-ME type carbon fixation in SCC4 species.

  10. Self-assembly of bimetallic AuxPd1-x alloy nanoparticles via dewetting of bilayers through the systematic control of temperature, thickness, composition and stacking sequence

    Science.gov (United States)

    Kunwar, Sundar; Pandey, Puran; Sui, Mao; Bastola, Sushil; Lee, Jihoon

    2018-03-01

    Bimetallic alloy nanoparticles (NPs) are attractive materials for various applications owing to their morphology- and composition-dependent optical, electronic, magnetic and catalytic properties. This work demonstrates the evolution of AuxPd1-x alloy nanostructures by the solid-state dewetting of sequentially deposited bilayers of Au and Pd on sapphire (0001). Various shapes, sizes and configurations of AuxPd1-x alloy NPs are fabricated by the systematic control of annealing temperature, deposition thickness, composition and stacking sequence. The evolution of alloy nanostructures is attributed to surface diffusion, interface diffusion between bilayers, surface and interface energy minimization, the Volmer-Weber growth model and equilibrium configuration. Depending upon the temperature, the surface morphologies evolve with the formation of pits, grains and voids and gradually develop into isolated semi-spherical alloy NPs by the expansion of voids and agglomeration of Au and Pd adatoms. With increasing bilayer thickness, on the other hand, the alloy nanostructures change from small and isolated to enlarged, elongated and over-grown layer-like forms due to coalescence, partial diffusion and inter-diffusion. In addition, the composition and stacking sequence of bilayers remarkably affect the final geometry of AuxPd1-x nanostructures due to the variation in the dewetting process. The optical analysis based on the UV–vis-NIR reflectance spectra reveals the surface-morphology-dependent plasmonic resonance, scattering, reflection and absorption properties of AuxPd1-x alloy nanostructures.

  11. String fragmentation; La fragmentation des cordes

    Energy Technology Data Exchange (ETDEWEB)

    Drescher, H.J.; Werner, K. [Laboratoire de Physique Subatomique et des Technologies Associees - SUBATECH, Centre National de la Recherche Scientifique, 44 - Nantes (France)

    1997-10-01

    The classical string model is used in VENUS as a fragmentation model. For the soft domain, simple 2-parton strings were sufficient, whereas for higher energies up to LHC, the perturbative regime of QCD gives additional soft gluons, which are mapped onto the string as so-called kinks, energy singularities between the leading partons. The kinky string model is chosen to handle fragmentation of these strings by application of the Lorentz-invariant area law. The 'kinky strings' model, corresponding to the perturbative gluons coming from pQCD, takes this effect into consideration by treating the partons and gluons on the same footing. The decay law is always the Artru-Mennessier area law, which is the most realistic since it is invariant under Lorentz and gauge transformations. For low-mass strings, a manipulation of the rupture point is necessary if the string already corresponds to an elementary particle determined by the mass and the flavor content. By means of the fragmentation model it will be possible to simulate the data from future experiments at LHC and RHIC. 3 refs.

  12. Subtype-independent near full-length HIV-1 genome sequencing and assembly to be used in large molecular epidemiological studies and clinical management.

    Science.gov (United States)

    Grossmann, Sebastian; Nowak, Piotr; Neogi, Ujjwal

    2015-01-01

    HIV-1 near full-length genome (HIV-NFLG) sequencing from plasma is an attractive multidimensional tool to apply in large-scale population-based molecular epidemiological studies. It also enables genotypic resistance testing (GRT) for all drug target sites allowing effective intervention strategies for control and prevention in high-risk population groups. Thus, the main objective of this study was to develop a simplified subtype-independent, cost- and labour-efficient HIV-NFLG protocol that can be used in clinical management as well as in molecular epidemiological studies. Plasma samples (n=30) were obtained from HIV-1B (n=10), HIV-1C (n=10), CRF01_AE (n=5) and CRF01_AG (n=5) infected individuals with minimum viral load >1120 copies/ml. The amplification was performed with two large amplicons of 5.5 kb and 3.7 kb, sequenced with 17 primers to obtain HIV-NFLG. GRT was validated against ViroSeq™ HIV-1 Genotyping System. After excluding four plasma samples with low-quality RNA, a total of 26 samples were attempted. Among them, NFLG was obtained from 24 (92%) samples with the lowest viral load being 3000 copies/ml. High (>99%) concordance was observed between HIV-NFLG and ViroSeq™ when determining the drug resistance mutations (DRMs). The N384I connection mutation was additionally detected by NFLG in two samples. Our high efficiency subtype-independent HIV-NFLG is a simple and promising approach to be used in large-scale molecular epidemiological studies. It will facilitate the understanding of the HIV-1 pandemic population dynamics and outline effective intervention strategies. Furthermore, it can potentially be applicable in clinical management of drug resistance by evaluating DRMs against all available antiretrovirals in a single assay.

  13. De novo assembly and characterization of global transcriptome of coconut palm (Cocos nucifera L.) embryogenic calli using Illumina paired-end sequencing.

    Science.gov (United States)

    Rajesh, M K; Fayas, T P; Naganeeswaran, S; Rachana, K E; Bhavyashree, U; Sajini, K K; Karun, Anitha

    2016-05-01

    Production and supply of quality planting material is significant to coconut cultivation but is one of the major constraints in coconut productivity. Rapid multiplication of coconut through in vitro techniques, therefore, is of paramount importance. Although somatic embryogenesis in coconut is a promising technique that will allow for the mass production of high quality palms, coconut is highly recalcitrant to in vitro culture. In order to overcome the bottlenecks in coconut somatic embryogenesis and to develop a repeatable protocol, it is imperative to understand, identify, and characterize molecular events involved in the coconut somatic embryogenesis pathway. Transcriptome analysis (RNA-Seq) of coconut embryogenic calli, derived from plumular explants of West Coast Tall cultivar, was undertaken on an Illumina HiSeq 2000 platform. After de novo transcriptome assembly and functional annotation, we obtained 40,367 transcripts which showed significant BLASTx matches with similarity greater than 40 % and E value ≤10^-5. Fourteen genes known to be involved in somatic embryogenesis were identified. Quantitative real-time PCR (qRT-PCR) analyses of these 14 genes were carried out in six developmental stages. The results showed that CLV was upregulated in the initial stage of callogenesis. Transcripts GLP, GST, PKL, WUS, and WRKY were expressed more in the somatic embryo stage. The expression of SERK, MAPK, AP2, SAUR, ECP, AGP, LEA, and ANT was higher in the embryogenic callus stage compared to the initial culture and somatic embryo stages. This study provides the first insights into the gene expression patterns during somatic embryogenesis in coconut.

  14. Whole genome assembly of a natto production strain Bacillus subtilis natto from very short read data.

    Science.gov (United States)

    Nishito, Yukari; Osana, Yasunori; Hachiya, Tsuyoshi; Popendorf, Kris; Toyoda, Atsushi; Fujiyama, Asao; Itaya, Mitsuhiro; Sakakibara, Yasubumi

    2010-04-16

    Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and functions as a starter for the production of the traditional Japanese food "natto" made from soybeans. Although re-sequencing whole genomes of several laboratory domesticated B. subtilis 168 derivatives has already been attempted using short read sequencing data, the assembly of the whole genome sequence of a closely related strain, B. subtilis natto, from very short read data is more challenging, particularly with our aim to assemble one fully connected scaffold from short reads around 35 bp in length. We applied a comparative genome assembly method, which combines de novo assembly and reference guided assembly, to one of the B. subtilis natto strains. We successfully assembled 28 scaffolds and managed to avoid substantial fragmentation. Completion of the assembly through long PCR experiments resulted in one connected scaffold for B. subtilis natto. Based on the assembled genome sequence, our orthologous gene analysis between natto BEST195 and Marburg 168 revealed that 82.4% of 4375 predicted genes in BEST195 are one-to-one orthologous to genes in 168, with two genes in-paralog, 3.2% are deleted in 168, 14.3% are inserted in BEST195, and 5.9% of genes present in 168 are deleted in BEST195. The natto genome contains the same alleles in the promoter region of degQ and the coding region of swrAA as the wild strain, RO-FF-1. These are specific for gamma-PGA production ability, which is related to natto production. Further, the B. subtilis natto strain completely lacked a polyketide synthesis operon, disrupted the plipastatin production operon, and possesses previously unidentified transposases. The determination of the whole genome sequence of Bacillus subtilis natto provided detailed analyses of a set of genes related to natto production, demonstrating the number and locations of insertion sequences that B. subtilis natto harbors but B. subtilis 168 lacks
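
    The reported orthology percentages can be turned into approximate gene counts as a quick consistency check. This is a sketch that assumes the 82.4%, 3.2% and 14.3% figures all refer to the 4375 predicted BEST195 genes; the category names are illustrative labels, and the counts are only as precise as the rounded percentages in the abstract.

```python
TOTAL_BEST195_GENES = 4375  # predicted genes in BEST195, from the abstract
IN_PARALOG_GENES = 2        # "two genes in-paralog", from the abstract

# Percentages as reported; the dictionary keys are illustrative labels.
reported_percent = {
    "one_to_one_orthologs": 82.4,
    "deleted_in_168": 3.2,
    "inserted_in_BEST195": 14.3,
}


def approximate_counts(total, percents):
    """Convert rounded percentages into approximate gene counts."""
    return {name: round(total * pct / 100) for name, pct in percents.items()}


counts = approximate_counts(TOTAL_BEST195_GENES, reported_percent)
accounted = sum(counts.values()) + IN_PARALOG_GENES

if __name__ == "__main__":
    for name, n in counts.items():
        print(f"{name}: ~{n} genes")
    print(f"accounted for: ~{accounted} of {TOTAL_BEST195_GENES}")
```

    If the assumption holds, the three categories plus the two in-paralog genes should account for nearly all predicted genes, with any small shortfall attributable to rounding of the published percentages.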

  15. Whole genome assembly of a natto production strain Bacillus subtilis natto from very short read data

    Directory of Open Access Journals (Sweden)

    Fujiyama Asao

    2010-04-01

    Background: Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and functions as a starter for the production of the traditional Japanese food "natto" made from soybeans. Although re-sequencing whole genomes of several laboratory domesticated B. subtilis 168 derivatives has already been attempted using short read sequencing data, the assembly of the whole genome sequence of a closely related strain, B. subtilis natto, from very short read data is more challenging, particularly with our aim to assemble one fully connected scaffold from short reads around 35 bp in length. Results: We applied a comparative genome assembly method, which combines de novo assembly and reference guided assembly, to one of the B. subtilis natto strains. We successfully assembled 28 scaffolds and managed to avoid substantial fragmentation. Completion of the assembly through long PCR experiments resulted in one connected scaffold for B. subtilis natto. Based on the assembled genome sequence, our orthologous gene analysis between natto BEST195 and Marburg 168 revealed that 82.4% of 4375 predicted genes in BEST195 are one-to-one orthologous to genes in 168, with two genes in-paralog, 3.2% are deleted in 168, 14.3% are inserted in BEST195, and 5.9% of genes present in 168 are deleted in BEST195. The natto genome contains the same alleles in the promoter region of degQ and the coding region of swrAA as the wild strain, RO-FF-1. These are specific for γ-PGA production ability, which is related to natto production. Further, the B. subtilis natto strain completely lacked a polyketide synthesis operon, disrupted the plipastatin production operon, and possesses previously unidentified transposases. Conclusions: The determination of the whole genome sequence of Bacillus subtilis natto provided detailed analyses of a set of genes related to natto production, demonstrating the number and locations of insertion sequences that B

  16. Dimensional crossover in fragmentation

    Science.gov (United States)

    Sotolongo-Costa, Oscar; Rodriguez, Arezky H.; Rodgers, G. J.

    2000-11-01

    Experiments in which thick clay plates and glass rods are fractured have revealed different behavior of fragment mass distribution function in the small and large fragment regions. In this paper we explain this behavior using non-extensive Tsallis statistics and show how the crossover between the two regions is caused by the change in the fragments’ dimensionality during the fracture process. We obtain a physical criterion for the position of this crossover and an expression for the change in the power-law exponent between the small and large fragment regions. These predictions are in good agreement with the experiments on thick clay plates.

  17. Comparison of bacterial genome assembly software for MinION data and their applicability to medical microbiology.

    Science.gov (United States)

    Judge, Kim; Hunt, Martin; Reuter, Sandra; Tracey, Alan; Quail, Michael A; Parkhill, Julian; Peacock, Sharon J

    2016-09-01

    Translating the Oxford Nanopore MinION sequencing technology into medical microbiology requires on-going analysis that keeps pace with technological improvements to the instrument and release of associated analysis software. Here, we use a multidrug-resistant Enterobacter kobei isolate as a model organism to compare open source software for the assembly of genome data, and relate this to the time taken to generate actionable information. Three software tools (PBcR, Canu and miniasm) were used to assemble MinION data and a fourth (SPAdes) was used to combine MinION and Illumina data to produce a hybrid assembly. All four had a similar number of contigs and were more contiguous than the assembly using Illumina data alone, with SPAdes producing a single chromosomal contig. Evaluation of the four assemblies to represent the genome structure revealed a single large inversion in the SPAdes assembly, which also incorrectly integrated a plasmid into the chromosomal contig. Almost 50 %, 80 % and 90 % of MinION pass reads were generated in the first 6, 9 and 12 h, respectively. Using data from the first 6 h alone led to a less accurate, fragmented assembly, but data from the first 9 or 12 h generated similar assemblies to that from 48 h sequencing. Assemblies were generated in 2 h using Canu, indicating that going from isolate to assembled data is possible in less than 48 h. MinION data identified that genes responsible for resistance were carried by two plasmids encoding resistance to carbapenem and to sulphonamides, rifampicin and aminoglycosides, respectively.

  18. Embedded Fragments Registry (EFR)

    Data.gov (United States)

    Department of Veterans Affairs — In 2009, the Department of Defense estimated that approximately 40,000 service members who served in OEF/OIF may have embedded fragment wounds as the result of small...

  19. Physics of projectile fragments

    International Nuclear Information System (INIS)

    Minamisono, Tadanori

    1982-01-01

    This is a study report on the polarization phenomena of the projectile fragments produced by heavy ion reactions, and on the beta decay of the fragments. An experimental project using heavy ions with energies from 50 MeV/amu to 250 MeV/amu was designed. Construction of an angle-dispersion spectrograph for projectile fragments was proposed. This is a two-stage spectrograph: the first stage is a QQDQQ type separator, and the second stage is of QDQD type. Estimation shows that Co-66 may be separated from the nuclei with masses of 65 and 67. The orientation of fragments can be measured by detecting beta rays. The apparatus consists of a uniform field magnet, an energy absorber, a stopper, an RF coil and a beta-ray hodoscope. This system can be used not only for this purpose but also for the measurement of hyperfine structure. (Kato, T.)

  20. Fragmentation Main Model

    Data.gov (United States)

    Earth Data Analysis Center, University of New Mexico — The fragmentation model combines patch size and patch continuity with diversity of vegetation types per patch and rarity of vegetation types per patch. A patch was...

  1. Stone fragmentation by ultrasound

    Indian Academy of Sciences (India)

    Unknown

    In the present work, enhancement of the kidney stone fragmentation by using ultrasound is studied. The cavi- ... ment system like radiation pressure balance, the power is given by ... Thus the bubble size has direct relationship with its life and.

  2. DNA fragmentation in spermatozoa

    DEFF Research Database (Denmark)

    Rex, A S; Aagaard, J.; Fedder, J

    2017-01-01

    Sperm DNA Fragmentation has been extensively studied for more than a decade. In the 1940s the uniqueness of the spermatozoa protein complex which stabilizes the DNA was discovered. In the fifties and sixties, the association between unstable chromatin structure and subfertility was investigated....... In the seventies, the impact of induced DNA damage was investigated. In the 1980s the concept of sperm DNA fragmentation as related to infertility was introduced as well as the first DNA fragmentation test: the Sperm Chromatin Structure Assay (SCSA). The terminal deoxynucleotidyl transferase nick end labelling...... (TUNEL) test followed by others was introduced in the nineties. The association between DNA fragmentation in spermatozoa and pregnancy loss has been extensively investigated spurring the need for a therapeutic tool for these patients. This gave rise to an increased interest in the aetiology of DNA damage...

  3. Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

    Science.gov (United States)

    Staats, Martijn; Erkens, Roy H J; van de Vossenberg, Bart; Wieringa, Jan J; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E; Bakker, Freek T

    2013-01-01

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal

  4. Supramolecular gel electrophoresis of large DNA fragments.

    Science.gov (United States)

    Tazawa, Shohei; Kobayashi, Kazuhiro; Oyoshi, Takanori; Yamanaka, Masamichi

    2017-10-01

    Pulsed-field gel electrophoresis is a technique frequently used to separate exceptionally large DNA fragments. In a typical continuous field electrophoresis, it is challenging to separate DNA fragments larger than 20 kbp because they migrate at a comparable rate. To overcome this challenge, it is necessary to develop a novel matrix for the electrophoresis. Here, we describe the electrophoresis of large DNA fragments up to 166 kbp using a supramolecular gel matrix and a typical continuous field electrophoresis system. C3-symmetric tris-urea self-assembled into a supramolecular hydrogel in tris-boric acid-EDTA buffer, a typical buffer for DNA electrophoresis, and the supramolecular hydrogel was used as a matrix for electrophoresis to separate large DNA fragments. Three types of DNA marker, the λ-Hind III digest (2 to 23 kbp), Lambda DNA-Mono Cut Mix (10 to 49 kbp), and Marker 7 GT (10 to 165 kbp), were analyzed in this study. Large DNA fragments of greater than 100 kbp showed distinct mobility using a typical continuous field electrophoresis system. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. ReseqChip: Automated integration of multiple local context probe data from the MitoChip array in mitochondrial DNA sequence assembly

    Directory of Open Access Journals (Sweden)

    Spang Rainer

    2009-12-01

    Full Text Available Abstract Background: The Affymetrix MitoChip v2.0 is an oligonucleotide tiling array for the resequencing of the human mitochondrial (mt) genome. For each of 16,569 nucleotide positions of the mt genome it holds two sets of four 25-mer probes each that match the heavy and the light strand of a reference mt genome and vary only at their central position to interrogate all four possible alleles. In addition, the MitoChip v2.0 carries alternative local context probes to account for known mtDNA variants. These probes have been neglected in most studies due to the lack of software for their automated analysis. Results: We provide ReseqChip, free software that automates the process of resequencing mtDNA using multiple local context probes on the MitoChip v2.0. ReseqChip significantly improves base call rate and sequence accuracy. ReseqChip is available at http://code.open-bio.org/svnweb/index.cgi/bioperl/browse/bioperl-live/trunk/Bio/Microarray/Tools/. Conclusions: ReseqChip allows for the automated consolidation of base calls from alternative local mt genome context probes. It thereby improves the accuracy of resequencing, while reducing the number of non-called bases.

  6. Fragment Impact Toolkit (FIT)

    Energy Technology Data Exchange (ETDEWEB)

    Shevitz, Daniel Wolf [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Key, Brian P. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Garcia, Daniel B. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2017-09-05

    The Fragment Impact Toolkit (FIT) is a software package used for probabilistic consequence evaluation of fragmenting sources. The typical use case for FIT is to simulate an exploding shell and evaluate the consequence on nearby objects. FIT is written in the programming language Python and is designed as a collection of interacting software modules. Each module has a function that interacts with the other modules to produce desired results.

  7. Molecular markers. Amplified fragment length polymorphism

    Directory of Open Access Journals (Sweden)

    Pržulj Novo

    2005-01-01

    Full Text Available Amplified Fragment Length Polymorphism molecular markers (AFLPs) have been developed by combining the procedures of RFLP and RAPD molecular markers, i.e. the first step is restriction digestion of the genomic DNA, which is followed by selective amplification of the restricted fragments. The advantage of the AFLP technique is that it allows rapid generation of a large number of reproducible markers. The reproducibility of AFLP markers is assured by the use of restriction site-specific adapters and adapter-specific primers for the PCR reaction. Only fragments containing the restriction site sequence plus the additional nucleotides will be amplified, and the more selective nucleotides added to the primer sequence, the fewer the fragments amplified by PCR. The amplified products are normally separated on a sequencing gel and visualized after exposure to X-ray film or by using fluorescently labeled primers. AFLPs have proven to be extremely proficient in revealing diversity below the species level. A disadvantage of the AFLP technique is that AFLPs are essentially a dominant marker system and are not able to identify heterozygotes.
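
    The relationship between selective nucleotides and fragment count described above can be illustrated with a small in-silico sketch. This is a simplification under stated assumptions, not the published protocol: it uses a single enzyme (the EcoRI recognition sequence GAATTC is chosen only as an example), a random toy genome, and selective nucleotides on one fragment end only. The function names are hypothetical.

```python
import random

SITE = "GAATTC"  # EcoRI recognition sequence, used here only as an example


def restriction_fragments(genome, site=SITE):
    """Return the internal fragments lying between consecutive recognition sites."""
    pieces = genome.split(site)
    return pieces[1:-1]  # drop the two end pieces, which lack a site on one side


def selectively_amplified(fragments, selective):
    """Keep fragments whose first bases match the primer's selective nucleotides."""
    return [f for f in fragments if f.startswith(selective)]


random.seed(0)  # fixed seed so the toy genome is reproducible
genome = "".join(random.choice("ACGT") for _ in range(200_000))
fragments = restriction_fragments(genome)

for selective in ["", "A", "AC", "ACG"]:
    kept = selectively_amplified(fragments, selective)
    print(f"{len(selective)} selective nt ({selective or '-'}): {len(kept)} fragments")
```

    On random sequence each added selective nucleotide is expected to retain roughly a quarter of the fragments, which is why the number of bands on the gel can be tuned through primer design.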

  8. Heart Rate Fragmentation: A Symbolic Dynamical Approach

    Directory of Open Access Journals (Sweden)

    Madalena D. Costa

    2017-11-01

    Full Text Available Background: We recently introduced the concept of heart rate fragmentation along with a set of metrics for its quantification. The term was coined to refer to an increase in the percentage of changes in heart rate acceleration sign, a dynamical marker of a type of anomalous variability. The effort was motivated by the observation that fragmentation, which is consistent with the breakdown of the neuroautonomic-electrophysiologic control system of the sino-atrial node, could confound traditional short-term analysis of heart rate variability. Objective: The objectives of this study were to: (1) introduce a symbolic dynamical approach to the problem of quantifying heart rate fragmentation; (2) evaluate how the distribution of the different dynamical patterns (“words”) varied with the participants' age in a group of healthy subjects and patients with coronary artery disease (CAD); and (3) quantify the differences in the fragmentation patterns between the two sample populations. Methods: The symbolic dynamical method employed here was based on a ternary map of the increment NN interval time series and on the analysis of the relative frequency of symbolic sequences (words) with a pre-defined set of features. We analyzed annotated, open-access Holter databases of healthy subjects and patients with CAD, provided by the University of Rochester Telemetric and Holter ECG Warehouse (THEW). Results: The degree of fragmentation was significantly higher in older individuals than in their younger counterparts. However, the fragmentation patterns were different in the two sample populations. In healthy subjects, older age was significantly associated with a higher percentage of transitions from acceleration/deceleration to zero acceleration and vice versa (termed “soft” inflection points). In patients with CAD, older age was also significantly associated with higher percentages of frank reversals in heart rate acceleration (transitions from acceleration to
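
    A minimal sketch of the ternary mapping and word-frequency step described in the Methods might look as follows. The abstract does not give the word length, the tolerance for "no change", or the sign convention, so `word_len=4`, `tol=0.0` and the choice of +1 for a lengthening NN interval are assumptions made only for illustration; the function names are hypothetical.

```python
from collections import Counter


def ternary_symbols(nn_intervals, tol=0.0):
    """Map each NN-interval increment to +1 (longer), -1 (shorter) or 0 (unchanged)."""
    symbols = []
    for prev, curr in zip(nn_intervals, nn_intervals[1:]):
        delta = curr - prev
        if delta > tol:
            symbols.append(1)
        elif delta < -tol:
            symbols.append(-1)
        else:
            symbols.append(0)
    return symbols


def word_frequencies(symbols, word_len=4):
    """Relative frequency of every overlapping word of `word_len` symbols."""
    words = [tuple(symbols[i:i + word_len])
             for i in range(len(symbols) - word_len + 1)]
    counts = Counter(words)
    return {word: n / len(words) for word, n in counts.items()}


nn_ms = [800, 810, 810, 805, 815, 815, 820]  # toy NN intervals in milliseconds
symbols = ternary_symbols(nn_ms)
print(symbols)
for word, freq in sorted(word_frequencies(symbols).items()):
    print(word, round(freq, 3))
```

    Words could then be grouped by a chosen feature, for example the number of sign changes they contain, before comparing age groups.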

  9. Rapid construction of a Bacterial Artificial Chromosomal (BAC) expression vector using designer DNA fragments.

    Science.gov (United States)

    Chen, Chao; Zhao, Xinqing; Jin, Yingyu; Zhao, Zongbao Kent; Suh, Joo-Won

    2014-11-01

    Bacterial artificial chromosomal (BAC) vectors are increasingly being used in cloning large DNA fragments containing complex biosynthetic pathways to facilitate heterologous production of microbial metabolites for drug development. To express inserted genes using Streptomyces species as the production hosts, an integration expression cassette is required to be inserted into the BAC vector, which includes genetic elements encoding a phage-specific attachment site, an integrase, an origin of transfer, a selection marker and a promoter. Due to the large sizes of DNA inserted into the BAC vectors, it is normally inefficient and time-consuming to assemble these fragments by routine PCR amplifications and restriction-ligations. Here we present a rapid method to insert fragments to construct BAC-based expression vectors. A DNA fragment of about 130 bp was designed, which contains upstream and downstream homologous sequences of both BAC vector and pIB139 plasmid carrying the whole integration expression cassette. In-Fusion cloning was performed using the designer DNA fragment to modify pIB139, followed by λ-RED-mediated recombination to obtain the BAC-based expression vector. We demonstrated the effectiveness of this method by rapid construction of a BAC-based expression vector with an insert of about 120 kb that contains the entire gene cluster for biosynthesis of immunosuppressant FK506. The empty BAC-based expression vector constructed in this study can be conveniently used for construction of BAC libraries using either microbial pure culture or environmental DNA, and the selected BAC clones can be directly used for heterologous expression. Alternatively, if a BAC library has already been constructed using a commercial BAC vector, the selected BAC vectors can be manipulated using the method described here to get the BAC-based expression vectors with desired gene clusters for heterologous expression. The rapid construction of a BAC-based expression vector facilitates

  10. Saturating representation of loop conformational fragments in structure databanks

    Directory of Open Access Journals (Sweden)

    Fiser András

    2006-07-01

    Full Text Available Abstract Background: Short fragments of proteins are fundamental starting points in various structure prediction applications, such as in fragment based loop modeling methods but also in various full structure build-up procedures. The applicability and performance of these approaches depend on the availability of short fragments in structure databanks. Results: We studied the representation of protein loop fragments up to 14 residues in length. All possible query fragments found in sequence databases (Sequence Space) were clustered and cross referenced with available structural fragments in the Protein Data Bank (Structure Space). We found that the expansion of the PDB in the last few years resulted in a dense coverage of loop conformational fragments. For each loop of length 8 in the current Sequence Space there is at least one loop in Structure Space with 50% or higher sequence identity. By correlating sequence and structure clusters of loops we found that a 50% sequence identity generally guarantees structural similarity. These percentages of coverage at the 50% sequence cutoff drop to 96, 94, 68, 53, 33 and 13% for loops of length 9, 10, 11, 12, 13, and 14, respectively. There is not a single loop in the current Sequence Space at any length up to 14 residues that is not matched with a conformational segment that shares at least 20% sequence identity. This minimum observed identity is 40% for loops of 12 residues or shorter and is as high as 50% for loops of 10 residues or shorter. We also assessed the impact of rapidly growing sequence databanks on the estimated number of new loop conformations and found that while the number of sequentially unique sequence segments increased about sixfold during the last five years, there are almost no unique conformational segments among these fragments of up to 12 residues. Conclusion: The results suggest that fragment based prediction approaches are not limited any more by the completeness of fragments in databanks but
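
    The coverage figures quoted above rest on a sequence-identity cutoff between loops of equal length. As a rough sketch only, assuming identity is computed position by position without gaps (the abstract does not specify the exact definition or the clustering procedure), coverage at a given cutoff could be computed like this. All names and the toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Ungapped, position-wise percent identity of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def coverage(queries, structures, cutoff=50.0):
    """Fraction of query loops matched by a same-length structural loop at or above the cutoff."""
    covered = 0
    for q in queries:
        if any(len(s) == len(q) and percent_identity(q, s) >= cutoff
               for s in structures):
            covered += 1
    return covered / len(queries)


sequence_space = ["GSGDKTAL", "PNGTWEQH"]    # toy 8-residue query loops
structure_space = ["GSGDRSAV", "AAAAAAAA"]   # toy loops with known structure

print(percent_identity("GSGDKTAL", "GSGDRSAV"))  # 5 of 8 positions match
print(coverage(sequence_space, structure_space))
```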

  11. Designer genes. Recombinant antibody fragments for biological imaging

    Energy Technology Data Exchange (ETDEWEB)

    Wu, A.M.; Yazaki, P.J. [Beckman Research Institute of the City of Hope, Duarte, CA (United States). Dept. of Molecular Biology

    2000-09-01

    Monoclonal antibodies (MAbs), with high specificity and high affinity for their target antigens, can be utilized for delivery of agents such as radionuclides, enzymes, drugs or toxins in vivo. However, the implementation of radiolabeled antibodies as magic bullets for detection and treatment of diseases such as cancer has required addressing several shortcomings of murine MAbs. These include their immunogenicity, sub-optimal targeting and pharmacokinetic properties, and practical issues of production and radiolabeling. Genetic engineering provides a powerful approach for redesigning antibodies for use in oncologic applications in vivo. Recombinant fragments have been produced that retain high affinity for target antigens, and display a combination of rapid, high-level tumor targeting with concomitant clearance from normal tissues and the circulation in animal models. An important first step was cloning and engineering of antibody heavy and light chain variable domains into single-chain Fvs (molecular weight, 25-17 kDa), in which the variable regions are joined via a synthetic linker peptide sequence. Although scFvs themselves showed limited tumor uptake in preclinical and clinical studies, they provide a useful building block for intermediate sized recombinant fragments. Covalently linked dimers or non-covalent dimers of scFvs (also known as diabodies) show improved targeting and clearance properties due to their higher molecular weight (55 kDa) and increased avidity. Further gains can be made by generation of larger recombinant fragments, such as the minibody, an scFv-CH3 fusion protein that self-assembles into a bivalent dimer of 80 kDa. A systematic evaluation of scFv, diabody, minibody, and intact antibody (based on comparison of tumor uptakes, tumor:blood activity ratios, and calculation of an Imaging Figure of Merit) can form the basis for selection of combinations of recombinant fragments and radionuclides for imaging applications. Ease of engineering

  12. Designer genes. Recombinant antibody fragments for biological imaging

    International Nuclear Information System (INIS)

    Wu, A.M.; Yazaki, P.J.

    2000-01-01

    Monoclonal antibodies (MAbs), with high specificity and high affinity for their target antigens, can be utilized for delivery of agents such as radionuclides, enzymes, drugs or toxins in vivo. However, the implementation of radiolabeled antibodies as magic bullets for detection and treatment of diseases such as cancer has required addressing several shortcomings of murine MAbs. These include their immunogenicity, sub-optimal targeting and pharmacokinetic properties, and practical issues of production and radiolabeling. Genetic engineering provides a powerful approach for redesigning antibodies for use in oncologic applications in vivo. Recombinant fragments have been produced that retain high affinity for target antigens, and display a combination of rapid, high-level tumor targeting with concomitant clearance from normal tissues and the circulation in animal models. An important first step was cloning and engineering of antibody heavy and light chain variable domains into single-chain Fvs (molecular weight, 25-17 kDa), in which the variable regions are joined via a synthetic linker peptide sequence. Although scFvs themselves showed limited tumor uptake in preclinical and clinical studies, they provide a useful building block for intermediate sized recombinant fragments. Covalently linked dimers or non-covalent dimers of scFvs (also known as diabodies) show improved targeting and clearance properties due to their higher molecular weight (55 kDa) and increased avidity. Further gains can be made by generation of larger recombinant fragments, such as the minibody, an scFv-CH3 fusion protein that self-assembles into a bivalent dimer of 80 kDa. A systematic evaluation of scFv, diabody, minibody, and intact antibody (based on comparison of tumor uptakes, tumor:blood activity ratios, and calculation of an Imaging Figure of Merit) can form the basis for selection of combinations of recombinant fragments and radionuclides for imaging applications. Ease of engineering and

  13. Gene Prediction in Metagenomic Fragments with Deep Learning

    Directory of Open Access Journals (Sweden)

    Shao-Wu Zhang

    2017-01-01

    Full Text Available Next generation sequencing technologies used in metagenomics yield numerous sequencing fragments which come from thousands of different species. Accurately identifying genes from metagenomic fragments is one of the most fundamental issues in metagenomics. In this article, by fusing multiple features (i.e., monocodon usage, monoamino acid usage, ORF length coverage, and Z-curve features) and using a deep stacking network learning model, we present a novel method (called Meta-MFDL) to predict metagenomic genes. The results with 10 CV and independent tests show that Meta-MFDL is a powerful tool for identifying genes from metagenomic fragments.
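
    To make one of the listed features concrete, the sketch below computes a monocodon usage vector for a candidate ORF. It assumes monocodon usage means the relative frequency of each of the 64 codons read in frame; the normalization actually used by Meta-MFDL is not given in the abstract and may differ. The function name is hypothetical.

```python
from itertools import product

# All 64 codons in a fixed order, so every ORF maps to a vector of the same layout.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]


def monocodon_usage(orf):
    """Return a 64-element list of in-frame codon relative frequencies."""
    orf = orf.upper()
    counts = dict.fromkeys(CODONS, 0)
    total = 0
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        if codon in counts:  # skip codons containing ambiguous bases such as N
            counts[codon] += 1
            total += 1
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


features = monocodon_usage("ATGGCCGCCTAA")  # toy ORF: ATG GCC GCC TAA
print(features[CODONS.index("GCC")])
```

    A vector like this would be concatenated with the other feature groups before being passed to a classifier.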

  14. Fragmentation of relativistic nuclei

    International Nuclear Information System (INIS)

    Cork, B.

    1975-06-01

    Nuclei with energies of several GeV/n interact with hadrons and produce fragments that encompass the fields of nuclear physics, meson physics, and particle physics. Experimental results are now available to explore problems in nuclear physics such as the validity of the shell model to explain the momentum distribution of fragments, the contribution of giant dipole resonances to fragment production cross sections, the effective Coulomb barrier, and nuclear temperatures. A new approach to meson physics is possible by exploring the nucleon charge-exchange process. Particle physics problems are explored by measuring the energy and target dependence of isotope production cross sections, thus determining if limiting fragmentation and target factorization are valid, and measuring total cross sections to determine if the factorization relation, $\sigma_{AB}^{2} = \sigma_{AA} \cdot \sigma_{BB}$, is violated. Also, new experiments have been done to measure the angular distribution of fragments that could be explained as nuclear shock waves, and to explore for ultradense matter produced by very heavy ions incident on heavy atoms. (12 figures, 2 tables)

  15. Plant X-tender: An extension of the AssemblX system for the assembly and expression of multigene constructs in plants

    Science.gov (United States)

    Machens, Fabian; Coll, Anna; Baebler, Špela; Messerschmidt, Katrin; Gruden, Kristina

    2018-01-01

    Cloning multiple DNA fragments for delivery of several genes of interest into the plant genome is one of the main technological challenges in plant synthetic biology. Despite several modular assembly methods developed in recent years, the plant biotechnology community has not widely adopted them yet, probably due to the lack of appropriate vectors and software tools. Here we present Plant X-tender, an extension of the highly efficient, scar-free and sequence-independent multigene assembly strategy AssemblX, based on overlap-dependent cloning methods and rare-cutting restriction enzymes. Plant X-tender consists of a set of plant expression vectors and the protocols for the most efficient cloning into the novel vector set needed for plant expression and thus introduces the advantages of AssemblX into plant synthetic biology. The novel vector set covers different backbones and selection markers to allow full design flexibility. We have included ccdB counterselection, thereby allowing the transfer of multigene constructs into the novel vector set in a straightforward and highly efficient way. Vectors are available as empty backbones and are fully flexible regarding the orientation of expression cassettes and addition of linkers between them, if required. We optimised the assembly and subcloning protocol by testing different scar-less assembly approaches: the noncommercial SLiCE and TAR methods and the commercial Gibson assembly and NEBuilder HiFi DNA assembly kits. Plant X-tender was applicable even in combination with low-efficiency homemade chemically competent or electrocompetent Escherichia coli. We have further validated the developed procedure for plant protein expression by cloning two cassettes into the newly developed vectors and subsequently transferring them to Nicotiana benthamiana in a transient expression setup. Thereby we show that multigene constructs can be delivered into plant cells in a streamlined and highly efficient way. Our results will support faster

  16. Plant X-tender: An extension of the AssemblX system for the assembly and expression of multigene constructs in plants.

    Science.gov (United States)

    Lukan, Tjaša; Machens, Fabian; Coll, Anna; Baebler, Špela; Messerschmidt, Katrin; Gruden, Kristina

    2018-01-01

    Cloning multiple DNA fragments for delivery of several genes of interest into the plant genome is one of the main technological challenges in plant synthetic biology. Despite several modular assembly methods developed in recent years, the plant biotechnology community has not widely adopted them yet, probably due to the lack of appropriate vectors and software tools. Here we present Plant X-tender, an extension of the highly efficient, scar-free and sequence-independent multigene assembly strategy AssemblX, based on overlap-dependent cloning methods and rare-cutting restriction enzymes. Plant X-tender consists of a set of plant expression vectors and the protocols for the most efficient cloning into the novel vector set needed for plant expression and thus introduces the advantages of AssemblX into plant synthetic biology. The novel vector set covers different backbones and selection markers to allow full design flexibility. We have included ccdB counterselection, thereby allowing the transfer of multigene constructs into the novel vector set in a straightforward and highly efficient way. Vectors are available as empty backbones and are fully flexible regarding the orientation of expression cassettes and addition of linkers between them, if required. We optimised the assembly and subcloning protocol by testing different scar-less assembly approaches: the noncommercial SLiCE and TAR methods and the commercial Gibson assembly and NEBuilder HiFi DNA assembly kits. Plant X-tender was applicable even in combination with low-efficiency homemade chemically competent or electrocompetent Escherichia coli. We have further validated the developed procedure for plant protein expression by cloning two cassettes into the newly developed vectors and subsequently transferring them to Nicotiana benthamiana in a transient expression setup. Thereby we show that multigene constructs can be delivered into plant cells in a streamlined and highly efficient way. Our results will support faster

  17. Thread extraction for polyadic instruction sequences

    NARCIS (Netherlands)

    Bergstra, J.; Middelburg, C.

    2011-01-01

    In this paper, we study the phenomenon that instruction sequences are split into fragments which somehow produce a joint behaviour. In order to bring this phenomenon better into the picture, we formalize a simple mechanism by which several instruction sequence fragments can produce a joint

  18. The diploid genome sequence of an individual human.

    Directory of Open Access Journals (Sweden)

    Samuel Levy

    2007-09-01

    Full Text Available Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
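
    The headline figures in this abstract can be cross-checked with simple arithmetic. The short sketch below tallies only the variant classes for which the abstract gives counts; segmental duplications and copy number variation regions are not enumerated there, so leaving them out is an assumption of this check.

```python
# Variant counts as quoted in the abstract.
variants = {
    "SNPs": 3_213_401,
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}

total = sum(variants.values())
non_snp = total - variants["SNPs"]
non_snp_share = 100 * non_snp / total

print(f"total events:   {total:,}")           # 4,118,889, i.e. "more than 4.1 million"
print(f"non-SNP events: {non_snp:,}")         # 905,488
print(f"non-SNP share:  {non_snp_share:.1f}%")  # 22.0%, matching the quoted 22%
```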

  19. Draft genome sequence of ramie, Boehmeria nivea (L.) Gaudich.

    Science.gov (United States)

    Luan, Ming-Bao; Jian, Jian-Bo; Chen, Ping; Chen, Jun-Hui; Chen, Jian-Hua; Gao, Qiang; Gao, Gang; Zhou, Ju-Hong; Chen, Kun-Mei; Guang, Xuan-Min; Chen, Ji-Kang; Zhang, Qian-Qian; Wang, Xiao-Fei; Fang, Long; Sun, Zhi-Min; Bai, Ming-Zhou; Fang, Xiao-Dong; Zhao, Shan-Cen; Xiong, He-Ping; Yu, Chun-Ming; Zhu, Ai-Guo

    2018-05-01

    Ramie, Boehmeria nivea (L.) Gaudich, family Urticaceae, is a plant native to eastern Asia, and one of the world's oldest fibre crops. It is also used as animal feed and for the phytoremediation of heavy metal-contaminated farmlands. Thus, the genome sequence of ramie was determined to explore the molecular basis of its fibre quality, protein content and phytoremediation. To further understand the ramie genome, different paired-end and mate-pair libraries were combined to generate 134.31 Gb of raw DNA sequences using the Illumina whole-genome shotgun sequencing approach. The highly heterozygous B. nivea genome was assembled using the Platanus Genome Assembler, which is an effective tool for the assembly of highly heterozygous genome sequences. The final length of the draft genome of this species was approximately 341.9 Mb (contig N50 = 22.62 kb, scaffold N50 = 1,126.36 kb). Based on the ramie genome annotations, 30,237 protein-coding genes were predicted, and the repetitive element content was 46.3%. The completeness of the final assembly was evaluated by benchmarking universal single-copy orthologous genes (BUSCO); 90.5% of the 1,440 expected embryophytic genes were identified as complete, and 4.9% were identified as fragmented. Phylogenetic analysis based on single-copy gene families and one-to-one orthologous genes placed ramie with mulberry and cannabis, within the clade of urticalean rosids. Genome information of ramie will be a valuable resource for the conservation of endangered Boehmeria species and for future studies on the biogeography and characteristic evolution of members of Urticaceae. © 2018 John Wiley & Sons Ltd.
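
    The contig and scaffold N50 values quoted above follow the usual definition: the length L such that sequences of length L or longer together contain at least half of the assembled bases. A minimal sketch of that calculation, using made-up contig lengths rather than the ramie assembly itself:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    half = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half of the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length


toy_contigs = [80, 70, 50, 40, 30, 20, 10]  # hypothetical lengths, total 300
print(n50(toy_contigs))  # 70: the two longest contigs hold 150 of 300 bases
```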

  20. Fine mapping of powdery mildew resistance genes PmTb7A.1 and PmTb7A.2 in Triticum boeoticum (Boiss.) using the shotgun sequence assembly of chromosome 7AL.

    Science.gov (United States)

    Chhuneja, Parveen; Yadav, Bharat; Stirnweis, Daniel; Hurni, Severine; Kaur, Satinder; Elkot, Ahmed Fawzy; Keller, Beat; Wicker, Thomas; Sehgal, Sunish; Gill, Bikram S; Singh, Kuldeep

    2015-10-01

    A novel powdery mildew resistance gene and a new allele of Pm1 were identified and fine mapped. DNA markers suitable for marker-assisted selection have been identified. Powdery mildew caused by Blumeria graminis is one of the most important foliar diseases of wheat and causes significant yield losses worldwide. Diploid A genome species are an important genetic resource for disease resistance genes. Two powdery mildew resistance genes, identified in Triticum boeoticum (A(b)A(b)) accession pau5088, PmTb7A.1 and PmTb7A.2 were mapped on chromosome 7AL. In the present study, shotgun sequence assembly data for chromosome 7AL were utilised for fine mapping of these Pm resistance genes. Forty SSR, 73 resistance gene analogue-based sequence-tagged sites (RGA-STS) and 36 single nucleotide polymorphism markers were designed for fine mapping of PmTb7A.1 and PmTb7A.2. Twenty-one RGA-STS, 8 SSR and 13 SNP markers were mapped to 7AL. RGA-STS markers Ta7AL-4556232 and 7AL-4426363 were linked to the PmTb7A.1 and PmTb7A.2, at a genetic distance of 0.6 and 6.0 cM, respectively. The present investigation established that PmTb7A.1 is a new powdery mildew resistance gene that confers resistance to a broad range of Bgt isolates, whereas PmTb7A.2 most probably is a new allele of Pm1 based on chromosomal location and screening with Bgt isolates showing differential reaction on lines with different Pm1 alleles. The markers identified to be linked to the two Pm resistance genes are robust and can be used for marker-assisted introgression of these genes to hexaploid wheat.

  1. Land fragmentation and production diversification

    NARCIS (Netherlands)

    Ciaian, Pavel; Guri, Fatmir; Rajcaniova, Miroslava; Drabik, Dusan; Paloma, Sergio Gomez Y.

    2018-01-01

    We analyze the impact of land fragmentation on production diversification in rural Albania. Albania represents a particularly interesting case for studying land fragmentation as the fragmentation is a direct outcome of land reforms. The results indicate that land fragmentation is an important driver

  2. Heavy fragment radioactivity

    International Nuclear Information System (INIS)

    Silisteanu, I.

    1991-06-01

    The effect of collective mode excitation in heavy fragment radioactivity (HFR) is explored and discussed in the light of current experimental data. It is found that the coupling and resonance effects in fragment interaction and also the proper angular momentum effects may lead to an important enhancement of the emission process. New useful procedures are proposed for the study of nuclear decay properties. The relations between different decay processes are investigated in detail. We also try to understand and explain in a unified way the reaction mechanisms in decay phenomena. (author). 17 refs, 4 figs, 3 tabs

  3. [Sequencing and analysis of the complete genome of a rabies virus isolate from Sika deer].

    Science.gov (United States)

    Zhao, Yun-Jiao; Guo, Li; Huang, Ying; Zhang, Li-Shi; Qian, Ai-Dong

    2008-05-01

    One DRV strain was isolated from Sika deer brain and sequenced. Nine overlapping gene fragments were amplified by RT-PCR with the 3'-RACE and 5'-RACE methods, and the complete DRV genome sequence was assembled. The length of the complete genome is 11,863 bp. The DRV genome organization was similar to that of other rabies viruses, being composed of five genes, and the initiation and termination sites were highly conserved. There were mutated amino acids in important antigenic sites of the nucleoprotein and glycoprotein. The nucleotide and amino acid homologies of genes N, P, M, G and L were compared among strains with completed genomic sequencing. A phylogenetic tree was constructed by comparison with the N gene sequences of other typical rabies viruses. These results indicated that DRV belonged to genotype 1. The highest homology, 94%, was with the Chinese vaccine strain 3aG, and the lowest, 71%, was with WCBV. These findings provide a theoretical reference for further research on rabies virus.

  4. Rock fragmentation control in opencast blasting

    Directory of Open Access Journals (Sweden)

    P.K. Singh

    2016-04-01

    Full Text Available The blasting operation plays a pivotal role in the overall economics of opencast mines. The blasting sub-system affects all the other associated sub-systems, i.e. loading, transport, crushing and milling operations. Fragmentation control through effective blast design and its effect on productivity are challenging tasks for the practicing blasting engineer because of inadequate knowledge of the actual explosive energy released in the borehole, of varying initiation practice in blast design, and of its effect on the explosive energy release characteristics. This paper describes the results of a systematic study on the impact of blast design parameters on rock fragmentation at three mines in India. The mines use draglines and shovel–dumper combinations for removal of overburden. Despite its pivotal role in controlling the overall economics of a mining operation, the expected blasting performance is often judged almost exclusively on the basis of poorly defined parameters such as powder factor, and the judgement is often qualitative, which results in a very subjective assessment of blasting performance. Such an approach is a very poor substitute for accurate assessment of explosive and blasting performance. Ninety-one blasts were conducted with varying blast designs and charging patterns, and their impacts on the rock fragmentation were documented. A high-speed camera was deployed to record the detonation sequences of the blasts. The efficiency of the loading machines was also correlated with the mean fragment size obtained from the fragmentation analyses.

  5. Fragger: a protein fragment picker for structural queries.

    Science.gov (United States)

    Berenger, Francois; Simoncini, David; Voet, Arnout; Shrestha, Rojan; Zhang, Kam Y J

    2017-01-01

    Protein modeling and design activities often require querying the Protein Data Bank (PDB) with a structural fragment, possibly containing gaps. For some applications, it is preferable to work on a specific subset of the PDB or with unpublished structures. These requirements, along with specific user needs, motivated the creation of a new software to manage and query 3D protein fragments. Fragger is a protein fragment picker that allows protein fragment databases to be created and queried. All fragment lengths are supported and any set of PDB files can be used to create a database. Fragger can efficiently search a fragment database with a query fragment and a distance threshold. Matching fragments are ranked by distance to the query. The query fragment can have structural gaps and the allowed amino acid sequences matching a query can be constrained via a regular expression of one-letter amino acid codes. Fragger also incorporates a tool to compute the backbone RMSD of one versus many fragments in high throughput. Fragger should be useful for protein design, loop grafting and related structural bioinformatics tasks.
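
    The query described above (a distance threshold, ranking by distance, and a regular expression over one-letter amino acid codes) can be illustrated with a minimal sketch. This is not Fragger's code or interface: the `Fragment` record, the `pick_fragments` function and the precomputed `distance` field are all hypothetical names introduced here, and the distances are made-up illustrative values.

```python
import re
from typing import List, NamedTuple


class Fragment(NamedTuple):
    """Hypothetical record: fragment id, sequence, precomputed distance to the query."""
    frag_id: str
    sequence: str    # one-letter amino acid codes
    distance: float  # e.g. a backbone RMSD to the query fragment


def pick_fragments(candidates: List[Fragment], pattern: str,
                   max_distance: float) -> List[Fragment]:
    """Keep fragments whose whole sequence matches `pattern` and whose
    distance is at most `max_distance`, ranked by increasing distance."""
    regex = re.compile(pattern)
    kept = [f for f in candidates
            if f.distance <= max_distance and regex.fullmatch(f.sequence)]
    return sorted(kept, key=lambda f: f.distance)


# Illustrative candidates only; not taken from any database.
candidates = [
    Fragment("a", "GPSG", 0.8),
    Fragment("b", "GASG", 0.4),
    Fragment("c", "GPSA", 0.2),
    Fragment("d", "GPTG", 2.5),
]

# Glycine at both ends, any two residues in between, distance at most 1.0.
hits = pick_fragments(candidates, r"G..G", 1.0)
print([f.frag_id for f in hits])
```

    Applying the sequence constraint as a filter before ranking keeps the two criteria independent, so either can be relaxed without touching the other.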

  6. PELE fragmentation dynamics

    NARCIS (Netherlands)

    Verreault, J.; Hinsberg, N.P. van; Abadjieva, E.

    2013-01-01

    An analytical model that describes the PELE fragmentation dynamics is presented and compared with experimental results from literature. The model accounts for strong shock effects and detailed interactions taking place between the filling – the inner core of the ammunition – and the target

  7. Cryobiology of coral fragments.

    Science.gov (United States)

    Hagedorn, Mary; Farrell, Ann; Carter, Virginia L

    2013-02-01

    Around the world, coral reefs are dying due to human influences, and saving habitat alone may not stop this destruction. This investigation focused on the biological processes that will provide the first steps in understanding the cryobiology of whole coral fragments. Coral fragments are a partnership of coral tissue and endosymbiotic algae, Symbiodinium sp., commonly called zooxanthellae. These data reflected their separate sensitivities to chilling and a cryoprotectant (dimethyl sulfoxide) for the coral Pocillopora damicornis, as measured by tissue loss and Pulse Amplitude Modulated fluorometry 3 weeks post-treatment. Five cryoprotectant treatments maintained the viability of the coral tissue and zooxanthellae at control values (1 M dimethyl sulfoxide at 1.0, 1.5 and 2.0 h exposures, and 1.5 M dimethyl sulfoxide at 1.0 and 1.5 h exposures, P>0.05, ANOVA), whereas 2 M concentrations did not (Pzooxanthellae. During the winter when the fragments were chilled, the coral tissue remained relatively intact (∼25% loss) post-treatment, but the zooxanthellae numbers in the tissue declined after 5 min of chilling (Pzooxanthellae numbers declined in response to chilling alone (P0.05, ANOVA), but it did not protect against the loss of zooxanthellae (Pzooxanthellae are the most sensitive element in the coral fragment complex and future cryopreservation protocols must be guided by their greater sensitivity. Copyright © 2012 Elsevier Inc. All rights reserved.

  8. Fragments of the Past

    OpenAIRE

    Peter Szende; Annie Holcombe

    2016-01-01

    With travel becoming more accessible over the decades, the hospitality industry has constantly evolved its practices as society and technology progressed. Hotels looked for new ways to serve their customers, which led to the invention of the Servidor in 1918. Once-revolutionary innovations have gone extinct, becoming merely fragments of the past.

  9. Synthesis of arabinoxylan fragments

    DEFF Research Database (Denmark)

    Underlin, Emilie Nørmølle; Böhm, Maximilian F.; Madsen, Robert

    , or production of commercial chemicals which are mainly obtained from fossil fuels today. The arabinoxylan fragments have a backbone of β-1,4-linked xylans with α-L-arabinose units attached at specific positions. The synthesis utilises an efficient synthetic route, where all the xylan units can be derived from D...

  10. Fragmented Work Stories

    DEFF Research Database (Denmark)

    Humle, Didde Maria; Reff Pedersen, Anne

    2015-01-01

    stories. We argue that meaning by story making is not always created by coherence and causality; meaning is created by different types of fragmentation: discontinuities, tensions and editing. The objective of this article is to develop and advance antenarrative practice analysis of work stories...

  11. Fragments of the Past

    Directory of Open Access Journals (Sweden)

    Peter Szende

    2016-10-01

    With travel becoming more accessible over the decades, the hospitality industry has constantly evolved its practices as society and technology progressed. Hotels looked for new ways to serve their customers, which led to the invention of the Servidor in 1918. Once-revolutionary innovations have gone extinct, becoming merely fragments of the past.

  12. Detection of bacterial contaminants and hybrid sequences in the genome of the kelp Saccharina japonica using Taxoblast

    Directory of Open Access Journals (Sweden)

    Simon M. Dittami

    2017-11-01

    Modern genome sequencing strategies are highly sensitive to contamination, making the detection of foreign DNA sequences an important part of analysis pipelines. Here we use Taxoblast, a simple pipeline with a graphical user interface, for the post-assembly detection of contaminating sequences in the published genome of the kelp Saccharina japonica. Analyses were based on multiple blastn searches with short sequence fragments. They revealed a number of probable bacterial contaminations as well as hybrid scaffolds that contain both bacterial and algal sequences. This or similar types of analysis, in combination with manual curation, may thus constitute a useful complement to standard bioinformatics analyses prior to submission of genomic data to public repositories. Our analysis pipeline is open-source and freely available at http://sdittami.altervista.org/taxoblast and via SourceForge (https://sourceforge.net/projects/taxoblast).
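
    The idea of searching with short fragments and then flagging scaffolds that mix bacterial and algal hits can be sketched as follows. This is only an illustration under stated assumptions, not Taxoblast's implementation: the abstract does not describe its logic, the blastn search itself is not performed here, and the function names, the fragment size and the label strings ("algal", "bacterial", "unknown") are hypothetical.

```python
from typing import List


def split_into_fragments(sequence: str, size: int) -> List[str]:
    """Cut a scaffold into consecutive, non-overlapping fragments of at most `size` bases."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def classify_scaffold(fragment_labels: List[str]) -> str:
    """Summarise per-fragment taxonomic labels for one scaffold.

    `fragment_labels` is assumed to come from some external search step
    (for example the best hit of each fragment); "unknown" is ignored.
    """
    kinds = {label for label in fragment_labels if label != "unknown"}
    if not kinds:
        return "unassigned"
    if len(kinds) == 1:
        return next(iter(kinds))
    return "hybrid"  # fragments of the same scaffold disagree


print(split_into_fragments("ACGTACGTAC", 4))
print(classify_scaffold(["algal", "bacterial", "unknown"]))
```

    Labelling fragments rather than whole scaffolds is what makes hybrid scaffolds visible: a single whole-scaffold search would report only the dominant hit.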

  13. Fragmentation in rotating isothermal protostellar clouds

    International Nuclear Information System (INIS)

    Bodenheimer, P.; Black, D.C.

    1980-01-01

    In this paper we report briefly the results of an extensive set of 3-D hydrodynamic calculations that have been performed during the past two and one-half years to investigate the susceptibility of rotating clouds to gravitational fragmentation. Because of the immensity of parameter space and the expense of computations, we have chosen to restrict this investigation to strictly isothermal collapse sequences. (orig./WL)

  14. Illustrating how mechanical assemblies work

    KAUST Repository

    Mitra, Niloy J.; Yang, Yongliang; Yan, Dongming; Li, Wilmot; Agrawala, Maneesh

    2010-01-01

    How things work visualizations use a variety of visual techniques to depict the operation of complex mechanical assemblies. We present an automated approach for generating such visualizations. Starting with a 3D CAD model of an assembly, we first infer the motions of individual parts and the interactions between parts based on their geometry and a few user specified constraints. We then use this information to generate visualizations that incorporate motion arrows, frame sequences and animation to convey the causal chain of motions and mechanical interactions between parts. We present results for a wide variety of assemblies. © 2010 ACM.

  15. Illustrating how mechanical assemblies work

    KAUST Repository

    Mitra, Niloy J.; Yang, Yongliang; Yan, Dongming; Li, Wilmot; Agrawala, Maneesh

    2013-01-01

    How-things-work visualizations use a variety of visual techniques to depict the operation of complex mechanical assemblies. We present an automated approach for generating such visualizations. Starting with a 3D CAD model of an assembly, we first infer the motions of the individual parts and the interactions across the parts based on their geometry and a few user-specified constraints. We then use this information to generate visualizations that incorporate motion arrows, frame sequences, and animation to convey the causal chain of motions and mechanical interactions across parts. We demonstrate our system on a wide variety of assemblies. © 2013 ACM 0001-0782/13/01.

  16. Illustrating how mechanical assemblies work

    KAUST Repository

    Mitra, Niloy J.

    2010-07-26

    How things work visualizations use a variety of visual techniques to depict the operation of complex mechanical assemblies. We present an automated approach for generating such visualizations. Starting with a 3D CAD model of an assembly, we first infer the motions of individual parts and the interactions between parts based on their geometry and a few user specified constraints. We then use this information to generate visualizations that incorporate motion arrows, frame sequences and animation to convey the causal chain of motions and mechanical interactions between parts. We present results for a wide variety of assemblies. © 2010 ACM.

  17. Fuel assembly

    International Nuclear Information System (INIS)

    Abe, Hideaki; Sakai, Takao; Ishida, Tomio; Yokota, Norikatsu.

    1992-01-01

    The lower ends of a plurality of plate-like shape memory alloys are secured at the periphery of the upper inside of the handling head of a fuel assembly. As the shape memory alloy, a Cu-Zn alloy, a Ti-Pd alloy or a Fe-Ni alloy is used. When high temperature coolants flow out to the handling head, the shape memory alloy deforms by warping to the outer side more greatly toward the upper portion thereof with the temperature increase of the coolants. As a result, the shape of the coolant flow channel changes so that it widens at the exit at the upper end of the fuel assembly. The pressure loss of the coolants in the fuel assembly is decreased by this enlargement. Accordingly, the flow rate of the coolants in the fuel assembly increases, lowering the coolant temperature. Further, high temperature coolants and low temperature coolants are mixed sufficiently just above the fuel assembly. This suppresses the temperature fluctuation of the mixed coolants in the upper portion of the reactor core, thereby reducing fatigue and failures of the structural components in the upper portion of the reactor core. (I.N.)

  18. Fuel assembly

    International Nuclear Information System (INIS)

    Nakatsuka, Masafumi; Matsuzuka, Ryuji.

    1976-01-01

    Object: To provide a fuel assembly which can decrease pressure loss of coolant and make the temperature uniform. Structure: A sectional area of a flow passage in the vicinity of an inner peripheral surface of a wrapper tube is limited over the entire length to prevent the temperature of a fuel element in the outermost peripheral portion from being excessively decreased to thereby flatten temperature distribution. To this end, a plurality of picture-frame-like sheet metals constituting a spacer for supporting a fuel assembly, which has a plurality of fuel elements planted lengthwise and in given spaced relation within the wrapper tube, is disposed in longitudinal grooves and in stacked fashion to form a substantially honeycomb-like space in cross section. The fuel elements are inserted and supported in the space to form a fuel assembly. (Kamimura, M.)

  19. Fuel assemblies

    International Nuclear Information System (INIS)

    Nagano, Mamoru; Yoshioka, Ritsuo

    1983-01-01

    Purpose: To effectively utilize nuclear fuels by increasing the reactivity of a fuel assembly and reducing the concentration at the central region thereof upon completion of the burning. Constitution: A fuel assembly is bisected into a central region and a peripheral region by disposing an inner channel box within a channel box. The flow rate of coolants passing through the central region is made greater than that in the peripheral region. The concentration of uranium 235 of the fuel rods in the central region is made higher. In such a structure, since the moderating effect in the central region is improved, the reactivity of the fuel assembly is increased and the uranium concentration in the central region upon completion of the burning can be reduced, so that fuel economy and effective utilization of uranium can be attained. (Kamimura, M.)

  20. Reconstruction of Banknote Fragments Based on Keypoint Matching Method.

    Science.gov (United States)

    Gwo, Chih-Ying; Wei, Chia-Hung; Li, Yue; Chiu, Nan-Hsing

    2015-07-01

    Banknotes may be shredded by a scrap machine, ripped up by hand, or damaged in accidents. This study proposes an image registration method for reconstruction of multiple sheets of banknotes. The proposed method first constructs different scale spaces to identify keypoints in the underlying banknote fragments. Next, the features of those keypoints are extracted to represent the local patterns around them. Then, similarity is computed to find the keypoint pairs between the fragment and the reference banknote. From these pairs, the coordinates of each banknote fragment can be determined and its orientation corrected. Finally, an assembly strategy is proposed to piece multiple sheets of banknote fragments together. Experimental results show that the proposed method causes, on average, a deviation of 0.12457 ± 0.12810° for each fragment while the SIFT method deviates 1.16893 ± 2.35254° on average. The proposed method not only reconstructs the banknotes but also decreases the computing cost. Furthermore, the proposed method can estimate relatively precisely the orientation of the banknote fragments to assemble. © 2015 American Academy of Forensic Sciences.
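
    How an orientation can be recovered from matched keypoint pairs may be easier to see in a small sketch. The abstract does not say how the authors compute the angle, so the following swaps in a standard least-squares rigid fit in 2D (a Procrustes-style estimate); the function names and the sample points are hypothetical, and keypoint detection and matching are assumed to have been done already.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def transform(p: Point, angle_deg: float, shift: Point) -> Point:
    """Rotate a point counter-clockwise about the origin, then translate it."""
    a = math.radians(angle_deg)
    x, y = p
    return (x * math.cos(a) - y * math.sin(a) + shift[0],
            x * math.sin(a) + y * math.cos(a) + shift[1])


def estimate_rotation_deg(src: List[Point], dst: List[Point]) -> float:
    """Least-squares rotation (degrees, counter-clockwise) taking `src` onto `dst`.

    Both point sets are centred on their centroids first, so any translation
    between fragment and reference does not affect the estimate.
    """
    n = len(src)
    if n < 2 or n != len(dst):
        raise ValueError("need at least two matched keypoint pairs")
    scx = sum(x for x, _ in src) / n
    scy = sum(y for _, y in src) / n
    dcx = sum(x for x, _ in dst) / n
    dcy = sum(y for _, y in dst) / n
    dot = cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - scx, sy - scy
        bx, by = dx - dcx, dy - dcy
        dot += ax * bx + ay * by      # accumulates |a||b| cos(theta)
        cross += ax * by - ay * bx    # accumulates |a||b| sin(theta)
    return math.degrees(math.atan2(cross, dot))


# Hypothetical matched keypoints: the fragment is the reference rotated and shifted.
reference = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (1.0, 3.0)]
fragment = [transform(p, 30.0, (5.0, -2.0)) for p in reference]

# Rotation to apply to the fragment so that it lines up with the reference.
correction = estimate_rotation_deg(fragment, reference)
print(round(correction, 6))
```

    Using all matched pairs at once, rather than a single pair, averages out small localisation errors in individual keypoints.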

  1. Fragments of Time

    DEFF Research Database (Denmark)

    Christiansen, Steen Ledet

    Time travel films necessarily fragment linear narratives, as scenes are revisited with differences from the first time we saw them. Popular films such as Back to the Future mine comedy from these visitations, but there are many different approaches. One extreme is Chris Marker's La Jetée - a film...... made almost completely of still images, recounting the end of the world. These stills can be viewed as fragments that have survived the end of the world and now provide the only access to the events that occurred. Shane Carruth's Primer has a different approach to time travel, the narrative diegesis...... that is presented; how do we understand such films and to what extent is it even possible to make sense of a film that has no real beginning, middle or end?...

  2. The A, C, G, and T of Genome Assembly

    Directory of Open Access Journals (Sweden)

    Bilal Wajid

    2016-01-01

    Genome assembly in its two decades of history has produced significant research, in terms of both biotechnology and computational biology. This contribution delineates sequencing platforms and their characteristics, examines key steps involved in filtering and processing raw data, explains assembly frameworks, and discusses quality statistics for the assessment of the assembled sequence. Furthermore, the paper explores recent Ubuntu-based software environments oriented towards genome assembly as well as some avenues for future research.

  3. The A, C, G, and T of Genome Assembly.

    Science.gov (United States)

    Wajid, Bilal; Sohail, Muhammad U; Ekti, Ali R; Serpedin, Erchin

    2016-01-01

    Genome assembly in its two decades of history has produced significant research, in terms of both biotechnology and computational biology. This contribution delineates sequencing platforms and their characteristics, examines key steps involved in filtering and processing raw data, explains assembly frameworks, and discusses quality statistics for the assessment of the assembled sequence. Furthermore, the paper explores recent Ubuntu-based software environments oriented towards genome assembly as well as some avenues for future research.

  4. Fragmentation of atomic systems

    International Nuclear Information System (INIS)

    Bohn, J.L.; Fano, U.

    1996-01-01

    We report recent progress toward a nonperturbative formulation of many-body quantum dynamics that treats all constituent particles on an equal footing. This formulation is capable of detailing the evolution of a system toward the diverse fragments into which it can break up. We illustrate the general concept with the simple example of the simultaneous excitation of both electrons in a helium atom. © 1996 The American Physical Society.

  5. Modelling the fragmentation mechanisms

    International Nuclear Information System (INIS)

    Bougault, R.; Durand, D.; Gulminelli, F.

    1998-01-01

    We have investigated the role of high amplitude collective motion in nuclear fragmentation by using semi-classical macroscopic as well as microscopic (BUU) simulations. These studies are motivated by the search for the instabilities responsible for nuclear fragmentation. Two cases were examined: bubble formation following the collective expansion of the compressed nucleus in very central reactions and, in semi-central collisions, the fast fission of the two partners issued from a binary reaction, in their corresponding Coulomb field. In both cases the fragmentation channel is dominated by the interplay between the Coulomb and nuclear fields, and it is possible to obtain semi-quantitative predictions as functions of the interaction parameters. Transport equations of the BUU type predict, for central reactions, the formation of a high density transient state. Of much interest is the mechanism subsequent to de-excitation. It seems reasonable to conceive that the pressure stored in the compressional mode manifests itself as a collective expansion of the system. As the pressure is an increasing function of the available energy, one can conceive a variety of energy-dependent exit channels, ranging from fragmentation due to the amplification of fluctuations inside the spinodal zone up to the complete vaporization of the highly excited system. If the pressure reached is sufficiently high, the final state of the reaction may preserve the memory of the entrance channel as a collective radial energy superimposed on the thermal disordered motion. Distributions of particles in configuration space for both central and semi-central reactions for the Pb+Au system are presented. The rupture time is estimated to be of the order of 300 fm/c, and is strongly dependent on the initial temperature. The study of the dependence of the rupture time on the interaction parameters is under way.

  6. Hot nuclei and fragmentation

    International Nuclear Information System (INIS)

    Guerreau, D.

    1993-01-01

    A review is made of the present status concerning the production of nuclei above 5 MeV temperature. Considerable progress has been made recently on the understanding of the formation and the fate of such hot nuclei. It appears that the nucleus seems more stable against temperature than predicted by static calculations. However, the occurrence of multifragment production at high excitation energies is now well established. The various experimental features of the fragmentation process are discussed. (author) 59 refs., 12 figs

  7. Excited nuclei fragmentation

    International Nuclear Information System (INIS)

    Ngo, C.

    1986-11-01

    Experimental indications suggesting the fragmentation of a highly excited nucleus are summarized. Theoretical approaches are briefly described; they are used to explain the phenomenon, and it is shown that they are based on a minimum information principle. This model is based on a time-dependent Thomas-Fermi calculation, which allows the description of mean field effects, and on a site-bond percolation model, which allows the description of fluctuations [fr]

  8. Multiple tag labeling method for DNA sequencing

    Science.gov (United States)

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.
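
    The binary coding idea, in which two labels that are each either present or absent give four distinguishable codes, can be illustrated with a short sketch. The abstract does not state which code is assigned to which fragment set, so the table below is hypothetical, as are the function name, the threshold and the intensity values. The sketch also assumes that band positions are already known, since a band carrying neither label would give no signal of its own.

```python
from typing import Dict, List, Tuple

# Hypothetical assignment: (label 1 present, label 2 present) -> base.
CODE_TABLE: Dict[Tuple[int, int], str] = {
    (1, 0): "A",
    (0, 1): "C",
    (1, 1): "G",
    (0, 0): "T",
}


def decode_bands(bands: List[Tuple[float, float]], threshold: float) -> str:
    """Turn per-band intensities in two detection channels into a base string.

    Each channel is thresholded into present (1) or absent (0), and the
    resulting two-bit code is looked up in CODE_TABLE.
    """
    bases = []
    for channel1, channel2 in bands:
        code = (int(channel1 >= threshold), int(channel2 >= threshold))
        bases.append(CODE_TABLE[code])
    return "".join(bases)


# Made-up intensities for four consecutive bands in a single lane.
lane = [(0.9, 0.1), (0.2, 0.8), (0.7, 0.9), (0.05, 0.1)]
print(decode_bands(lane, threshold=0.5))
```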

  9. Assembling the Marine Metagenome, One Cell at a Time

    Energy Technology Data Exchange (ETDEWEB)

    Woyke, Tanja; Xie, Gary; Copeland, Alex; Gonzalez, Jose M.; Han, Cliff; Kiss, Hajnalka; Saw, Jimmy H.; Senin, Pavel; Yang, Chi; Chatterji, Sourav; Cheng, Jan-Fang; Eisen, Jonathan A.; Sieracki, Michael E.; Stepanauskas, Ramunas

    2010-06-24

    The difficulty associated with the cultivation of most microorganisms and the complexity of natural microbial assemblages, such as marine plankton or human microbiome, hinder genome reconstruction of representative taxa using cultivation or metagenomic approaches. Here we used an alternative, single cell sequencing approach to obtain high-quality genome assemblies of two uncultured, numerically significant marine microorganisms. We employed fluorescence-activated cell sorting and multiple displacement amplification to obtain hundreds of micrograms of genomic DNA from individual, uncultured cells of two marine flavobacteria from the Gulf of Maine that were phylogenetically distant from existing cultured strains. Shotgun sequencing and genome finishing yielded 1.9 Mbp in 17 contigs and 1.5 Mbp in 21 contigs for the two flavobacteria, with estimated genome recoveries of about 91% and 78%, respectively. Only 0.24% of the assembling sequences were contaminants and were removed from further analysis using rigorous quality control. In contrast to all cultured strains of marine flavobacteria, the two single cell genomes were excellent Global Ocean Sampling (GOS) metagenome fragment recruiters, demonstrating their numerical significance in the ocean. The geographic distribution of GOS recruits along the Northwest Atlantic coast coincided with ocean surface currents. Metabolic reconstruction indicated diverse potential energy sources, including biopolymer degradation, proteorhodopsin photometabolism, and hydrogen oxidation. Compared to cultured relatives, the two uncultured flavobacteria have small genome sizes, few non-coding nucleotides, and few paralogous genes, suggesting adaptations to narrow ecological niches. These features may have contributed to the abundance of the two taxa in specific regions of the ocean, and may have hindered their cultivation. We demonstrate the power of single cell DNA sequencing to generate reference genomes of uncultured

  10. Valve assembly

    International Nuclear Information System (INIS)

    Sandling, M.

    1981-01-01

    An improved valve assembly, used for controlling the flow of radioactive slurry, is described. Radioactive contamination of the air during removal or replacement of the valve is prevented by sucking air from the atmosphere through a portion of the structure above the valve housing. (U.K.)

  11. Fuel assembly

    International Nuclear Information System (INIS)

    Gjertsen, R.K.; Bassler, E.A.; Huckestein, E.A.; Salton, R.B.; Tower, S.N.

    1988-01-01

    A fuel assembly adapted for use with a pressurized water nuclear reactor having capabilities for fluid moderator spectral shift control is described comprising: parallel arranged elongated nuclear fuel elements; means for providing for axial support of the fuel elements and for arranging the fuel elements in a spaced array; thimbles interspersed among the fuel elements adapted for insertion of a rod control cluster therewithin; means for structurally joining the fuel elements and the guide thimbles; fluid moderator control means for providing a volume of low neutron absorbing fluid within the fuel assembly and for removing a substantially equivalent volume of reactor coolant water therefrom, a first flow manifold at one end of the fuel assembly sealingly connected to a first end of the moderator control tubes whereby the first ends are commonly flow connected; and a second flow manifold, having an inlet passage and an outlet passage therein, sealingly connected to a second end of the moderator control tubes at a second end of the fuel assembly

  12. Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods.

    Science.gov (United States)

    Oldfield, Lauren M; Grzesik, Peter; Voorhies, Alexander A; Alperovich, Nina; MacMath, Derek; Najera, Claudia D; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N; Montague, Michael G; Friedman, Robert M; Desai, Prashant J; Vashee, Sanjay

    2017-10-17

    Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOS YA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats.

  13. Azimuthal Anisotropies in Nuclear Fragmentation

    International Nuclear Information System (INIS)

    Dabrowska, A.; Szarska, M.; Trzupek, A.; Wolter, W.; Wosiek, B.

    2002-01-01

    The directed and elliptic flow of fragments emitted from the excited projectile nuclei has been observed for 158 AGeV Pb collisions with the lead and plastic targets. For comparison the flow analysis has been performed for 10.6 AGeV Au collisions with the emulsion target. The strong directed flow of heaviest fragments is found. Light fragments exhibit directed flow opposite to that of heavy fragments. The elliptic flow for all multiply charged fragments is positive and increases with the charge of the fragment. The observed flow patterns in the fragmentation of the projectile nucleus are practically independent of the mass of the target nucleus and the collision energy. Emission of fragments in nuclear multifragmentation shows similar, although weaker, flow effects. (author)

  14. [Complete genome sequencing of polymalic acid-producing strain Aureobasidium pullulans CCTCC M2012223].

    Science.gov (United States)

    Wang, Yongkang; Song, Xiaodan; Li, Xiaorong; Yang, Sang-tian; Zou, Xiang

    2017-01-04

    This study aimed to explore the genome sequence of Aureobasidium pullulans CCTCC M2012223, analyze the key genes related to the biosynthesis of important metabolites, and provide genetic background for metabolic engineering. The complete genome of A. pullulans CCTCC M2012223 was sequenced on the Illumina HiSeq high-throughput sequencing platform. Then, fragment assembly, gene prediction, functional annotation, and GO/COG clustering were performed and compared with those of five other A. pullulans varieties. The complete genome sequence of A. pullulans CCTCC M2012223 was 30756831 bp with an average GC content of 47.49%, and 9452 genes were successfully predicted. Genome-wide analysis showed that A. pullulans CCTCC M2012223 had the largest genome assembly size. Protein sequences involved in the pullulan and polymalic acid pathways were highly conserved in all six A. pullulans varieties. Although A. pullulans CCTCC M2012223 and A. pullulans var. melanogenum are closely related, some point mutations and insertions occurred in protein sequences involved in melanin biosynthesis. The genome of A. pullulans CCTCC M2012223 was annotated and genes involved in the melanin, pullulan and polymalic acid pathways were compared, which provides a theoretical basis for genetic modification of metabolic pathways in A. pullulans.

  15. Method and apparatus for biological sequence comparison

    Science.gov (United States)

    Marr, T.G.; Chang, W.I.

    1997-12-23

    A method and apparatus are disclosed for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.
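
    The blocking step, in which the subject sequence is divided into overlapping blocks large enough to contain a minimum-length alignment, can be sketched as follows. The block length and step are assumptions made for this illustration and are not taken from the patent: blocks of length 2 × `min_len` starting every `min_len` positions guarantee that any window of `min_len` consecutive positions lies entirely inside at least one block. The function name is hypothetical.

```python
from typing import List, Tuple


def overlapping_blocks(sequence: str, min_len: int) -> List[Tuple[int, str]]:
    """Split `sequence` into blocks of length 2*min_len starting every min_len positions.

    Returns (start_offset, block) pairs. The last block may be shorter.
    """
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    blocks = []
    start = 0
    while True:
        blocks.append((start, sequence[start:start + 2 * min_len]))
        if start + 2 * min_len >= len(sequence):
            break  # this block already reaches the end of the sequence
        start += min_len
    return blocks


for offset, block in overlapping_blocks("ABCDEFGHIJ", 3):
    print(offset, block)
```

    The overlap costs roughly a factor of two in the amount of sequence examined, in exchange for never splitting a minimum-length alignment across a block boundary.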

  16. Universality of projectile fragmentation model

    International Nuclear Information System (INIS)

    Chaudhuri, G.; Mallik, S.; Das Gupta, S.

    2012-01-01

    At present, the projectile fragmentation reaction is an important area of research, as it is used for the production of radioactive ion beams. In this work, the recently developed projectile fragmentation model with a universal temperature profile is used for studying the charge distributions of different projectile fragmentation reactions with different projectile target combinations at different incident energies. The model for projectile fragmentation consists of three stages: (i) abrasion, (ii) multifragmentation and (iii) evaporation.

  17. Fine de novo sequencing of a fungal genome using only SOLiD short read data: verification on Aspergillus oryzae RIB40.

    Directory of Open Access Journals (Sweden)

    Myco Umemura

    The development of next-generation sequencing (NGS) technologies has dramatically increased the throughput, speed, and efficiency of genome sequencing. The short read data generated from NGS platforms, such as SOLiD and Illumina, are quite useful for mapping analysis. However, the SOLiD read data with lengths of <60 bp have been considered to be too short for de novo genome sequencing. Here, to investigate whether de novo sequencing of fungal genomes is possible using only SOLiD short read sequence data, we performed de novo assembly of the Aspergillus oryzae RIB40 genome using only SOLiD read data of 50 bp generated from mate-paired libraries with 2.8- or 1.9-kb insert sizes. The assembled scaffolds showed an N50 value of 1.6 Mb, 22-fold higher than those obtained using only SOLiD short reads in other published reports. In addition, almost 99% of the reference genome was accurately aligned by the assembled scaffold fragments in long lengths. The sequences of secondary metabolite biosynthetic genes and clusters, whose products are of considerable interest in fungal studies due to their potential medicinal, agricultural, and cosmetic properties, were also highly reconstructed in the assembled scaffolds. Based on these findings, we concluded that de novo genome sequencing using only SOLiD short reads is feasible and practical for molecular biological study of fungi. We also investigated the effect of filtering low quality data, library insert size, and k-mer size on the assembly performance, and for assembly we recommend mildly filtered read data, for which the N50 was not much degraded, a library insert size of ∼2.0 kb, and a k-mer size of 33.
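
    Since the abstract reports assembly quality as an N50 value, a short sketch of how that statistic is commonly computed may help. It uses the usual definition (the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembled length); the function name and the example lengths are illustrative and are not data from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:  # reached half of the total assembled length
            return length
    raise AssertionError("unreachable for non-empty input")


# Illustrative lengths: total is 32, and the two longest scaffolds cover 16.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

    Because N50 depends on the total assembled length, values are only directly comparable between assemblies of similar total size.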

  18. Reassignment of the land tortoise haemogregarine Haemogregarina fitzsimonsi Dias 1953 (Adeleorina: Haemogregarinidae) to the genus Hepatozoon Miller 1908 (Adeleorina: Hepatozoidae) based on parasite morphology, life cycle and phylogenetic analysis of 18S rDNA sequence fragments.

    Science.gov (United States)

    Cook, Courtney A; Lawton, Scott P; Davies, Angela J; Smit, Nico J

    2014-06-13

    Research was undertaken to clarify the true taxonomic position of the terrestrial tortoise apicomplexan, Haemogregarina fitzsimonsi (Dias, 1953). Thin blood films were screened from 275 wild and captive South African tortoises of 6 genera and 10 species between 2009 and 2011. Apicomplexan parasites within films were identified, with a focus on H. fitzsimonsi. Ticks from wild tortoises, especially Amblyomma sylvaticum and Amblyomma marmoreum, were also screened, and sporogonic stages were identified on dissection of adult ticks of both species taken from H. fitzsimonsi-infected and apparently non-infected tortoises. Parasite DNA was extracted from fixed, Giemsa-stained tortoise blood films and from both fresh and fixed ticks, and PCR was undertaken with two primer sets, HEMO1/HEMO2 and HepF300/HepR900, to amplify parasite 18S rDNA. Results indicated that apicomplexan DNA extracted from tortoise blood films and both species of tick had been amplified by one or both primer sets. Haemogregarina fitzsimonsi 18S rDNA sequences from tortoise blood aligned with those of species of Hepatozoon, rather than those of species of Haemogregarina or Hemolivia. It is recommended therefore that this haemogregarine be re-assigned to the genus Hepatozoon, making Hepatozoon fitzsimonsi (Dias, 1953) the only Hepatozoon known currently from any terrestrial chelonian. Ticks are its likely vectors.

  19. Improvement of methods for large scale sequencing; application to human Xq28

    Energy Technology Data Exchange (ETDEWEB)

    Gibbs, R.A.; Andersson, B.; Wentland, M.A. [Baylor College of Medicine, Houston, TX (United States)] [and others]

    1994-09-01

    Sequencing of a one-megabase region of Xq28, spanning the FRAXA and IDS loci, has been undertaken in order to investigate the practicality of the shotgun approach for large-scale sequencing and as a platform to develop improved methods. The efficiency of several steps in the shotgun sequencing strategy has been increased using PCR-based approaches. An improved method for preparation of M13 libraries has been developed. This protocol combines a previously described adaptor-based protocol with the uracil DNA glycosylase (UDG)-cloning procedure. The efficiency of this procedure has been found to be up to 100-fold higher than that of previously used protocols. In addition, the novel protocol is more reliable and thus easy to establish in a laboratory. The method has also been adapted for the simultaneous shotgun sequencing of multiple short fragments by concentrating them before library construction. This protocol is suitable for rapid characterization of cDNA clones. A library was constructed from 15 PCR-amplified and concentrated human cDNA inserts; the insert sequences could easily be identified as separate contigs during the assembly process, and the sequence coverage was even along each fragment. Using this strategy, the fine structures of the FRAXA and IDS loci have been revealed and several EST homologies indicating novel expressed sequences have been identified. Use of PCR to close repetitive regions that are difficult to clone was tested by determination of the sequence of a cosmid mapping DXS455 in Xq28, containing a polymorphic VNTR. The region containing the VNTR was not represented in the shotgun library, but by designing PCR primers in the sequences flanking the gap and by cloning and sequencing the PCR product, the fine structure of the VNTR has been determined. It was found to be an AT-rich VNTR with a repeated 25-mer at the center.

  20. Virtual fragment preparation for computational fragment-based drug design.

    Science.gov (United States)

    Ludington, Jennifer L

    2015-01-01

    Fragment-based drug design (FBDD) has become an important component of the drug discovery process. The use of fragments can accelerate both the search for a hit molecule and the development of that hit into a lead molecule for clinical testing. In addition to experimental methodologies for FBDD such as NMR and X-ray Crystallography screens, computational techniques are playing an increasingly important role. The success of the computational simulations is due in large part to how the database of virtual fragments is prepared. In order to prepare the fragments appropriately it is necessary to understand how FBDD differs from other approaches and the issues inherent in building up molecules from smaller fragment pieces. The ultimate goal of these calculations is to link two or more simulated fragments into a molecule that has an experimental binding affinity consistent with the additive predicted binding affinities of the virtual fragments. Computationally predicting binding affinities is a complex process, with many opportunities for introducing error. Therefore, care should be taken with the fragment preparation procedure to avoid introducing additional inaccuracies. This chapter is focused on the preparation process used to create a virtual fragment database. Several key issues of fragment preparation which affect the accuracy of binding affinity predictions are discussed. The first issue is the selection of the two-dimensional atomic structure of the virtual fragment. Although the particular usage of the fragment can affect this choice (i.e., whether the fragment will be used for calibration, binding site characterization, hit identification, or lead optimization), general factors such as synthetic accessibility, size, and flexibility are major considerations in selecting the 2D structure. Other aspects of preparing the virtual fragments for simulation are the generation of three-dimensional conformations and the assignment of the associated atomic point charges.

  1. Metagenome Fragment Classification Using N-Mer Frequency Profiles

    Directory of Open Access Journals (Sweden)

    Gail Rosen

    2008-01-01

    A vast amount of microbial sequencing data is being generated through large-scale projects in ecology, agriculture, and human health. Efficient high-throughput methods are needed to analyze the massive amounts of metagenomic data (all DNA present in an environmental sample). A major obstacle in metagenomics is the inability to achieve accuracy using technology that yields short reads. We construct the unique N-mer frequency profiles of 635 microbial genomes publicly available as of February 2008. These profiles are used to train a naive Bayes classifier (NBC) that can be used to identify the genome of any fragment. We show that our method is comparable to BLAST for small 25 bp fragments but does not have the ambiguity of BLAST's tied top scores. We demonstrate that this approach is scalable to identify any fragment from hundreds of genomes. It also performs quite well at the strain, species, and genera levels and achieves strain resolution despite classifying ubiquitous genomic fragments (gene and nongene regions). Cross-validation analysis demonstrates that species-level accuracy reaches 90% for highly represented species containing an average of 8 strains. We demonstrate that such a tool can be used on the Sargasso Sea dataset, and our analysis shows that NBC can be further enhanced.
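
The classification idea described above can be sketched in a few lines: count the N-mers of each reference genome, then assign a fragment to the genome under which its N-mers are most probable. This is only an illustrative sketch, not the authors' implementation. The function names, the toy sequences, the add-one smoothing, the uniform prior over genomes, and the restriction to the A/C/G/T alphabet are all assumptions made here.

```python
import math
from collections import Counter


def nmer_counts(seq, n):
    """Count all overlapping N-mers in a sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def train_profiles(genomes, n):
    """Build an N-mer frequency profile (counts, total) for each genome."""
    profiles = {}
    for name, seq in genomes.items():
        counts = nmer_counts(seq, n)
        profiles[name] = (counts, sum(counts.values()))
    return profiles


def classify(fragment, profiles, n):
    """Return the genome name with the highest naive Bayes log-likelihood.

    Assumes a uniform prior over genomes and add-one smoothing over the
    4**n possible N-mers (both are choices made for this sketch).
    """
    vocabulary = 4 ** n
    fragment_counts = nmer_counts(fragment, n)
    best_name, best_score = None, -math.inf
    for name, (counts, total) in profiles.items():
        score = sum(
            k * math.log((counts[nmer] + 1) / (total + vocabulary))
            for nmer, k in fragment_counts.items()
        )
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Toy reference "genomes", for illustration only.
toy_genomes = {
    "at_rich": "ATATATATTATAATATATAT",
    "gc_rich": "GCGCGGCGCCGCGCGGCGCG",
}
toy_profiles = train_profiles(toy_genomes, 3)
print(classify("ATATTATA", toy_profiles, 3))
```

Unlike a similarity search that can return several tied top hits, this scoring yields one log-likelihood per genome, so ties are unlikely in practice.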

  2. Optimal production planning for PCB assembly

    CERN Document Server

    Ho, William

    2006-01-01

    Focuses on the optimization of the Printed circuit board (PCB) assembly lines' efficiency. This book integrates the component sequencing and the feeder arrangement problems together for the pick-and-place machine and the chip shooter machines.

  3. Assembling draft genomes using contiBAIT

    OpenAIRE

    O'Neill, Kieran; Hills, Mark; Gottlieb, Mike; Borkowski, Matthew; Karsan, Aly; Lansdorp, Peter M.

    2017-01-01

    A Summary: Massively parallel sequencing is now widely used, but data interpretation is only as good as the reference assembly to which it is aligned. While the number of reference assemblies has rapidly expanded, most of these remain at intermediate stages of completion, either as scaffold builds, or as chromosome builds (consisting of correctly ordered, but not necessarily correctly oriented scaffolds separated by gaps). Completion of de novo assemblies remains difficult, as regions that ar...

  4. Scaling and universality in binary fragmenting with inhibition

    International Nuclear Information System (INIS)

    Ploszajczak, M.; Botet, R.

    1994-01-01

    We investigate a new model of binary fragmentation with inhibition, driven by white noise. In a broad range of fragmentation probabilities, the power-law spatio-temporal correlations are found to arise due to self-organized criticality (SOC). We find in the SOC phase a non-trivial power spectrum of the temporal sequence of the fragmentation events. The 1/f behaviour is recovered in the irreversible, near-equilibrium part of this phase. (authors). 13 refs., 3 figs., 1 tab.

  5. Scaling and universality in binary fragmenting with inhibition

    Energy Technology Data Exchange (ETDEWEB)

    Ploszajczak, M [Grand Accelerateur National d'Ions Lourds (GANIL), 14 - Caen (France)]; Botet, R [Paris-11 Univ., 91 - Orsay (France). Lab. de Physique des Solides]

    1994-12-31

    We investigate a new model of binary fragmentation with inhibition, driven by white noise. In a broad range of fragmentation probabilities, the power-law spatio-temporal correlations are found to arise due to self-organized criticality (SOC). We find in the SOC phase a non-trivial power spectrum of the temporal sequence of the fragmentation events. The 1/f behaviour is recovered in the irreversible, near-equilibrium part of this phase. (authors). 13 refs., 3 figs., 1 tab.

  6. Fuel assembly

    International Nuclear Information System (INIS)

    Yokota, Tokunobu.

    1990-01-01

    A fuel assembly used in a FBR type nuclear reactor comprises a plurality of fuel rods and a moderator guide member (water rod). A moderator exit opening/closing mechanism is formed at the upper portion of the moderator guide member for opening and closing a moderator exit. In the initial fuel charging operation cycle to the reactor, the moderator exit is closed by the moderator exit opening/closing mechanism. Then, voids are accumulated at the inner upper portion of the moderator guide member to harden spectrum and a great amount of plutonium is generated and accumulated in the fuel assembly. Further, in the fuel re-charging operation cycle, the moderator guide member is used having the moderator exit opened. In this case, voids are discharged from the moderator guide member to decrease the ratio, and the plutonium accumulated in the initial charging operation cycle is burnt. In this way, the fuel economy can be improved. (I.N.)

  7. Fuel assemblies

    International Nuclear Information System (INIS)

    Echigoya, Hironori; Nomata, Terumitsu.

    1983-01-01

    Purpose: To render the axial distribution relatively flat. Constitution: The first fuel element comprises a fuel can made of zircaloy, i.e., a metal with low neutron absorption, which is filled with a plurality of UO2 pellets and sealed by welding with a lower end plug, a plenum spring and an upper end plug. The second fuel element is formed by substituting a part of the UO2 pellets with a water tube which is sealed with water and has a space for allowing thermal expansion. The nuclear fuel assembly is constituted by using the first and second fuel elements together. In such a structure, since water reflects neutrons and decreases their leakage to increase the temperature, reactivity is added at the upper portion of the fuel assembly to thereby flatten the axial power distribution. Accordingly, stable operation is possible using only deep control rods, with no need for shallow control rods. (Sekiya, K.)

  8. Fuel assembly

    International Nuclear Information System (INIS)

    Kawai, Mitsuo.

    1988-01-01

    Purpose: To reduce the corrosion rate and suppress the increase of radioactive corrosion products in reactor water for nuclear fuel assemblies used in BWR type reactors having spacer springs made of nickel based deposition reinforced type alloys. Constitution: Spacer springs made of nickel based deposition reinforced type alloy are incorporated and used in fuel assemblies after a treatment of dipping and holding in high temperature water followed by heating in steam. Since this can remove the nickel that would leach into reactor water at the initial stage, Co-58 as a radioactive corrosion product in the reactor water can be reduced, and in-service inspection and repair operations can be facilitated to improve the working efficiency of the nuclear power plant. The dipping time is desirably more than 10 hours and more desirably more than 30 hours. (Horiuchi, T.)

  9. Fuel assembly

    International Nuclear Information System (INIS)

    Watanabe, Shoichi; Hirano, Yasushi.

    1998-01-01

    One-half or more of all fuel rods in a fuel assembly are MOX fuel rods containing less than 1 wt% of burnable poisons, and at least a portion of the burnable poisons comprises gadolinium. Surplus reactivity at the initial stage of the operation cycle is thereby controlled so that no burnable poisons remain unburnt at the final stage, and thermal reactivity is increased. In addition, the content of fissile plutonium is set greater than the content of uranium 235, and fuel rods at corner portions do not incorporate burnable poisons. Fuel rods not containing burnable poisons are disposed at positions adjacent to fuel rods facing a water rod in one or two directions. Local power at the radial center of the fuel assembly is increased to flatten the distortion of the radial power distribution. (N.H.)

  10. An Archeology of Fragments

    Directory of Open Access Journals (Sweden)

    Gerald L. Bruns

    2014-10-01

    This is a short (fragmentary) history of fragmentary writing from the German Romantics (F. W. Schlegel, Friedrich Hölderlin) to modern and contemporary concrete or visual poetry. Such writing is (often deliberately) a critique of the logic of subsumption that tries to assimilate whatever is singular and irreducible into totalities of various categorical or systematic sorts. Arguably, the fragment (parataxis) is the distinctive feature of literary Modernism, which is a rejection, not of what precedes it, but of what Max Weber called “the rationalization of the world” (or Modernity), whose aim is to keep everything, including all that is written, under surveillance and control.

  11. OCCURRENCE OF SMALL HOMOLOGOUS AND COMPLEMENTARY FRAGMENTS IN HUMAN VIRUS GENOMES AND THEIR POSSIBLE ROLE

    Directory of Open Access Journals (Sweden)

    E. P. Kharchenko

    2017-01-01

    Using computer analysis, the occurrence of small homologous and complementary fragments (21 nucleotides in length) has been studied in the genomes of 14 human viruses causing the most dangerous infections. The sample includes viruses with (+) and (–) single-stranded RNA and the DNA-containing hepatitis B virus. Analysis of the occurrence of homologous sequences has shown the existence of two extreme situations. On the one hand, the same virus contains sequences homologous to almost all other viruses (for example, Ebola virus, severe acute respiratory syndrome-related coronavirus, and mumps virus), and numerous sequences homologous to the same other virus (especially in severe acute respiratory syndrome-related coronavirus to Dengue virus and in Ebola virus to poliovirus). On the other hand, homologous sequences are rare and few in the genomes of other viruses (rubella virus, hepatitis A virus, and hepatitis B virus). A similar situation exists for the occurrence of complementary sequences. Rubella virus, whose genome has a high content of guanine and cytosine, has no sequences complementary to almost all other viruses. Most viruses have a moderate level of occurrence of homologous and complementary sequences. Autocomplementary sequences are numerous in most viruses, and one may suggest that the genome of single-stranded RNA viruses has a branched secondary structure. In addition to a possible role in recombination among strains, autocomplementary sequences could be regulators of the translation rate of virus proteins and determine their optimal proportion in virion assembly with genome and mRNA folding. The occurrence of small homologous and complementary sequences in RNA- and DNA-containing viruses may be the result of multiple recombinations in the past and the present and may determine their adaptation and variability. Recombination may take place during coinfection of human and/or common hosts. Inclusion of homologous and complementary sequences into genome could not
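
A search of the kind described in this record can be sketched by comparing the sets of fixed-length fragments of two genomes. The sketch below is an illustration under stated assumptions, not the author's method: it treats "homologous" as an exact shared 21-mer, treats "complementary" as a 21-mer whose reverse complement occurs in the other genome, and expects sequences written in the DNA alphabet (RNA genomes would need U replaced by T first). All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def fragments(seq, k):
    """Return the set of all overlapping fragments of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_fragments(genome_a, genome_b, k=21):
    """Find fragments of genome_a that are matched in genome_b.

    Returns (homologous, complementary): fragments of genome_a that occur
    verbatim in genome_b, and fragments of genome_a whose reverse
    complement occurs in genome_b.
    """
    frags_a = fragments(genome_a, k)
    frags_b = fragments(genome_b, k)
    homologous = frags_a & frags_b
    complementary = {f for f in frags_a if reverse_complement(f) in frags_b}
    return homologous, complementary


# Toy sequences with k=6, for illustration only.
homologous, complementary = shared_fragments("AAACCC", "GGGTTT", k=6)
print(sorted(homologous), sorted(complementary))
```

Calling `shared_fragments(genome, genome)` on a single sequence gives a simple way to look for autocomplementary fragments of the kind the abstract mentions.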

  12. General Assembly

    CERN Multimedia

    Staff Association

    2016-01-01

    5th April, 2016 – Ordinary General Assembly of the Staff Association! In the first semester of each year, the Staff Association (SA) invites its members to attend and participate in the Ordinary General Assembly (OGA). This year the OGA will be held on Tuesday, April 5th 2016 from 11:00 to 12:00 in BE Auditorium, Meyrin (6-2-024). During the Ordinary General Assembly, the activity and financial reports of the SA are presented and submitted for approval to the members. This is the occasion to get a global view on the activities of the SA, its financial management, and an opportunity to express one’s opinion, including taking part in the votes. Other points are listed on the agenda, as proposed by the Staff Council. Who can vote? Only “ordinary” members (MPE) of the SA can vote. Associated members (MPA) of the SA and/or affiliated pensioners have a right to vote on those topics that are of direct interest to them. Who can give his/her opinion? The Ordinary General Asse...

  13. Fuel assembly

    International Nuclear Information System (INIS)

    Ueda, Sei; Ando, Ryohei; Mitsutake, Toru.

    1995-01-01

    The present invention concerns a fuel assembly suitable to a BWR-type reactor and improved especially with the nuclear characteristic, heat performance, hydraulic performance, dismantling or assembling performance and economical property. A part of poison rods are formed as a large-diameter/multi-region poison rods having a larger diameter than a fuel rod. A large number of fuel rods are disposed surrounding a large diameter water rod and a group of the large-diameter/multi-region poison rods in adjacent with the water rod. The large-diameter water rod has a burnable poison at the tube wall portion. At least a portion of the large-diameter poison rods has a coolant circulation portion allowing coolants to circulate therethrough. Since the large-diameter poison rods are disposed at a position of high neutron fluxes, a large neutron multiplication factor suppression effect can be provided, thereby enabling to reduce the number of burnable poison rods relative to fuels. As a result, power peaking in the fuel assembly is moderated and a greater amount of plutonium can be loaded. In addition the flow of cooling water which tends to gather around the large diameter water rod can be controlled to improve cooling performance of fuels. (N.H.)

  14. Fragmentation processes in nuclear reactions

    International Nuclear Information System (INIS)

    Legrain, R.

    1984-08-01

    Projectile and nuclear fragmentation are defined and the processes referred to are recalled. The two different aspects of fragmentation are considered, but the emphasis is put on heavy-ion-induced reactions. The preliminary results of an experiment performed at GANIL to study peripheral heavy-ion-induced reactions at intermediate energy are presented. The results of this experiment illustrate the characteristics of projectile fragmentation and also give the opportunity to study projectile fragmentation in the transition region. Nuclear fragmentation, which is associated with more central collisions in the case of heavy-ion-induced reactions, is then considered. This aspect of fragmentation is also illustrated with two heavy-ion experiments in which fragments emitted at large angles have been observed.

  15. Whole Genome Sequencing of Enterovirus species C Isolates by High-throughput Sequencing: Development of Generic Primers

    Directory of Open Access Journals (Sweden)

    Maël Bessaud

    2016-08-01

    Enteroviruses are among the most common viruses infecting humans and can cause diverse clinical syndromes ranging from minor febrile illness to severe and potentially fatal diseases. Enterovirus species C (EV-C) consists of more than 20 types, among which the 3 serotypes of polioviruses, the etiological agents of poliomyelitis, are included. Biodiversity and evolution of EV-C genomes are shaped by frequent recombination events. Therefore, identification and characterization of circulating EV-C strains require the sequencing of different genomic regions. A simple method was developed to quickly sequence the entire genome of EV-C isolates. Four overlapping fragments were produced separately by RT-PCR performed with generic primers. The four amplicons were then pooled and purified prior to being sequenced by a high-throughput technique. The method was assessed on a panel of EV-Cs belonging to a wide range of types. It can be used to determine full-length genome sequences through de novo assembly of thousands of reads. It was also able to discriminate reads from closely related viruses in mixtures. By decreasing the workload compared to classical Sanger-based techniques, this method will serve as a valuable tool for sequencing large panels of EV-Cs isolated in cell cultures during environmental surveillance or from patients, including vaccine-derived polioviruses.

  16. Fragmentation of random trees

    International Nuclear Information System (INIS)

    Kalay, Z; Ben-Naim, E

    2015-01-01

    We study fragmentation of a random recursive tree into a forest by repeated removal of nodes. The initial tree consists of N nodes and is generated by sequential addition of nodes, with each new node attaching to a randomly selected existing node. As nodes are removed from the tree, one at a time, the tree dissolves into an ensemble of separate trees, namely, a forest. We study statistical properties of trees and nodes in this heterogeneous forest, and find that the fraction of remaining nodes m characterizes the system in the limit $N \to \infty$. We obtain analytically the size density $\phi_s$ of trees of size s. The size density has a power-law tail $\phi_s \sim s^{-\alpha}$ with exponent $\alpha = 1 + 1/m$. Therefore, the tail becomes steeper as further nodes are removed, and the fragmentation process is unusual in that the exponent $\alpha$ increases continuously with time. We also extend our analysis to the case where nodes are added as well as removed, and obtain the asymptotic size density for growing trees. (paper)
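
The process described in this record is easy to simulate, which gives a way to inspect the tree-size distribution numerically. The sketch below is a hypothetical illustration, not the authors' code: it grows a random recursive tree, keeps a uniformly random fraction m of the nodes (assumed here to be equivalent to removing nodes one at a time uniformly at random), and returns the sizes of the surviving trees. The function names and parameters are assumptions.

```python
import random
from collections import Counter


def random_recursive_tree(n, rng):
    """Return parent[], where node v > 0 attaches to a uniform earlier node."""
    parent = [-1] * n  # node 0 is the root and has no parent
    for v in range(1, n):
        parent[v] = rng.randrange(v)
    return parent


def fragment_sizes(parent, m, rng):
    """Keep a random fraction m of nodes and return the tree sizes left.

    A surviving node stays attached to its parent only if the parent also
    survives; otherwise it becomes the root of a new tree in the forest.
    """
    n = len(parent)
    alive = [False] * n
    for v in rng.sample(range(n), round(m * n)):
        alive[v] = True
    root = [None] * n
    sizes = Counter()
    # Parents always have smaller indices, so one increasing pass suffices.
    for v in range(n):
        if not alive[v]:
            continue
        p = parent[v]
        root[v] = root[p] if p >= 0 and alive[p] else v
        sizes[root[v]] += 1
    return sorted(sizes.values(), reverse=True)


rng = random.Random(1)
tree = random_recursive_tree(10_000, rng)
sizes = fragment_sizes(tree, 0.5, rng)
print(len(sizes), sizes[:5])
```

Histogramming `sizes` over many runs for several values of m is one way to compare the simulated tail against the exponent stated in the abstract; at the sizes used here the sample is far too small for a reliable fit.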

  17. Comparing de novo assemblers for 454 transcriptome data.

    Science.gov (United States)

    Kumar, Sujai; Blaxter, Mark L

    2010-10-16

    Roche 454 pyrosequencing has become a method of choice for generating transcriptome data from non-model organisms. Once the tens to hundreds of thousands of short (250-450 base) reads have been produced, it is important to correctly assemble these to estimate the sequence of all the transcripts. Most transcriptome assembly projects use only one program for assembling 454 pyrosequencing reads, but there is no evidence that the programs used to date are optimal. We have carried out a systematic comparison of five assemblers (CAP3, MIRA, Newbler, SeqMan and CLC) to establish best practices for transcriptome assemblies, using a new dataset from the parasitic nematode Litomosoides sigmodontis. Although no single assembler performed best on all our criteria, Newbler 2.5 gave longer contigs, better alignments to some reference sequences, and was fast and easy to use. SeqMan assemblies performed best on the criterion of recapitulating known transcripts, and had more novel sequence than the other assemblers, but generated an excess of small, redundant contigs. The remaining assemblers all performed almost as well, with the exception of Newbler 2.3 (the version currently used by most assembly projects), which generated assemblies that had significantly lower total length. As different assemblers use different underlying algorithms to generate contigs, we also explored merging of assemblies and found that the merged datasets not only aligned better to reference sequences than individual assemblies, but were also more consistent in the number and size of contigs. Transcriptome assemblies are smaller than genome assemblies and thus should be more computationally tractable, but are often harder because individual contigs can have highly variable read coverage. Comparing single assemblers, Newbler 2.5 performed best on our trial data set, but other assemblers were closely comparable. Combining differently optimal assemblies from different programs however gave a more credible

  18. Comparing de novo assemblers for 454 transcriptome data

    Directory of Open Access Journals (Sweden)

    Blaxter Mark L

    2010-10-01

    Background: Roche 454 pyrosequencing has become a method of choice for generating transcriptome data from non-model organisms. Once the tens to hundreds of thousands of short (250-450 base) reads have been produced, it is important to correctly assemble these to estimate the sequence of all the transcripts. Most transcriptome assembly projects use only one program for assembling 454 pyrosequencing reads, but there is no evidence that the programs used to date are optimal. We have carried out a systematic comparison of five assemblers (CAP3, MIRA, Newbler, SeqMan and CLC) to establish best practices for transcriptome assemblies, using a new dataset from the parasitic nematode Litomosoides sigmodontis. Results: Although no single assembler performed best on all our criteria, Newbler 2.5 gave longer contigs, better alignments to some reference sequences, and was fast and easy to use. SeqMan assemblies performed best on the criterion of recapitulating known transcripts, and had more novel sequence than the other assemblers, but generated an excess of small, redundant contigs. The remaining assemblers all performed almost as well, with the exception of Newbler 2.3 (the version currently used by most assembly projects), which generated assemblies that had significantly lower total length. As different assemblers use different underlying algorithms to generate contigs, we also explored merging of assemblies and found that the merged datasets not only aligned better to reference sequences than individual assemblies, but were also more consistent in the number and size of contigs. Conclusions: Transcriptome assemblies are smaller than genome assemblies and thus should be more computationally tractable, but are often harder because individual contigs can have highly variable read coverage. Comparing single assemblers, Newbler 2.5 performed best on our trial data set, but other assemblers were closely comparable. Combining differently optimal assemblies

  19. Optimizing Transcriptome Assemblies for Eleusine indica Leaf and Seedling by Combining Multiple Assemblies from Three De Novo Assemblers

    Directory of Open Access Journals (Sweden)

    Shu Chen

    2015-03-01

    Due to rapid advances in sequencing technology, increasing amounts of genomic and transcriptomic data are available for plant species, presenting enormous challenges for biocomputing analysis. A crucial first step for a successful transcriptomics-based study is the building of a high-quality assembly. Here, we utilized three different de novo assemblers (Trinity, Velvet, and CLC) and the EvidentialGene pipeline tr2aacds to assemble two optimized transcript sets for the notorious weed species Eleusine indica. Two RNA sequencing (RNA-seq) datasets from leaf and aboveground seedlings were processed using three assemblers, which resulted in 20 assemblies for each dataset. The contig numbers and N50 values of each assembly were compared to study the effect of read number, k-mer size, and in silico normalization on assembly output. The 20 assemblies were then processed through the tr2aacds pipeline to remove redundant transcripts and to select the transcript set with the best coding potential. Each assembly contributed a considerable proportion to the final transcript combination, with the exception of CLC-k14. Thus each assembler and parameter set did assemble better contigs for certain transcripts. The redundancy, total contig number, N50, fully assembled contig number, and transcripts related to target-site herbicide resistance were evaluated for the EvidentialGene and Trinity assemblies. Comparing the EvidentialGene set with the Trinity assembly revealed improved quality and reduced redundancy in both leaf and seedling EvidentialGene sets. The optimized transcriptome references will be useful for studying herbicide resistance in E. indica and the evolutionary process in the three allotetraploid offspring.

  20. Fragment-based lead generation: identification of seed fragments by a highly efficient fragment screening technology

    Science.gov (United States)

    Neumann, Lars; Ritscher, Allegra; Müller, Gerhard; Hafenbradl, Doris

    2009-08-01

    For the detection of the precise and unambiguous binding of fragments to a specific binding site on the target protein, we have developed a novel reporter displacement binding assay technology. The application of this technology for the fragment screening as well as the fragment evolution process with a specific modelling based design strategy is demonstrated for inhibitors of the protein kinase p38alpha. In a fragment screening approach seed fragments were identified which were then used to build compounds from the deep-pocket towards the hinge binding area of the protein kinase p38alpha based on a modelling approach. BIRB796 was used as a blueprint for the alignment of the fragments. The fragment evolution of these deep-pocket binding fragments towards the fully optimized inhibitor BIRB796 included the modulation of the residence time as well as the affinity. The goal of our study was to evaluate the robustness and efficiency of our novel fragment screening technology at high fragment concentrations, compare the screening data with biochemical activity data and to demonstrate the evolution of the hit fragments with fast kinetics, into slow kinetic inhibitors in an in silico approach.

  1. BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data

    Science.gov (United States)

    Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Pareja, Eduardo; Tobes, Raquel

    2012-01-01

    BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even at the preliminary assembly stage of the genome. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating ORF prediction and functional annotation phases in just one step. BG7 is especially tolerant to sequencing errors in start and stop codons, to frameshifts, and to assembly or scaffolding errors. The system is also tolerant to the high level of gene fragmentation which is frequently found in not fully assembled genomes. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes that are being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC Germany outbreak, in which BG7 was used to produce the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been demonstrated for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. Besides, thanks to its plasticity, our system could be very easily adapted to work with new technologies in the future. PMID:23185310

  2. BG7: a new approach for bacterial genome annotation designed for next generation sequencing data.

    Directory of Open Access Journals (Sweden)

    Pablo Pareja-Tobes

    BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even at the preliminary assembly stage of the genome. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating the ORF prediction and functional annotation phases in a single step. BG7 is especially tolerant of sequencing errors in start and stop codons, frameshifts, and assembly or scaffolding errors. The system is also tolerant of the high level of gene fragmentation that is frequently found in incompletely assembled genomes. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC outbreak in Germany, in which BG7 was used to obtain the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been demonstrated for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. Moreover, thanks to its plasticity, the system could be easily adapted to work with new technologies in the future.

  3. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    Energy Technology Data Exchange (ETDEWEB)

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

  4. De Novo Assembly of the Donkey White Blood Cell Transcriptome and a Comparative Analysis of Phenotype-Associated Genes between Donkeys and Horses.

    Science.gov (United States)

    Xie, Feng-Yun; Feng, Yu-Long; Wang, Hong-Hui; Ma, Yun-Feng; Yang, Yang; Wang, Yin-Chao; Shen, Wei; Pan, Qing-Jie; Yin, Shen; Sun, Yu-Jiang; Ma, Jun-Yu

    2015-01-01

    Prior to the mechanization of agriculture and labor-intensive tasks, humans used donkeys (Equus africanus asinus) for farm work and packing. However, as mechanization increased, donkeys have been increasingly raised for meat, milk, and fur in China. To maintain the development of the donkey industry, breeding programs should focus on traits related to these new uses. Compared to conventional marker-assisted breeding plans, genome- and transcriptome-based selection methods are more efficient and effective. To analyze the coding genes of the donkey genome, we assembled the transcriptome of donkey white blood cells de novo. Using transcriptomic deep-sequencing data, we identified 264,714 distinct donkey unigenes and predicted 38,949 protein fragments. We annotated the donkey unigenes by BLAST searches against the non-redundant (NR) protein database. We also compared the donkey protein sequences with those of the horse (E. caballus) and wild horse (E. przewalskii), and linked the donkey protein fragments with mammalian phenotypes. Because the outer ear sizes of donkeys and horses are obviously different, we compared the outer ear size-associated proteins in donkeys and horses. We identified three ear size-associated proteins, HIC1, PRKRA, and KMT2A, with sequence differences among the donkey, horse, and wild horse loci. Since the donkey genome sequence has not been released, the de novo assembled donkey transcriptome is helpful for preliminary investigations of donkey cultivars and for genetic improvement.

  5. De Novo Assembly of the Donkey White Blood Cell Transcriptome and a Comparative Analysis of Phenotype-Associated Genes between Donkeys and Horses.

    Directory of Open Access Journals (Sweden)

    Feng-Yun Xie

    Prior to the mechanization of agriculture and labor-intensive tasks, humans used donkeys (Equus africanus asinus) for farm work and packing. However, as mechanization increased, donkeys have been increasingly raised for meat, milk, and fur in China. To maintain the development of the donkey industry, breeding programs should focus on traits related to these new uses. Compared to conventional marker-assisted breeding plans, genome- and transcriptome-based selection methods are more efficient and effective. To analyze the coding genes of the donkey genome, we assembled the transcriptome of donkey white blood cells de novo. Using transcriptomic deep-sequencing data, we identified 264,714 distinct donkey unigenes and predicted 38,949 protein fragments. We annotated the donkey unigenes by BLAST searches against the non-redundant (NR) protein database. We also compared the donkey protein sequences with those of the horse (E. caballus) and wild horse (E. przewalskii), and linked the donkey protein fragments with mammalian phenotypes. Because the outer ear sizes of donkeys and horses are obviously different, we compared the outer ear size-associated proteins in donkeys and horses. We identified three ear size-associated proteins, HIC1, PRKRA, and KMT2A, with sequence differences among the donkey, horse, and wild horse loci. Since the donkey genome sequence has not been released, the de novo assembled donkey transcriptome is helpful for preliminary investigations of donkey cultivars and for genetic improvement.

  6. Metrology for ITER Assembly

    International Nuclear Information System (INIS)

    Bogusch, E.

    2006-01-01

    The overall dimensions of the ITER Tokamak and the particular assembly sequence preclude the use of conventional optical metrology, mechanical jigs and traditional dimensional control equipment, as used for the assembly of smaller, previous-generation fusion devices. This paper describes the state of the art of available metrology systems, with reference to previous experience in fusion engineering and in other industries. Two complementary procedures for transferring datums from the primary datum network on the bioshield to the secondary datums inside the VV with the desired accuracy of about 0.1 mm are described, one using direct access through the ports and the other using transfer techniques developed during the co-operation with ITER/EFDA. Another important task described is the development of a method for the rapid and easy measurement of the gaps between sectors, required for the production of the customised splice plates between them. The scope of the paper includes the evaluation of the composition and cost of the systems and of the team of technical staff required to meet the requirements of the assembly procedure. The results from a practical, full-scale demonstration of the methodologies, using the proposed equipment, are described. This work has demonstrated the feasibility of achieving the necessary accuracies for the successful building of ITER. (author)

  7. Fuel assembly

    International Nuclear Information System (INIS)

    Fujibayashi, Toru.

    1970-01-01

    Herein disclosed is a fuel assembly in which a fuel rod bundle is easily detachable by rotating a fuel rod fastener rotatably mounted to the upper surface of an upper tie-plate supporting a fuel bundle therebelow. A locking portion at the leading end of each fuel rod protrudes through the upper tie-plate and is engaged with or separated from the tie-plate by the rotation of the fastener. The removal of a desired fuel rod can therefore be remotely accomplished without the necessity of handling pawls, locking washers and nuts. (Owens, K.J.)

  8. Assembling consumption

    DEFF Research Database (Denmark)

    Assembling Consumption marks a definitive step in the institutionalisation of qualitative business research. By gathering leading scholars and educators who study markets, marketing and consumption through the lenses of philosophy, sociology and anthropology, this book clarifies and applies...... the investigative tools offered by assemblage theory, actor-network theory and non-representational theory. Clear theoretical explanation and methodological innovation, alongside empirical applications of these emerging frameworks will offer readers new and refreshing perspectives on consumer culture and market...... societies. This is an essential reading for both seasoned scholars and advanced students of markets, economies and social forms of consumption....

  9. Assembly of α-synuclein fibrils in nanoscale studied by peptide truncation and AFM

    International Nuclear Information System (INIS)

    Zhang Feng; Lin Xiaojing; Ji Lina; Du Haining; Tang Lin; He Jianhua; Hu Jun; Hu Hongyu

    2008-01-01

    α-Synuclein (α-Syn) fibrils are the major component of Lewy bodies that are closely associated with the pathogenesis of Parkinson's disease, but the mechanism for the fibril assembly remains poorly understood. Here we report using a combination of peptide truncation and atomic force microscopy (AFM) to elucidate the self-assembly and morphology of the α-Syn fibrils. The results show that protease K significantly slims the fibrils from the mean height of ∼6.6 to ∼4.7 nm, whereas chaotropic denaturant urea completely breaks down the fibrils into small particles. The in situ enzymatic digestion also results in thinning of the fibrils, giving rise to some nicks on the fibrils. Moreover, N- or C-terminally truncated α-Syn fragments assemble into thinner filaments with the heights depending on the peptide lengths. A nine-residue peptide corresponding to the homologous GAV-motif sequence can form very thin (∼2.2 nm) but long (>1 μm) filaments. Thus, the central sequence of α-Syn forms a fibrillar core by cross-β-structure that is flanked by two flexible termini, and the orientation of the fibril growth is perpendicular to the β-sheet structures

  10. AutoAssemblyD: a graphical user interface system for several genome assemblers.

    Science.gov (United States)

    Veras, Adonney Allan de Oliveira; de Sá, Pablo Henrique Caracciolo Gomes; Azevedo, Vasco; Silva, Artur; Ramos, Rommel Thiago Jucá

    2013-01-01

    Next-generation sequencing technologies have increased the amount of biological data generated. Thus, bioinformatics has become important because new methods and algorithms are necessary to manipulate and process such data. However, certain challenges have emerged, such as genome assembly using short reads and high-throughput platforms. In this context, several algorithms have been developed, such as Velvet, Abyss, Euler-SR, Mira, Edna, Maq, SHRiMP, Newbler, ALLPATHS, Bowtie and BWA. However, most such assemblers do not have a graphical interface, which makes their use difficult for users without computing experience given the complexity of the assembler syntax. Thus, to make the operation of such assemblers accessible to users without a computing background, we developed AutoAssemblyD, which is a graphical tool for genome assembly submission and remote management by multiple assemblers through XML templates. AssemblyD is freely available at https://sourceforge.net/projects/autoassemblyd. It requires Sun jdk 6 or higher.

  11. Fragmented medial coronoid process

    International Nuclear Information System (INIS)

    Juhasz, Cs.; Juhasz, T.

    1997-01-01

    Fragmented medial coronoid process (FCP) is often considered to be part of the osteochondrosis dissecans complex, but trauma and growth discrepancies between the radius and ulna are also proposed as causes. There is little to clinically differentiate FCP from osteochondrosis dissecans (OCD) of the elbow. Pain on flexion-extension of the elbow and lateral rotation of the paw is a little more consistent in FCP. Radiographic examination of the elbow is important despite the fact that radiographic signs of FCP are often nonspecific. Excessive osteoarthrosis and superimposition of the radial head and coronoid process make identification of the FCP difficult. Craniocaudal, flexed mediolateral and 25 degree craniocaudal-lateromedial views are necessary for diagnosis. Osteophyte production is more dramatic with FCP than with OCD and therefore suggests the occurrence of FCP in many cases. Although the detached process may be seen on any view, the oblique projection offers the least obstructed view. Exposure of the joint is identical to that for OCD, that is, a medial approach with osteotomy of the epicondyle. In most cases the process is loose enough to be readily apparent, but in some it is necessary to exert force on the process in order to find the cleavage plane. It is also necessary to remove the osteophytes and to inspect and irrigate the joint carefully to remove cartilage fragments before closure. Confinement is advisable for 4 weeks before returning the dog to normal activity. The outlook for function is good if the FCP is removed before secondary degenerative joint disease is well established.

  12. Fractal statistics of brittle fragmentation

    Directory of Open Access Journals (Sweden)

    M. Davydova

    2013-04-01

    A study of the fragmentation statistics of brittle materials that includes four types of experiments is presented. Data processing of the fragmentation of glass plates under quasi-static loading and the fragmentation of quartz cylindrical rods under dynamic loading shows that the size distribution of fragments (spatial quantity) is fractal and can be described by a power law. The original experimental technique allows us to measure, apart from the spatial quantity, the temporal quantity: the size of the time interval between the impulses of the light reflected from the newly created surfaces. The analysis of distributions of spatial (fragment size) and temporal (time interval) quantities provides evidence of obeying scaling laws, which suggests the possibility of self-organized criticality in fragmentation.

  13. Fuel assembly

    International Nuclear Information System (INIS)

    Kurihara, Kunitoshi; Azekura, Kazuo.

    1992-01-01

    In a reactor core of a heavy water moderated light water cooled pressure tube type reactor, no sufficient effects have been obtained for the transfer width to a negative side of void reactivity change in a region of a great void coefficient. Then, a moderation region divided into upper and lower two regions is disposed at the central portion of a fuel assembly. Coolants flown into the lower region can be discharged to the cooling region from an opening disposed at the upper end portion of the lower region. Light water flows from the lower region of the moderator region to the cooling region of the reactor core upper portion, to lower the void coefficient. As a result, the reactivity performance at low void coefficient, i.e., a void reaction rate is transferred to the negative side. Thus, this flattens the power distribution in the fuel assembly, increases the thermal margin and enables rapid operation and control of the reactor core, as well as contributes to the increase of fuel burnup ratio and reduction of the fuel cycle cost. (N.H.)

  14. Fuel assembly

    International Nuclear Information System (INIS)

    Chaki, Masao; Nishida, Koji; Karasawa, Hidetoshi; Kanazawa, Toru; Orii, Akihito; Nagayoshi, Takuji; Kashiwai, Shin-ichi; Masuhara, Yasuhiro

    1998-01-01

    The present invention concerns a fuel assembly, for a BWR type nuclear reactor, comprising fuel rods in 9 x 9 matrix. The inner width of the channel box is about 132mm and the length of the fuel rods which are not short fuel rods is about 4m. Two water rods having a circular cross section are arranged on a diagonal line in a portion of 3 x 3 matrix at the center of the fuel assembly, and two fuel rods are disposed at vacant spaces, and the number of fuel rods is 74. Eight fuel rods are determined as short fuel rods among 74 fuel rods. Assuming the fuel inventory in the short fuel rod as X(kg), and the fuel inventory in the fuel rods other than the short fuel rods as Y(kg), X and Y satisfy the relation: X + Y ≥ 173m, Y ≤ - 9.7X + 292, Y ≤ - 0.3X + 203 and X > 0. Then, even when the short fuel rods are used, the fuel inventory is increased and fuel economy can be improved. (I.N.)

  15. Fuel assembly

    International Nuclear Information System (INIS)

    Fushimi, Atsushi; Shimada, Hidemitsu; Aoyama, Motoo; Nakajima, Junjiro

    1998-01-01

    In a fuel assembly for an n x n lattice-like BWR type reactor, n is determined to 9 or greater, and the enrichment degree of plutonium is determined to 4.4% by weight or less. Alternatively, n is determined to 10 or greater, and the enrichment degree of plutonium is determined to 5.2% by weight or less. An average take-out burnup degree is determined to 39GWd/t or less, and the matrix is determined to 9 x 9 or more, or the average take-out burnup degree is determined to 51GWd/t, and the matrix is determined to 10 x 10 or more and the increase of the margin of the maximum power density obtained thereby is utilized for the compensation of the increase of distortion of power distribution due to decrease of the kinds of plutonium enrichment degree, thereby enabling to reduce the kind of the enrichment degree of MOX fuel rods to one. As a result, the manufacturing step for fuel pellets can be simplified to reduce the manufacturing cost for MOX fuel assemblies. (N.H.)

  16. General Assembly

    CERN Multimedia

    Staff Association

    2015-01-01

    Tuesday 5 May at 11:00, Room 13-2-005. In accordance with the statutes of the Staff Association, an ordinary General Assembly is held once a year (article IV.2.1). Draft agenda: 1- Adoption of the agenda. 2- Approval of the minutes of the ordinary General Assembly of 22 May 2014. 3- Presentation and approval of the 2014 activity report. 4- Presentation and approval of the 2014 financial report. 5- Presentation and approval of the auditors' report for 2014. 6- Programme for 2015. 7- Presentation and approval of the draft budget for 2015 and the membership fee rate for 2015. 8- No modifications to the Statutes of the Staff Association proposed. 9- Election of the members of the Commission é...

  17. General Assembly

    CERN Multimedia

    Staff Association

    2017-01-01

    In accordance with the statutes of the Staff Association, an ordinary General Assembly is held once a year (article IV.2.1). Draft agenda: Adoption of the agenda. Approval of the minutes of the ordinary General Assembly of 5 April 2016. Presentation and approval of the 2016 activity report. Presentation and approval of the 2016 financial report. Presentation and approval of the auditors' report for 2016. Work programme for 2017. Presentation and approval of the draft budget for 2017. Approval of the membership fee rate for 2018. Proposed modifications to the Statutes of the Staff Association. Election of the members of the Electoral Commission. Election of the vérifica...

  18. General Assembly

    CERN Multimedia

    Staff Association

    2016-01-01

    Tuesday 5 April at 11:00, BE Auditorium Meyrin (6-2-024). In accordance with the statutes of the Staff Association, an ordinary General Assembly is held once a year (article IV.2.1). Draft agenda: Adoption of the agenda. Approval of the minutes of the ordinary General Assembly of 5 May 2015. Presentation and approval of the 2015 activity report. Presentation and approval of the 2015 financial report. Presentation and approval of the auditors' report for 2015. Work programme for 2016. Presentation and approval of the draft budget for 2016. Approval of the membership fee rate for 2017. Proposed modifications to the Statutes of the Staff Association. Election of the members of the Commissio...

  19. General assembly

    CERN Multimedia

    Staff Association

    2015-01-01

    Tuesday 5 May at 11:00, Room 13-2-005. In accordance with the statutes of the Staff Association, an ordinary General Assembly is held once a year (article IV.2.1). Draft agenda: Adoption of the agenda. Approval of the minutes of the ordinary General Assembly of 22 May 2014. Presentation and approval of the 2014 activity report. Presentation and approval of the 2014 financial report. Presentation and approval of the auditors' report for 2014. Programme for 2015. Presentation and approval of the draft budget for 2015 and the membership fee rate for 2015. No modifications to the Statutes of the Staff Association proposed. Election of the members of the Electoral Commission. ...

  20. Fuel assembly

    International Nuclear Information System (INIS)

    Nomata, Terumitsu.

    1993-01-01

    Among the fuel pellets to be loaded into the fuel cans of a fuel assembly, pellets having a small thermal power are charged in a region extending from the end of each spacer to about 50 mm upstream with respect to the coolant that flows vertically along the periphery of the fuel rods. Coolant at the periphery of the fuel rods is heated by the heat generation, resulting in voids. The cooling effect upstream of the spacers is low because of the influence of the spacers; however, since the fuel pellets disposed in the upstream region have a small thermal power, the void coefficient is not increased. Even if a thermal power exceeding the cooling performance should be generated, there is no worry of causing burnout in the upstream region. Even if burnout should be caused, the safety margin and reliability relative to burnout are improved, which increases the allowable thermal power and thereby improves the integrity and reliability of fuel rods and fuel assemblies. (N.H.)

  1. Fluctuations in the fragmentation process

    International Nuclear Information System (INIS)

    Botet, R.; Ploszajczak, M.

    1993-01-01

    A general framework for sequential fragmentation is presented, as provided by the newly proposed Fragmentation - Inactivation - Binary model, and its basic and universal features are briefly studied. This model includes most of the previous kinetic fragmentation models as particular cases. In particular, it is discussed how the critical behaviour, called the shattering transition, arises in this framework. The model is then compared to recent data on gold multifragmentation at 600 MeV/nucl. (authors) 20 refs., 5 figs

  2. MRI of displaced meniscal fragments

    International Nuclear Information System (INIS)

    Dunoski, Brian; Zbojniewicz, Andrew M.; Laor, Tal

    2012-01-01

    A torn meniscus frequently requires surgical fixation or debridement as definitive treatment. Meniscal tears with associated fragment displacement, such as bucket handle and flap tears, can be difficult to recognize and accurately describe on MRI, and displaced fragments can be challenging to identify at surgery. A displaced meniscal fragment can be obscured by synovium or be in a location not usually evaluated at arthroscopy. We present a pictorial essay of meniscal tears with displaced fragments in patients referred to a pediatric hospital in order to increase recognition and accurate interpretation by the radiologist, who in turn can help assist the surgeon in planning appropriate therapy. (orig.)

  3. MRI of displaced meniscal fragments

    Energy Technology Data Exchange (ETDEWEB)

    Dunoski, Brian [University of Cincinnati College of Medicine, Department of Radiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH (United States); Children's Hospital of Michigan, Department of Radiology, Detroit, MI (United States); Zbojniewicz, Andrew M.; Laor, Tal [University of Cincinnati College of Medicine, Department of Radiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH (United States)

    2012-01-15

    A torn meniscus frequently requires surgical fixation or debridement as definitive treatment. Meniscal tears with associated fragment displacement, such as bucket handle and flap tears, can be difficult to recognize and accurately describe on MRI, and displaced fragments can be challenging to identify at surgery. A displaced meniscal fragment can be obscured by synovium or be in a location not usually evaluated at arthroscopy. We present a pictorial essay of meniscal tears with displaced fragments in patients referred to a pediatric hospital in order to increase recognition and accurate interpretation by the radiologist, who in turn can help assist the surgeon in planning appropriate therapy. (orig.)

  4. Modular assembly of transposable element arrays by microsatellite targeting in the guayule and rice genomes.

    Science.gov (United States)

    Valdes Franco, José A; Wang, Yi; Huo, Naxin; Ponciano, Grisel; Colvin, Howard A; McMahan, Colleen M; Gu, Yong Q; Belknap, William R

    2018-04-19

    Guayule (Parthenium argentatum A. Gray) is a rubber-producing desert shrub native to Mexico and the United States. Guayule represents an alternative to Hevea brasiliensis as a source for commercial natural rubber. The efficient application of modern molecular/genetic tools to guayule improvement requires characterization of its genome. The 1.6 Gb guayule genome was sequenced, assembled and annotated. The final 1.5 Gb assembly, while fragmented (N50 = 22 kb), maps >95% of the shotgun reads and is essentially complete. Approximately 40,000 transcribed, protein encoding genes were annotated on the assembly. Further characterization of this genome revealed 15 families of small, microsatellite-associated, transposable elements (TEs) with unexpected chromosomal distribution profiles. These SaTar (Satellite Targeted) elements, which are non-autonomous Mu-like elements (MULEs), were frequently observed in multimeric linear arrays of unrelated individual elements within which no individual element is interrupted by another. This uniformly non-nested TE multimer architecture has not been previously described in either eukaryotic or prokaryotic genomes. Five families of similarly distributed non-autonomous MULEs (microsatellite associated, modularly assembled) were characterized in the rice genome. Families of TEs with similar structures and distribution profiles were identified in sorghum and citrus. The sequencing and assembly of the guayule genome provides a foundation for application of current crop improvement technologies to this plant. In addition, characterization of this genome revealed SaTar elements with distribution profiles unique among TEs. SaTar targeting appears based on an alternative MULE recombination mechanism with the potential to impact gene evolution.
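
    The N50 statistic quoted in the abstract above is a standard contiguity measure: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the guayule study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Hypothetical helper for illustration: contigs are sorted from longest
    to shortest and accumulated until half of the total assembly length
    is reached; the length of the contig that crosses that point is N50.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (made up, in arbitrary units).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

    A low N50 relative to genome size, as reported for the guayule assembly, indicates that the sequence is split across many short contigs even when most reads map to it.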

  5. Hapsembler: An Assembler for Highly Polymorphic Genomes

    Science.gov (United States)

    Donmez, Nilgun; Brudno, Michael

    As whole genome sequencing has become a routine biological experiment, algorithms for assembly of whole genome shotgun data have become a topic of extensive research, with a plethora of off-the-shelf methods that can reconstruct the genomes of many organisms. Simultaneously, several recently sequenced genomes exhibit very high polymorphism rates. For these organisms genome assembly remains a challenge as most assemblers are unable to handle highly divergent haplotypes in a single individual. In this paper we describe Hapsembler, an assembler for highly polymorphic genomes, which makes use of paired reads. Our experiments show that Hapsembler produces accurate and contiguous assemblies of highly polymorphic genomes, while performing on par with the leading tools on haploid genomes. Hapsembler is available for download at http://compbio.cs.toronto.edu/hapsembler.

  6. V-GAP: Viral genome assembly pipeline

    KAUST Repository

    Nakamura, Yoji

    2015-10-22

    Next-generation sequencing technologies have allowed the rapid determination of the complete genomes of many organisms. Although shotgun sequences from large genome organisms are still difficult to reconstruct perfect contigs each of which represents a full chromosome, those from small genomes have been assembled successfully into a very small number of contigs. In this study, we show that shotgun reads from phage genomes can be reconstructed into a single contig by controlling the number of read sequences used in de novo assembly. We have developed a pipeline to assemble small viral genomes with good reliability using a resampling method from shotgun data. This pipeline, named V-GAP (Viral Genome Assembly Pipeline), will contribute to the rapid genome typing of viruses, which are highly divergent, and thus will meet the increasing need for viral genome comparisons in metagenomic studies.

  7. V-GAP: Viral genome assembly pipeline

    KAUST Repository

    Nakamura, Yoji; Yasuike, Motoshige; Nishiki, Issei; Iwasaki, Yuki; Fujiwara, Atushi; Kawato, Yasuhiko; Nakai, Toshihiro; Nagai, Satoshi; Kobayashi, Takanori; Gojobori, Takashi; Ototake, Mitsuru

    2015-01-01

    Next-generation sequencing technologies have allowed the rapid determination of the complete genomes of many organisms. Although shotgun sequences from large genome organisms are still difficult to reconstruct perfect contigs each of which represents a full chromosome, those from small genomes have been assembled successfully into a very small number of contigs. In this study, we show that shotgun reads from phage genomes can be reconstructed into a single contig by controlling the number of read sequences used in de novo assembly. We have developed a pipeline to assemble small viral genomes with good reliability using a resampling method from shotgun data. This pipeline, named V-GAP (Viral Genome Assembly Pipeline), will contribute to the rapid genome typing of viruses, which are highly divergent, and thus will meet the increasing need for viral genome comparisons in metagenomic studies.
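
    The abstract above describes controlling the number of reads used in de novo assembly through resampling. The sketch below only illustrates the general idea of drawing repeated random subsets of reads; it is not the V-GAP implementation, and the function names, the subset size, and the number of rounds are all hypothetical choices.

```python
import random


def subsample_reads(reads, n, seed=0):
    """Draw n reads without replacement (hypothetical helper).

    If n is at least the number of reads, all reads are returned.
    """
    if n >= len(reads):
        return list(reads)
    rng = random.Random(seed)  # seeded for reproducible subsets
    return rng.sample(reads, n)


def resample_rounds(reads, n, rounds, seed=0):
    """Produce several independent subsets, one per assembly attempt."""
    return [subsample_reads(reads, n, seed + i) for i in range(rounds)]


# Toy reads standing in for shotgun sequences (made up).
toy_reads = ["read%03d" % i for i in range(100)]
subsets = resample_rounds(toy_reads, n=20, rounds=3)
print([len(s) for s in subsets])
```

    Each subset would then be handed to an assembler of the user's choice, and the resulting contigs compared across rounds.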

  8. DNA-guided nanoparticle assemblies

    Science.gov (United States)

    Gang, Oleg; Nykypanchuk, Dmytro; Maye, Mathew; van der Lelie, Daniel

    2013-07-16

    In some embodiments, DNA-capped nanoparticles are used to define a degree of crystalline order in assemblies thereof. In some embodiments, thermodynamically reversible and stable body-centered cubic (bcc) structures, with particles occupying <~10% of the unit cell, are formed. Designs and pathways amenable to the crystallization of particle assemblies are identified. In some embodiments, a plasmonic crystal is provided. In some aspects, a method for controlling the properties of particle assemblages is provided. In some embodiments a catalyst is formed from nanoparticles linked by nucleic acid sequences and forming an open crystal structure with catalytically active agents attached to the crystal on its surface or in interstices.

  9. Radical probing of spliceosome assembly.

    Science.gov (United States)

    Grewal, Charnpal S; Kent, Oliver A; MacMillan, Andrew M

    2017-08-01

    Here we describe the synthesis and use of a directed hydroxyl radical probe, tethered to a pre-mRNA substrate, to map the structure of this substrate during the spliceosome assembly process. These studies indicate an early organization and proximation of conserved pre-mRNA sequences during spliceosome assembly. This methodology may be adapted to the synthesis of a wide variety of modified RNAs for use as probes of RNA structure and RNA-protein interaction. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Comparison of Direct Sequencing, Real-Time PCR-High Resolution Melt (PCR-HRM) and PCR-Restriction Fragment Length Polymorphism (PCR-RFLP) Analysis for Genotyping of Common Thiopurine Intolerant Variant Alleles NUDT15 c.415C>T and TPMT c.719A>G (TPMT*3C).

    Science.gov (United States)

    Fong, Wai-Ying; Ho, Chi-Chun; Poon, Wing-Tat

    2017-05-12

    Thiopurine intolerance and treatment-related toxicity, such as fatal myelosuppression, are related to non-function genetic variants encoding thiopurine S-methyltransferase (TPMT) and Nudix hydrolase 15 (NUDT15). Genetic testing of the common variants NUDT15:NM_018283.2:c.415C>T (Arg139Cys, dbSNP rs116855232 T allele) and TPMT: NM_000367.4:c.719A>G (TPMT*3C, dbSNP rs1142345 G allele) in East Asians including Chinese can potentially prevent treatment-related complications. Two complementary genotyping approaches, real-time PCR-high resolution melt (PCR-HRM) and PCR-restriction fragment length polymorphism (PCR-RFLP) analysis, were evaluated using conventional PCR and Sanger sequencing genotyping as the gold standard. Sixty patient samples were tested, revealing seven patients (11.7%) heterozygous for NUDT15 c.415C>T, one patient homozygous for the variant and one patient heterozygous for the TPMT*3C non-function allele. No patient was found to harbor both variants. In total, nine out of 60 (15%) patients tested had genotypic evidence of thiopurine intolerance, which may require dosage adjustment or alternative medication should they be started on azathioprine, mercaptopurine or thioguanine. The two newly developed assays were more efficient and showed complete concordance (60/60, 100%) compared to the Sanger sequencing results. Accurate and cost-effective genotyping assays by real-time PCR-HRM and PCR-RFLP for NUDT15 c.415C>T and TPMT*3C were successfully developed. Further studies may establish their roles in genotype-informed clinical decision-making in the prevention of morbidity and mortality due to thiopurine intolerance.
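
    The percentages quoted in this abstract follow directly from the reported counts. The short check below recomputes them; the dictionary keys and the helper name are illustrative labels only, and the counts are the ones stated in the abstract.

```python
def percent(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)

total_patients = 60
# Counts as reported in the abstract; key names are illustrative.
carriers = {
    "NUDT15 c.415C>T heterozygous": 7,
    "NUDT15 c.415C>T homozygous": 1,
    "TPMT*3C heterozygous": 1,
}

# No patient carried both variants, so the counts can simply be summed.
any_variant = sum(carriers.values())
het_nudt15_pct = percent(carriers["NUDT15 c.415C>T heterozygous"], total_patients)
any_variant_pct = percent(any_variant, total_patients)
```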

  11. Genotypic Characterization of Bradyrhizobium Strains Nodulating Endemic Woody Legumes of the Canary Islands by PCR-Restriction Fragment Length Polymorphism Analysis of Genes Encoding 16S rRNA (16S rDNA) and 16S-23S rDNA Intergenic Spacers, Repetitive Extragenic Palindromic PCR Genomic Fingerprinting, and Partial 16S rDNA Sequencing

    Science.gov (United States)

    Vinuesa, Pablo; Rademaker, Jan L. W.; de Bruijn, Frans J.; Werner, Dietrich

    1998-01-01

    We present a phylogenetic analysis of nine strains of symbiotic nitrogen-fixing bacteria isolated from nodules of tagasaste (Chamaecytisus proliferus) and other endemic woody legumes of the Canary Islands, Spain. These and several reference strains were characterized genotypically at different levels of taxonomic resolution by computer-assisted analysis of 16S ribosomal DNA (rDNA) PCR-restriction fragment length polymorphisms (PCR-RFLPs), 16S-23S rDNA intergenic spacer (IGS) RFLPs, and repetitive extragenic palindromic PCR (rep-PCR) genomic fingerprints with BOX, ERIC, and REP primers. Cluster analysis of 16S rDNA restriction patterns with four tetrameric endonucleases grouped the Canarian isolates with the two reference strains, Bradyrhizobium japonicum USDA 110spc4 and Bradyrhizobium sp. strain (Centrosema) CIAT 3101, resolving three genotypes within these bradyrhizobia. In the analysis of IGS RFLPs with three enzymes, six groups were found, whereas rep-PCR fingerprinting revealed an even greater genotypic diversity, with only two of the Canarian strains having similar fingerprints. Furthermore, we show that IGS RFLPs and even very dissimilar rep-PCR fingerprints can be clustered into phylogenetically sound groupings by combining them with 16S rDNA RFLPs in computer-assisted cluster analysis of electrophoretic patterns. The DNA sequence analysis of a highly variable 264-bp segment of the 16S rRNA genes of these strains was found to be consistent with the fingerprint-based classification. Three different DNA sequences were obtained, one of which was not previously described, and all belonged to the B. japonicum/Rhodopseudomonas rDNA cluster. Nodulation assays revealed that none of the Canarian isolates nodulated Glycine max or Leucaena leucocephala, but all nodulated Acacia pendula, C. proliferus, Macroptilium atropurpureum, and Vigna unguiculata. PMID:9603820

  12. PAVE: Program for assembling and viewing ESTs

    Directory of Open Access Journals (Sweden)

    Bomhoff Matthew

    2009-08-01

    Background: New sequencing technologies are rapidly emerging. Many laboratories are simultaneously working with the traditional Sanger ESTs and experimenting with ESTs generated by the 454 Life Science sequencers. Though Sanger ESTs have been used to generate contigs for many years, no program takes full advantage of the 5' and 3' mate-pair information, hence, many tentative transcripts are assembled into two separate contigs. The new 454 technology has the benefit of high-throughput expression profiling, but introduces time and space problems for assembling large contigs. Results: The PAVE (Program for Assembling and Viewing ESTs) assembler takes advantage of the 5' and 3' mate-pair information by requiring that the mate-pairs be assembled into the same contig and joined by n's if the two sub-contigs do not overlap. It handles the depth of 454 data sets by "burying" similar ESTs during assembly, which retains the expression level information while circumventing time and space problems. PAVE uses MegaBLAST for the clustering step and CAP3 for assembly, however it assembles incrementally to enforce the mate-pair constraint, bury ESTs, and reduce incorrect joins and splits. The PAVE data management system uses a MySQL database to store multiple libraries of ESTs along with their metadata; the management system allows multiple assemblies with variations on libraries and parameters. Analysis routines provide standard annotation for the contigs including a measure of differentially expressed genes across the libraries. A Java viewer program is provided for display and analysis of the results. Our results clearly show the benefit of using the PAVE assembler to explicitly use mate-pair information and bury ESTs for large contigs. Conclusion: The PAVE assembler provides a software package for assembling Sanger and/or 454 ESTs. The assembly software, data management software, Java viewer and user's guide are freely available.

  13. PAVE: program for assembling and viewing ESTs.

    Science.gov (United States)

    Soderlund, Carol; Johnson, Eric; Bomhoff, Matthew; Descour, Anne

    2009-08-26

    New sequencing technologies are rapidly emerging. Many laboratories are simultaneously working with the traditional Sanger ESTs and experimenting with ESTs generated by the 454 Life Science sequencers. Though Sanger ESTs have been used to generate contigs for many years, no program takes full advantage of the 5' and 3' mate-pair information, hence, many tentative transcripts are assembled into two separate contigs. The new 454 technology has the benefit of high-throughput expression profiling, but introduces time and space problems for assembling large contigs. The PAVE (Program for Assembling and Viewing ESTs) assembler takes advantage of the 5' and 3' mate-pair information by requiring that the mate-pairs be assembled into the same contig and joined by n's if the two sub-contigs do not overlap. It handles the depth of 454 data sets by "burying" similar ESTs during assembly, which retains the expression level information while circumventing time and space problems. PAVE uses MegaBLAST for the clustering step and CAP3 for assembly, however it assembles incrementally to enforce the mate-pair constraint, bury ESTs, and reduce incorrect joins and splits. The PAVE data management system uses a MySQL database to store multiple libraries of ESTs along with their metadata; the management system allows multiple assemblies with variations on libraries and parameters. Analysis routines provide standard annotation for the contigs including a measure of differentially expressed genes across the libraries. A Java viewer program is provided for display and analysis of the results. Our results clearly show the benefit of using the PAVE assembler to explicitly use mate-pair information and bury ESTs for large contigs. The PAVE assembler provides a software package for assembling Sanger and/or 454 ESTs. The assembly software, data management software, Java viewer and user's guide are freely available.
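
    The mate-pair rule described above (keep the 5' and 3' sub-contigs in one contig, joined by n's when they do not overlap) can be sketched in a few lines. This is an illustration under stated assumptions, not PAVE's code: the function name, the exact-match overlap test, the minimum overlap and the number of n's are all hypothetical choices.

```python
def join_mate_pair_contigs(five_prime, three_prime, min_overlap=20, gap_len=10):
    """Merge two sub-contigs on an exact suffix/prefix overlap.

    If no overlap of at least min_overlap bases exists, the two pieces are
    kept in one contig and separated by a run of n's.
    """
    max_k = min(len(five_prime), len(three_prime))
    # Try the longest possible overlap first.
    for k in range(max_k, min_overlap - 1, -1):
        if five_prime[-k:] == three_prime[:k]:
            return five_prime + three_prime[k:]
    return five_prime + "n" * gap_len + three_prime

overlapping = join_mate_pair_contigs("ACGTACGT", "ACGTTTTT", min_overlap=4)
padded = join_mate_pair_contigs("AAAA", "CCCC", min_overlap=2, gap_len=3)
```

    A real assembler would tolerate mismatches in the overlap; exact matching is used here only to keep the sketch short.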

  14. Fuel assembly

    International Nuclear Information System (INIS)

    Ueda, Makoto; Ogiya, Shunsuke.

    1989-01-01

    To improve the economy of a BWR type reactor by making the operation cycle longer, the fuel enrichment degree has to be increased further. However, this makes the subcriticality shallower in the upper portion of the reactor core, raising the possibility that reactor shutdown becomes impossible. In the present invention, a portion of the fuel rods is constituted as partial length fuel rods (P-fuel rods) in which the entire stack length in the effective portion is made shorter by reducing the concentration of fissionable materials in the axial portion. A plurality of moderator rods are disposed at least on one diagonal line of a fuel assembly and P-fuel rods are arranged at a position between the moderator rods. This makes reactor shutdown possible and makes the axial power distribution satisfactory even if the fuel enrichment degree is increased. (T.M.)

  15. Fuel assembly

    International Nuclear Information System (INIS)

    Bando, Masaru.

    1993-01-01

    As neutron irradiation progresses on a fuel assembly of an FBR type reactor, a strong force is exerted to cause ruptures if the arrangement of fuel elements is not displaced, whereas the fuel elements may be brought into direct contact with each other not by way of spacers to cause burning damages if the arrangement is displaced. In the present invention, the circumference of fuel elements arranged in a normal triangle lattice is surrounded by a wrapper tube having a hexagonal cross section, wire spacers are wound therearound, and deformable spacers are distributed to optional positions for fuel elements in the wrapper tube. Interaction between the fuel elements caused by irradiation is effectively absorbed, thereby enabling to delay the occurrence of the rupture and burning damages of the elements. (N.H.)

  16. Fuel assembly

    International Nuclear Information System (INIS)

    Ueda, Makoto.

    1991-01-01

    In a fuel assembly in which spectral shift type moderator guide members are arranged, the moderator guide member has a flow channel resistance member, that provides flow resistance against the moderators, in the upstream of a moderator flowing channel, by which the ratio of removing coolants is set greater at the upstream than downstream. With such a constitution, the void distribution increasing upward in the channel box except for the portion of the moderator guide member is moderated by the increase of the area of the void region that expands downward in the guide member. Accordingly, the axial power distribution is flattened throughout the operation cycle and excess distortion is eliminated to improve the fuel integrity. (T.M.)

  17. Fuel assembly

    International Nuclear Information System (INIS)

    Wataumi, Kazutoshi; Tajiri, Hiroshi.

    1992-01-01

    In a fuel assembly of a BWR type reactor, a pellet to be loaded comprises an external layer of fissile materials containing burnable poisons and an internal layer of fissile materials not containing burnable poison. For example, there is provided a dual type pellet comprising an external layer made of UO2 incorporated with Gd2O3 at a predetermined concentration as the burnable poisons and an internal layer made of UO2 not containing Gd2O3. The amount of the burnable poisons required for predetermined places is controlled by the thickness of the ring of the external layer. This can dissipate an unnecessary poisoning effect at the final stage of the combustion cycle. Further, since only one or a few kinds of powder mixture of the burnable poisons and the fissile materials is necessary, production and product control can be facilitated. (I.N.)

  18. Fuel assembly

    International Nuclear Information System (INIS)

    Ishibashi, Yoko; Aoyama, Motoo; Oyama, Jun-ichi.

    1995-01-01

    Burnable poison-incorporating fuel rods of a first group are disposed in a region adjacent to a water rod having a large diameter (neutron moderator rod) disposed in the central portion of a fuel assembly. Burnable poison-incorporating fuel rods of a second group are disposed in a region other than the peripheral zone adjacent to a channel box and the corners positioned at an inner zone adjacent to the channel box. The average concentration of burnable poisons of the burnable poison-incorporating fuel rods of the first group is made greater than that of the second group. With such a constitution, when the burnable poisons of the first group are burnt out, the burnable poisons of the second group are also burnt out at the same time. Accordingly, the amount of burnable poisons left unburnt at the final stage of the operation cycle is reduced, which improves the reactivity. This can improve the economy. (I.N.)

  19. Fuel assemblies

    International Nuclear Information System (INIS)

    Yoshioka, Ritsuo.

    1983-01-01

    Purpose: To improve the operation performance of a BWR type reactor by improving the distribution of the uranium enrichment and the incorporation amount of burnable poisons in fuel assemblies. Constitution: The average enrichment of uranium 235 is increased in the upper portion as compared with that in the lower portion, while the incorporation amount of burnable poisons is increased in an upper portion as compared with that in the lower portion. The difference in the incorporation amount of the burnable poisons between the upper and lower portions is attained by charging two kinds of fuel rods; the ones incorporated with the burnable poisons over the entire length and the others incorporated with the burnable poisons only in the upper portions. (Seki, T.)

  20. ASSEMBLY TRANSFER SYSTEM DESCRIPTION DOCUMENT

    International Nuclear Information System (INIS)

    Gorpani, B.

    2000-01-01

    The Assembly Transfer System (ATS) receives, cools, and opens rail and truck transportation casks from the Carrier/Cask Handling System (CCHS). The system unloads transportation casks consisting of bare Spent Nuclear Fuel (SNF) assemblies, single element canisters, and Dual Purpose Canisters (DPCs). For casks containing DPCs, the system opens the DPCs and unloads the SNF. The system stages the assemblies, transfers assemblies to and from fuel-blending inventory pools, loads them into Disposal Containers (DCs), temporarily seals and inerts the DC, decontaminates the DC and transfers it to the Disposal Container Handling System. The system also prepares empty casks and DPCs for off-site shipment. Two identical Assembly Transfer System lines are provided in the Waste Handling Building (WHB). Each line operates independently to handle the waste transfer throughput and to support maintenance operations. Each system line primarily consists of wet and dry handling areas. The wet handling area includes a cask transport system, cask and DPC preparation system, and a wet assembly handling system. The basket transport system forms the transition between the wet and dry handling areas. The dry handling area includes the dry assembly handling system, assembly drying system, DC preparation system, and DC transport system. Both the wet and dry handling areas are controlled by the control and tracking system. The system operating sequence begins with moving transportation casks to the cask preparation area. The cask preparation operations consist of cask cavity gas sampling, cask venting, cask cool-down, outer lid removal, and inner shield plug lifting fixture attachment. Casks containing bare SNF (no DPC) are filled with water and placed in the cask unloading pool. The inner shield plugs are removed underwater. For casks containing a DPC, the cask lid(s) is removed, and the DPC is penetrated, sampled, vented, and cooled. A DPC lifting fixture is attached and the cask is placed

  1. A simple and accurate two-step long DNA sequences synthesis strategy to improve heterologous gene expression in pichia.

    Directory of Open Access Journals (Sweden)

    Jiang-Ke Yang

    In vitro chemical gene synthesis is a powerful tool to improve the expression of a gene in a heterologous system. In this study, a two-step gene synthesis strategy that combines an assembly PCR and an overlap extension PCR (AOE) was developed. In this strategy, the chemically synthesized oligonucleotides were assembled into several 200-500 bp fragments with 20-25 bp overlap at each end by assembly PCR, and then an overlap extension PCR was conducted to assemble all these fragments into a full length DNA sequence. Using this method, we de novo designed and optimized the codons of the Rhizopus oryzae lipase gene ROL (810 bp) and the Aspergillus niger phytase gene phyA (1404 bp). Compared with the original ROL gene and phyA gene, the codon-optimized genes were expressed at a significantly higher level in yeasts after methanol induction. We believe this AOE method to be of special interest as it is simple, accurate and has no limitation with respect to the size of the gene to be synthesized. Combined with de novo design, this method allows the rapid synthesis of a gene optimized for expression in the system of choice and production of sufficient biological material for molecular characterization and biotechnological application.
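
    The fragment layout described in this abstract (200-500 bp fragments sharing 20-25 bp overlaps, later joined into the full-length gene) can be illustrated with a small in silico sketch. The function names, the fixed fragment length of 300 bp and overlap of 20 bp, and the placeholder sequence are assumptions chosen for the example; only the size ranges and the 810 bp gene length come from the abstract.

```python
def split_into_fragments(seq, frag_len=300, overlap=20):
    """Cut seq into consecutive fragments that share `overlap` bases."""
    if not 0 < overlap < frag_len:
        raise ValueError("overlap must be positive and shorter than frag_len")
    step = frag_len - overlap
    fragments, start = [], 0
    while True:
        end = min(start + frag_len, len(seq))
        fragments.append(seq[start:end])
        if end == len(seq):
            return fragments
        start += step

def reassemble(fragments, overlap=20):
    """Join fragments by their shared ends, mimicking overlap extension."""
    full = fragments[0]
    for frag in fragments[1:]:
        if full[-overlap:] != frag[:overlap]:
            raise ValueError("adjacent fragments do not share the expected overlap")
        full += frag[overlap:]
    return full

gene = ("ACGT" * 203)[:810]          # placeholder 810 bp sequence, not ROL
fragments = split_into_fragments(gene)
rebuilt = reassemble(fragments)
```

    With these settings an 810 bp sequence yields three fragments of 300, 300 and 250 bp, all within the size range the abstract describes.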

  2. MetaPhinder-Identifying Bacteriophage Sequences in Metagenomic Data Sets

    DEFF Research Database (Denmark)

    Jurtz, Vanessa Isabell; Villarroel, Julia; Lund, Ole

    2016-01-01

    ...... and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e. contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole genome bacteriophage sequences, integrating hits to multiple genomes to accommodate the mosaic genome structure of many bacteriophages. The method is demonstrated to outperform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology https://cge.cbs.dtu.dk/services/MetaPhinder/, while the source code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder.
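
    One simple way to picture "integrating hits to multiple genomes" is to merge the hit intervals that different phage genomes produce on a contig and ask what fraction of the contig they cover together. The sketch below does only that; it is an assumption about the general idea, not MetaPhinder's published scoring, and the function name and half-open interval convention are illustrative.

```python
def merged_coverage(hits, contig_len):
    """Fraction of a contig covered by the union of hit intervals.

    hits: iterable of (start, end) half-open intervals on the contig,
    possibly coming from different database genomes.
    """
    merged = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1]:
            # Overlapping or touching the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    covered = sum(end - start for start, end in merged)
    return covered / contig_len

# Three hypothetical hits from different genomes on a 1000 bp contig.
coverage = merged_coverage([(0, 100), (50, 200), (300, 400)], contig_len=1000)
```

    A contig that no single genome matches well can still reach high merged coverage, which is the situation a mosaic genome would produce.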

  3. Formation and fragmentation of protostellar dense cores

    International Nuclear Information System (INIS)

    Maury, Anaelle

    2009-01-01

    Stars form in molecular clouds, when they collapse and fragment to produce protostellar dense cores. These dense cores are then likely to contract under their own gravity and form young protostars, which further evolve while accreting their circumstellar mass until they reach the main sequence. The main goal of this thesis was to study the formation and fragmentation of protostellar dense cores. To do so, two main studies, described in this manuscript, were carried out. First, we studied the formation of protostellar cores by quantifying the impact of protostellar outflows on clustered star formation. We carried out a study of the protostellar outflows powered by the young stellar objects currently formed in the NGC 2264-C proto-cluster, and we show that protostellar outflows seem to play a crucial role as turbulence progenitors in clustered star forming regions, although they seem unlikely to significantly modify the global infall processes at work on clump scales. Second, we investigated the formation of multiple systems by core fragmentation, using high-resolution observations that probe the multiplicity of young protostars on small scales. Our results suggest that the multiplicity rate of protostars on small scales increases as they evolve, and thus favor dynamical scenarios for the formation of multiple systems. Moreover, our results favor magnetized scenarios of core collapse to explain the small-scale properties of protostars at the earliest stages. (author)

  4. Thermodynamical string fragmentation

    Energy Technology Data Exchange (ETDEWEB)

    Fischer, Nadine [Theoretical Particle Physics, Department of Astronomy and Theoretical Physics, Lund University,Sölvegatan 14A, Lund, SE-223 62 (Sweden); School of Physics and Astronomy, Monash University,Wellington Road, Clayton, VIC-3800 (Australia); Sjöstrand, Torbjörn [Theoretical Particle Physics, Department of Astronomy and Theoretical Physics, Lund University,Sölvegatan 14A, Lund, SE-223 62 (Sweden)

    2017-01-31

    The observation of heavy-ion-like behaviour in pp collisions at the LHC suggests that more physics mechanisms are at play than traditionally assumed. The introduction e.g. of quark-gluon plasma or colour rope formation can describe several of the observations, but as of yet there is no established paradigm. In this article we study a few possible modifications to the Pythia event generator, which describes a wealth of data but fails for a number of recent observations. Firstly, we present a new model for generating the transverse momentum of hadrons during the string fragmentation process, inspired by thermodynamics, where heavier hadrons naturally are suppressed in rate but obtain a higher average transverse momentum. Secondly, close-packing of strings is taken into account by making the temperature or string tension environment-dependent. Thirdly, a simple model for hadron rescattering is added. The effect of these modifications is studied, individually and taken together, and compared with data mainly from the LHC. While some improvements can be noted, it turns out to be nontrivial to obtain effects as big as required, and further work is called for.

  5. Quality Assessment of Domesticated Animal Genome Assemblies

    DEFF Research Database (Denmark)

    Seemann, Stefan E; Anthon, Christian; Palasca, Oana

    2015-01-01

    ...... domesticated animal genomes still need to be sequenced deeper in order to produce high-quality assemblies. In the meanwhile, ironically, the extent to which RNAseq and other next-generation data is produced frequently far exceeds that of the genomic sequence. Furthermore, basic comparative analysis is often affected by the lack of genomic sequence. Herein, we quantify the quality of the genome assemblies of 20 domesticated animals and related species by assessing a range of measurable parameters, and we show that there is a positive correlation between the fraction of mappable reads from RNAseq data......

  6. Sequence recombination and conservation of Varroa destructor virus-1 and deformed wing virus in field collected honey bees (Apis mellifera).

    Directory of Open Access Journals (Sweden)

    Hui Wang

    We sequenced small (s) RNAs from field collected honeybees (Apis mellifera) and bumblebees (Bombus pascuorum) using the Illumina technology. The sRNA reads were assembled and resulting contigs were used to search for virus homologues in GenBank. Matches with Varroa destructor virus-1 (VDV1) and Deformed wing virus (DWV) genomic sequences were obtained for A. mellifera but not B. pascuorum. Further analyses suggested that the prevalent virus population was composed of VDV-1 and a chimera of 5'-DWV-VDV1-DWV-3'. The recombination junctions in the chimera genomes were confirmed by using RT-PCR, cDNA cloning and Sanger sequencing. We then focused on conserved short fragments (CSF, size > 25 nt) in the virus genomes by using GenBank sequences and the deep sequencing data obtained in this study. The majority of CSF sites confirmed conservation at both between-species (GenBank sequences) and within-population (dataset of this study) levels. However, conserved nucleotide positions in the GenBank sequences might be variable at the within-population level. High mutation rates (Pi>10%) were observed at a number of sites using the deep sequencing data, suggesting that sequence conservation might not always be maintained at the population level. Virus-host interactions and strategies for developing RNAi treatments against VDV1/DWV infections are discussed.
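
    The notion of a conserved short fragment (a run of more than 25 conserved nucleotides) can be illustrated by scanning aligned sequences for stretches of identical columns. The sketch below assumes pre-aligned, equal-length sequences and treats any gap column as non-conserved; the function name and these rules are assumptions for illustration and are not taken from the study's methods.

```python
def conserved_fragments(aligned, min_len=26):
    """Return (start, end) half-open runs of identical, gap-free columns.

    aligned: list of equal-length aligned sequences.
    min_len=26 corresponds to fragments longer than 25 nt.
    """
    length = len(aligned[0])
    runs, start = [], None
    for i in range(length + 1):
        conserved = (
            i < length
            and aligned[0][i] != "-"
            and len({seq[i] for seq in aligned}) == 1
        )
        if conserved:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs

# Two toy sequences: 30 identical bases, one variable site, 10 identical bases.
toy = ["A" * 30 + "C" + "G" * 10, "A" * 30 + "T" + "G" * 10]
runs = conserved_fragments(toy)
```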

  7. Polymer fragmentation in extensional flow

    Energy Technology Data Exchange (ETDEWEB)

    Maroja, Armando M.; Oliveira, Fernando A.; Ciesla, Michal; Longa, Lech

    2001-06-01

    In this paper we present an analysis of fragmentation of dilute polymer solutions in extensional flow. The transition rate is investigated both from theoretical and computational approaches, where the existence of a Gaussian distribution for the breaking bonds has been controversial. We give as well an explanation for the low fragmentation frequency found in DNA experiments.

  8. An Algebra for Program Fragments

    DEFF Research Database (Denmark)

    Kristensen, Bent Bruun; Madsen, Ole Lehrmann; Møller-Pedersen, Birger

    1985-01-01

    Program fragments are described either by strings in the concrete syntax or by constructor applications in the abstract syntax. By defining conversions between these forms, both may be intermixed. Program fragments are constructed by terminal and nonterminal symbols from the grammar and by variab...

  9. Fracture mechanics model of fragmentation

    International Nuclear Information System (INIS)

    Glenn, L.A.; Gommerstadt, B.Y.; Chudnovsky, A.

    1986-01-01

    A model of the fragmentation process is developed, based on the theory of linear elastic fracture mechanics, which predicts the average fragment size as a function of strain rate and material properties. This approach permits a unification of previous results, yielding Griffith's solution in the low-strain-rate limit and Grady's solution at high strain rates

  10. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.

    Directory of Open Access Journals (Sweden)

    Can Alkan

    2007-09-01

    The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%-5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then experimentally validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.

  11. GapMis: a tool for pairwise sequence alignment with a single gap.

    Science.gov (United States)

    Flouri, Tomás; Frousios, Kimon; Iliopoulos, Costas S; Park, Kunsoo; Pissis, Solon P; Tischler, German

    2013-08-01

    Pairwise sequence alignment has received a new motivation due to the advent of recent patents in next-generation sequencing technologies, particularly so for the application of re-sequencing, that is, the assembly of a genome directed by a reference sequence. After the fast alignment between a factor of the reference sequence and a high-quality fragment of a short read by a short-read alignment programme, an important problem is to find the alignment between a relatively short succeeding factor of the reference sequence and the remaining low-quality part of the read allowing a number of mismatches and the insertion of a single gap in the alignment. We present GapMis, a tool for pairwise sequence alignment with a single gap. It is based on a simple algorithm, which computes a different version of the traditional dynamic programming matrix. The presented experimental results demonstrate that GapMis is more suitable and efficient than most popular tools for this task.
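
    The problem stated above (align two sequences allowing mismatches and exactly one gap) can be made concrete with a brute-force sketch. This is not the GapMis algorithm, which uses a modified dynamic programming matrix; it is a simplified global variant in which the single gap has length equal to the length difference and every gap position is tried. The function name is hypothetical.

```python
def best_single_gap(a, b):
    """Fewest mismatches when one gap is opened in the shorter sequence.

    Returns (mismatches, gap_position), where gap_position indexes the
    shorter sequence. Ties keep the leftmost gap position.
    """
    if len(a) < len(b):
        a, b = b, a                      # make `a` the longer sequence
    gap = len(a) - len(b)
    best = None
    for i in range(len(b) + 1):
        # Compare the part before the gap, then the part after it.
        mismatches = sum(x != y for x, y in zip(a[:i], b[:i]))
        mismatches += sum(x != y for x, y in zip(a[i + gap:], b[i:]))
        if best is None or mismatches < best[0]:
            best = (mismatches, i)
    return best

exact = best_single_gap("ACGTTTACGT", "ACGTACGT")
one_off = best_single_gap("ACGTTTACGA", "ACGTACGT")
```

    The loop costs time proportional to the product of the sequence length and the number of gap positions, which is acceptable for the short read tails the abstract has in mind but is not how a production tool would do it.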

  12. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.

    Science.gov (United States)

    Alkan, Can; Ventura, Mario; Archidiacono, Nicoletta; Rocchi, Mariano; Sahinalp, S Cenk; Eichler, Evan E

    2007-09-01

    The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%-5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then experimentally validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.

  13. Melt jet fragmentation and oxidation in the lower plenum

    International Nuclear Information System (INIS)

    Berthoud, G.

    2001-01-01

    During the late phases of a PWR Severe Accident, the core materials discharge into the lower plenum, in which water is still present. In that case, we are concerned with the possible occurrence of a Steam Explosion, which may endanger the vessel structure, and with the subsequent cooling of the melt debris. So, we have two possible ways of vessel rupture: a mechanical one following an energetic Steam Explosion and a thermal one due to insufficient debris cooling. Both types of problems are linked with the degree of fragmentation of the core material during its penetration into the water of the lower plenum. One of the most likely modes of discharge consists of corium streams or jets. The fragmentation will build a corium-water mixture (the pre-mixing sequence) which, under certain circumstances, may undergo a fine fragmentation sequence leading to an energetic Steam Explosion (the explosion sequence). Whether or not a Steam Explosion occurs, the resulting debris will accumulate at the bottom of the Reactor Vessel, and the cooling of such a "debris bed" is known to be highly dependent on the granulometry and build-up of the debris bed, which are linked with the previous sequence of corium fragmentation and dispersion. At CEA, the MC3D Code has been developed to deal with all these phenomena. (author)

  14. Mass spectrometry for fragment screening.

    Science.gov (United States)

    Chan, Daniel Shiu-Hin; Whitehouse, Andrew J; Coyne, Anthony G; Abell, Chris

    2017-11-08

    Fragment-based approaches in chemical biology and drug discovery have been widely adopted worldwide in both academia and industry. Fragment hits tend to interact weakly with their targets, necessitating the use of sensitive biophysical techniques to detect their binding. Common fragment screening techniques include differential scanning fluorimetry (DSF) and ligand-observed NMR. Validation and characterization of hits is usually performed using a combination of protein-observed NMR, isothermal titration calorimetry (ITC) and X-ray crystallography. In this context, MS is a relatively underutilized technique in fragment screening for drug discovery. MS-based techniques have the advantage of high sensitivity, low sample consumption and being label-free. This review highlights recent examples of the emerging use of MS-based techniques in fragment screening. © 2017 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.

  15. Fission fragment spins and spectroscopy

    International Nuclear Information System (INIS)

    Durell, J.L.

    1988-01-01

    Prompt γ-ray coincidence experiments have been carried out on γ-rays emitted from post-neutron emission fission fragments produced by the 19F + 197Au and 18O + 232Th reactions. Decay schemes have been established for even-even nuclei ranging from 78Se to 148Nd. Many new states with spin up to ∼12ħ have been observed. Apart from providing a wealth of new information on the spectroscopy of neutron-rich nuclei, the data have been analyzed to determine the average spin of primary fission fragments as a function of fragment mass. The results suggest that the fragment spins are determined by the temperature and shape of the primary fragments at or near to scission.

  16. Fragment-based drug design.

    Science.gov (United States)

    Feyfant, Eric; Cross, Jason B; Paris, Kevin; Tsao, Désirée H H

    2011-01-01

    Fragment-based drug design (FBDD), which comprises both fragment screening and the use of fragment hits to design leads, began more than 15 years ago and has been steadily gaining in popularity and utility. Its origin lies in the fact that the coverage of chemical space and the binding efficiency of hits are directly related to the size of the compounds screened. Nevertheless, FBDD still faces challenges, among them developing fragment screening libraries that ensure optimal coverage of chemical space, physical properties and chemical tractability. Fragment screening also requires sensitive assays, often biophysical in nature, to detect weak binders. In this chapter we will introduce the technologies used to address these challenges and outline the experimental advantages that make FBDD one of the most popular new hit-to-lead processes.

  17. Linkage map of the fragments of herpesvirus papio DNA.

    Science.gov (United States)

    Lee, Y S; Tanaka, A; Lau, R Y; Nonoyama, M; Rabin, H

    1981-01-01

    Herpesvirus papio (HVP), an Epstein-Barr-like virus, causes lymphoblastoid disease in baboons. The physical map of HVP DNA was constructed for the fragments produced by cleavage of HVP DNA with restriction endonucleases EcoRI, HindIII, SalI, and PvuI, which produced 12, 12, 10, and 4 fragments, respectively. The total molecular size of HVP DNA was calculated as close to 110 megadaltons. The following methods were used for construction of the map: (i) fragments near the ends of HVP DNA were identified by treating viral DNA with lambda exonuclease before restriction enzyme digestion; (ii) fragments containing nucleotide sequences in common with fragments from the second enzyme digest of HVP DNA were examined by Southern blot hybridization; and (iii) the location of some fragments was determined by isolating individual fragments from agarose gels and redigesting the isolated fragments with a second restriction enzyme. Terminal heterogeneity and internal repeats were found to be unique features of the HVP DNA molecule. One to five repeats of 0.8 megadaltons were found at both terminal ends. Although the repeats of both ends shared a certain degree of homology, it was not determined whether they were identical repeats. The internal repeat sequence of HVP DNA was found in the EcoRI-C region, which extended from 8.4 to 23 megadaltons from the left end of the molecule. The average number of the repeats was calculated to be seven, and the molecular size was determined to be 1.8 megadaltons. Similar unique features have been reported in EBV DNA (D. Given and E. Kieff, J. Virol. 28:524-542, 1978). PMID: 6261015

  18. SWAP-Assembler: scalable and efficient genome assembly towards thousands of cores.

    Science.gov (United States)

    Meng, Jintao; Wang, Bingqiang; Wei, Yanjie; Feng, Shengzhong; Balaji, Pavan

    2014-01-01

    There is a widening gap between the throughput of massively parallel sequencing machines and the ability to analyze the resulting sequencing data. Traditional assembly methods, which require long execution times and large amounts of memory on a single workstation, are of limited use on such massive data. This paper presents a highly scalable assembler named SWAP-Assembler for processing massive sequencing data using thousands of cores, where SWAP is an acronym for Small World Asynchronous Parallel model. In the paper, a mathematical description of the multi-step bi-directed graph (MSG) is provided to resolve the computational interdependence on merging edges, and a highly scalable computational framework for SWAP is developed to automatically perform the parallel computation of all operations. Graph cleaning and contig extension are also included to generate high-quality contigs. Experimental results show that SWAP-Assembler scales up to 2048 cores on the Yanhuang dataset, taking only 26 minutes, which is better than several other parallel assemblers, such as ABySS, Ray, and PASHA. Results also show that SWAP-Assembler can generate high-quality contigs with good N50 size and low error rate; in particular, it generated the longest N50 contig sizes for the Fish and Yanhuang datasets. In this paper, we presented a highly scalable and efficient genome assembly software, SWAP-Assembler. Compared with several other assemblers, it showed very good performance in terms of scalability and contig quality. This software is available at: https://sourceforge.net/projects/swapassembler.
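
    The abstract above reports contig quality in terms of N50. As a reminder of what that statistic measures, here is a minimal sketch in Python, assuming the common definition (the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length). The function name `n50` and the example lengths are illustrative only and are not taken from SWAP-Assembler.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Assumed definition: sort contigs from longest to shortest and return
    the length of the contig at which the cumulative length first reaches
    at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")

    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths (in bp); the total is 290, half is 145.
    example = [80, 70, 50, 40, 30, 20]
    print(n50(example))
```

    Because N50 depends only on the length distribution, it says nothing about correctness, which is presumably why the abstract reports it together with the error rate.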

  19. Short-read reading-frame predictors are not created equal: sequence error causes loss of signal

    Directory of Open Access Journals (Sweden)

    Trimble William L

    2012-07-01

    Background: Gene prediction algorithms (or gene callers) are an essential tool for analyzing shotgun nucleic acid sequence data. Gene prediction is a ubiquitous step in sequence analysis pipelines; it reduces the volume of data by identifying the most likely reading frame for a fragment, permitting the out-of-frame translations to be ignored. In this study we evaluate five widely used ab initio gene-calling algorithms (FragGeneScan, MetaGeneAnnotator, MetaGeneMark, Orphelia, and Prodigal) for accuracy on short (75–1000 bp) fragments containing sequence error from previously published artificial data and "real" metagenomic datasets. Results: While gene prediction tools have similar accuracies predicting genes on error-free fragments, in the presence of sequencing errors considerable differences between tools become evident. For error-containing short reads, FragGeneScan finds more prokaryotic coding regions than do MetaGeneAnnotator, MetaGeneMark, Orphelia, or Prodigal. This improved detection of genes in error-containing fragments, however, comes at the cost of much lower (50%) specificity and overprediction of genes in noncoding regions. Conclusions: Ab initio gene callers offer a significant reduction in the computational burden of annotating individual nucleic acid reads and are used in many metagenomic annotation systems. For predicting reading frames on raw reads, we find the hidden Markov model approach in FragGeneScan is more sensitive than other gene prediction tools, while Prodigal, MGA, and MGM are better suited for higher-quality sequences such as assembled contigs.
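
    To make the notion of "identifying the most likely reading frame for a fragment" concrete, the sketch below enumerates the six possible frames of a DNA fragment (three offsets on each strand) and picks the one with the longest run of codons uninterrupted by a stop codon. This is a deliberately naive heuristic written for illustration; it is not the method used by FragGeneScan, Prodigal, or any other tool named in the abstract, and all function names and the example fragment are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_open_run(seq, offset):
    """Longest run of consecutive non-stop codons in the given frame."""
    best = run = 0
    for i in range(offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            run = 0  # a stop codon ends the current open run
        else:
            run += 1
            best = max(best, run)
    return best


def best_frame(fragment):
    """Return (strand, offset, codons) for the frame with the longest open run.

    Naive illustration only: real gene callers use statistical models,
    not just the absence of stop codons.
    """
    seq = fragment.upper()
    candidates = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            score = longest_open_run(strand_seq, offset)
            candidates.append((score, strand, offset))
    score, strand, offset = max(candidates, key=lambda c: c[0])
    return strand, offset, score


if __name__ == "__main__":
    # Hypothetical 15-nt fragment; only forward frame 0 is free of stop codons.
    print(best_frame("CCTAACTTACTGACC"))
```

    A single sequencing error that inserts or deletes a base shifts every downstream codon into a different frame, which is one way to see why the abstract finds that accuracy on error-containing reads differs so much between tools.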

  20. Extracellular matrix fragmentation in young, healthy cartilaginous tissues.

    Science.gov (United States)

    Craddock, R J; Hodson, N W; Ozols, M; Shearer, T; Hoyland, J A; Sherratt, M J

    2018-02-09

    Although the composition and structure of cartilaginous tissues is complex, collagen II fibrils and aggrecan are the most abundant assemblies in both articular cartilage (AC) and the nucleus pulposus (NP) of the intervertebral disc (IVD). Whilst structural heterogeneity of intact aggrecan ( containing three globular domains) is w