WorldWideScience

Sample records for hiv-1 genome sequences

  1. Workup of Human Blood Samples for Deep Sequencing of HIV-1 Genomes

    Cornelissen, Marion; Gall, Astrid; van der Kuyl, Antoinette; Wymant, Chris; Blanquart, François; Fraser, Christophe; Berkhout, Ben

    2018-01-01

    We describe a detailed protocol for the manual workup of blood (plasma/serum) samples from individuals infected with the human immunodeficiency virus type 1 (HIV-1) for deep sequence analysis of the viral genome. The study optimizing the assay was performed in the context of the BEEHIVE (Bridging

  2. Subtype-independent near full-length HIV-1 genome sequencing and assembly to be used in large molecular epidemiological studies and clinical management.

    Grossmann, Sebastian; Nowak, Piotr; Neogi, Ujjwal

    2015-01-01

    HIV-1 near full-length genome (HIV-NFLG) sequencing from plasma is an attractive multidimensional tool to apply in large-scale population-based molecular epidemiological studies. It also enables genotypic resistance testing (GRT) for all drug target sites allowing effective intervention strategies for control and prevention in high-risk population groups. Thus, the main objective of this study was to develop a simplified subtype-independent, cost- and labour-efficient HIV-NFLG protocol that can be used in clinical management as well as in molecular epidemiological studies. Plasma samples (n=30) were obtained from HIV-1B (n=10), HIV-1C (n=10), CRF01_AE (n=5) and CRF01_AG (n=5) infected individuals with minimum viral load >1120 copies/ml. The amplification was performed with two large amplicons of 5.5 kb and 3.7 kb, sequenced with 17 primers to obtain HIV-NFLG. GRT was validated against ViroSeq™ HIV-1 Genotyping System. After excluding four plasma samples with low-quality RNA, a total of 26 samples were attempted. Among them, NFLG was obtained from 24 (92%) samples with the lowest viral load being 3000 copies/ml. High (>99%) concordance was observed between HIV-NFLG and ViroSeq™ when determining the drug resistance mutations (DRMs). The N384I connection mutation was additionally detected by NFLG in two samples. Our high efficiency subtype-independent HIV-NFLG is a simple and promising approach to be used in large-scale molecular epidemiological studies. It will facilitate the understanding of the HIV-1 pandemic population dynamics and outline effective intervention strategies. Furthermore, it can potentially be applicable in clinical management of drug resistance by evaluating DRMs against all available antiretrovirals in a single assay.

  4. W-curve alignments for HIV-1 genomic comparisons.

    Cork, Douglas J; Lembark, Steven; Tovanabutra, Sodsai; Robb, Merlin L; Kim, Jerome H

    2010-06-01

    The W-curve was originally developed as a graphical visualization technique for viewing DNA and RNA sequences. Its ability to render features of DNA also makes it suitable for computational studies. Its main advantage in this area is utilizing a single-pass algorithm for comparing the sequences. Avoiding recursion during sequence alignments offers advantages for speed and in-process resources. The graphical technique also allows for multiple models of comparison to be used depending on the nucleotide patterns embedded in similar whole genomic sequences. The W-curve approach allows us to compare large numbers of samples quickly. We are currently tuning the algorithm to accommodate quirks specific to HIV-1 genomic sequences so that it can be used to aid in diagnostic and vaccine efforts. Tracking the molecular evolution of the virus has been greatly hampered by gap-associated problems predominantly embedded within the envelope gene of the virus. Gaps and hypermutation of the virus slow conventional string-based alignments of the whole genome. This paper describes the W-curve algorithm itself, and how we have adapted it for comparison of similar HIV-1 genomes. A tree-building method is developed with the W-curve that utilizes a novel Cylindrical Coordinate distance method and gap analysis method. HIV-1 C2-V5 env sequence regions from a Mother/Infant cohort study are used in the comparison. The output distance matrix and neighbor results produced by the W-curve are functionally equivalent to those from Clustal for C2-V5 sequences in the mother/infant pairs infected with CRF01_AE. Significant potential exists for utilizing this method in place of conventional string-based alignment of HIV-1 genomes, such as Clustal X. With W-curve heuristic alignment, it may be possible to obtain clinically useful results in a short time, short enough to affect clinical choices for acute treatment. A description of the W-curve generation process, including a comparison technique of

  5. Rare HIV-1 Subtype J Genomes and a New H/U/CRF02_AG Recombinant Genome Suggests an Ancient Origin of HIV-1 in Angola.

    Bártolo, Inês; Calado, Rita; Borrego, Pedro; Leitner, Thomas; Taveira, Nuno

    2016-08-01

    Angola has an extremely diverse HIV-1 epidemic fueled in part by the frequent interchange of people with the Democratic Republic of Congo (DRC) and Republic of Congo (RC). Characterization of HIV-1 strains circulating in Angola should help to better understand the origin of HIV-1 subtypes and recombinant forms and their transmission dynamics. In this study we characterize the first near full-length HIV-1 genomic sequences from HIV-1 infected individuals from Angola. Samples were obtained in 1993 from three HIV-1 infected patients living in Cabinda, Angola. Near full-length genomic sequences were obtained from virus isolates. Maximum likelihood phylogenetic tree inference and analyses of potential recombination patterns were performed to evaluate the sequence classifications and origins. Phylogenetic and recombination analyses revealed that one virus was a pure subtype J, another mostly subtype J with a small uncertain region, and the final virus was classified as a H/U/CRF02_AG recombinant. Consistent with their epidemiological data, the subtype J sequences were more closely related to each other than to other J sequences previously published. Based on the env gene, taxa from Angola occur throughout the global subtype J phylogeny. HIV-1 subtypes J and H have been present in Angola at low levels since at least 1993. Low transmission efficiency and/or high recombination potential may explain their limited epidemic success in Angola and worldwide. The high diversity of rare subtypes in Angola suggests that Angola was part of the early establishment of the HIV-1 pandemic.

  6. RNA interactions in the 5' region of the HIV-1 genome

    Damgaard, Christian Kroun; Andersen, Ebbe Sloth; Knudsen, Bjarne

    2004-01-01

    The untranslated leader of the dimeric HIV-1 RNA genome is folded into a complex structure that plays multiple and essential roles in the viral replication cycle. Here, we have investigated secondary and tertiary structural elements within the 5' 744 nucleotides of the HIV-1 genome using a combination of bioinformatics, enzymatic probing, native gel electrophoresis, and UV-crosslinking experiments. We used a recently developed RNA folding algorithm (Pfold) to predict the common secondary structure of an alignment of 20 divergent HIV-1 sequences. Combining this analysis with biochemical data, we...

  7. Intragenic HIV-1 env sequences that enhance gag expression

    Suptawiwat, Ornpreya; Sutthent, Ruengpung; Lee, T.-H.; Auewarakul, Prasert

    2003-01-01

    Expression of HIV-1 genes is regulated at multiple levels, including complex RNA splicing and transport mechanisms. Multiple cis-acting elements involved in this regulation have been previously identified in various regions of the HIV-1 genome. Here we show that another cis-acting element is present in the HIV-1 env region. This element enhanced the expression of Gag when inserted together with the Rev response element (RRE) into a truncated HIV-1 genome in the presence of Rev. The enhancing activity was mapped to a 263-bp fragment in the gp41 region downstream of the RRE. RNA analysis showed that it might function by promoting RNA stability and Rev-dependent RNA export. The enhancement was specific to Rev-dependent expression, since it did not enhance Gag expression driven by Sam68, a cellular protein that has been shown to be able to substitute for Rev in RNA export function.

  8. A small set of succinct signature patterns distinguishes Chinese and non-Chinese HIV-1 genomes.

    Yan Wang

    The epidemiology of HIV-1 in China has unique features that may have led to unique viral strains. We therefore tested the hypothesis that it is possible to find distinctive patterns in HIV-1 genomes sampled in China. Using a rule inference algorithm we could indeed extract from sequences of the third variable loop (V3) of HIV-1 gp120 a set of 14 signature patterns that with 89% accuracy distinguished Chinese from non-Chinese sequences. These patterns were found to be specific to HIV-1 subtype, i.e. sequences complying with pattern 1 were of subtype B, pattern 2 almost exclusively covered sequences of subtype 01_AE, etc. We then analyzed the first of these signature patterns in depth, namely that L and W at two V3 positions are specifically occurring in Chinese sequences of subtype B/B' (3% false positives). This pattern was found to be in agreement with the phylogeny of HIV-1 of subtype B inside and outside of China. We could neither reject nor convincingly confirm that the pattern is stabilized by immune escape. For further interpretation of the signature pattern we used the recently developed measure of Direct Information, and in this way discovered evidence for physical interactions between V2 and V3. We conclude by a discussion of limitations of signature patterns, and the applicability of the approach to other genomic regions and other countries.

  9. Analysis of dinucleotide signatures in HIV-1 subtype B genomes

    It was also shown that the profile generated by taking all dinucleotides together ... Keywords. genome signature; DRAP; HIV-1; chaos game representation. Journal of .... be used to quantify low levels of variation as are observed within species ..... Dayton A.I., Sodroski J.G., Rosen C.A., Goh W.C. and Haseltine. W.A. 1986 ...
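
    The record above mentions dinucleotide signatures and relative abundance profiles. A common way to express such a signature is the odds ratio rho_xy = f_xy / (f_x * f_y), the observed dinucleotide frequency divided by the frequency expected from the base composition. The sketch below is a minimal single-strand version under that assumption; it is not taken from the paper, the function name is hypothetical, and published variants often pool the sequence with its reverse complement before counting.

```python
from collections import Counter
from itertools import product


def dinucleotide_odds_ratios(seq: str) -> dict[str, float]:
    """Return rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides (one strand)."""
    seq = seq.upper().replace("U", "T")
    bases = "ACGT"

    # Count only unambiguous bases; gaps and IUPAC codes are skipped in place
    # so that no artificial neighbours are created.
    mono = Counter(b for b in seq if b in bases)
    pairs = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in bases and seq[i + 1] in bases
    )
    n_mono = sum(mono.values())
    n_pairs = sum(pairs.values())

    ratios = {}
    for x, y in product(bases, repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono) if n_mono else 0.0
        observed = pairs[x + y] / n_pairs if n_pairs else 0.0
        # Undefined when a base is absent from the sequence.
        ratios[x + y] = observed / expected if expected > 0 else float("nan")
    return ratios


if __name__ == "__main__":
    toy = "AACC"  # toy input, not a real HIV-1 sequence
    print(round(dinucleotide_odds_ratios(toy)["AA"], 3))
```

    Values well below 1 indicate under-representation of a dinucleotide and values well above 1 indicate over-representation; the thresholds used to call either are a choice of the individual study.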

  10. Detection of Hepatitis B Virus (HBV) Genomes and HBV Drug Resistant Variants by Deep Sequencing Analysis of HBV Genomes in Immune Cell Subsets of HBV Mono-Infected and/or Human Immunodeficiency Virus Type-1 (HIV-1) and HBV Co-Infected Individuals

    Lee, Z.; Nishikawa, S.; Gao, S.; Eksteen, J. B.; Czub, M.; Gill, M. J.; Osiowy, C.; van der Meer, F.; van Marle, G.; Coffin, C. S.

    2015-01-01

    The hepatitis B virus (HBV) and the human immunodeficiency virus type 1 (HIV-1) can infect cells of the lymphatic system. It is unknown whether HIV-1 co-infection impacts infection of peripheral blood mononuclear cell (PBMC) subsets by the HBV. Aims: To compare the detection of HBV genomes and HBV sequences in unsorted PBMCs and subsets (i.e., CD4+ T, CD8+ T, CD14+ monocytes, CD19+ B, CD56+ NK cells) in HBV mono-infected vs. HBV/HIV-1 co-infected individuals. Methods: Total PBMC and subsets isolated from 14 HBV mono-infected (4/14 before and after anti-HBV therapy) and 6 HBV/HIV-1 co-infected individuals (5/6 consistently on dual active anti-HBV/HIV therapy) were tested for HBV genomes, including replication indicative HBV covalently closed circular (ccc)-DNA, by nested PCR/nucleic hybridization and/or quantitative PCR. In CD4+, and/or CD56+ subsets from two HBV monoinfected cases, the HBV polymerase/overlapping surface region was analyzed by next generation sequencing. Results: All analyzed whole PBMC from HBV monoinfected and HBV/HIV coinfected individuals were HBV genome positive. Similarly, HBV DNA was detected in all target PBMC subsets regardless of antiviral therapy, but was absent from the CD4+ T cell subset from all HBV/HIV-1 positive cases (PHBV monoinfected cases on tenofovir therapy, mutations at residues associated with drug resistance and/or immune escape (i.e., G145R) were detected in a minor percentage of the population. Summary: HBV genomes and drug resistant variants were detectable in PBMC subsets from HBV mono-infected individuals. The HBV replicates in PBMC subsets of HBV/HIV-1 patients except the CD4+ T cell subpopulation. PMID:26390290

  11. Eliminating HIV-1 Packaging Sequences from Lentiviral Vector Proviruses Enhances Safety and Expedites Gene Transfer for Gene Therapy.

    Vink, Conrad A; Counsell, John R; Perocheau, Dany P; Karda, Rajvinder; Buckley, Suzanne M K; Brugman, Martijn H; Galla, Melanie; Schambach, Axel; McKay, Tristan R; Waddington, Simon N; Howe, Steven J

    2017-08-02

    Lentiviral vector genomic RNA requires sequences that partially overlap wild-type HIV-1 gag and env genes for packaging into vector particles. These HIV-1 packaging sequences constitute 19.6% of the wild-type HIV-1 genome and contain functional cis elements that potentially compromise clinical safety. Here, we describe the development of a novel lentiviral vector (LTR1) with a unique genomic structure designed to prevent transfer of HIV-1 packaging sequences to patient cells, thus reducing the total HIV-1 content to just 4.8% of the wild-type genome. This has been achieved by reconfiguring the vector to mediate reverse-transcription with a single strand transfer, instead of the usual two, and in which HIV-1 packaging sequences are not copied. We show that LTR1 vectors offer improved safety in their resistance to remobilization in HIV-1 particles and reduced frequency of splicing into human genes. Following intravenous luciferase vector administration to neonatal mice, LTR1 sustained a higher level of liver transgene expression than an equivalent dose of a standard lentivirus. LTR1 vectors produce reverse-transcription products earlier and start to express transgenes significantly quicker than standard lentiviruses after transduction. Finally, we show that LTR1 is an effective lentiviral gene therapy vector as demonstrated by correction of a mouse hemophilia B model. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  12. Reliable reconstruction of HIV-1 whole genome haplotypes reveals clonal interference and genetic hitchhiking among immune escape variants

    2014-01-01

    Background: Following transmission, HIV-1 evolves into a diverse population, and next generation sequencing enables us to detect variants occurring at low frequencies. Studying viral evolution at the level of whole genomes was hitherto not possible because next generation sequencing delivers relatively short reads. Results: We here provide a proof of principle that whole HIV-1 genomes can be reliably reconstructed from short reads, and use this to study the selection of immune escape mutations at the level of whole genome haplotypes. Using realistically simulated HIV-1 populations, we demonstrate that reconstruction of complete genome haplotypes is feasible with high fidelity. We do not reconstruct all genetically distinct genomes, but each reconstructed haplotype represents one or more of the quasispecies in the HIV-1 population. We then reconstruct 30 whole genome haplotypes from published short sequence reads sampled longitudinally from a single HIV-1 infected patient. We confirm the reliability of the reconstruction by validating our predicted haplotype genes with single genome amplification sequences, and by comparing haplotype frequencies with observed epitope escape frequencies. Conclusions: Phylogenetic analysis shows that the HIV-1 population undergoes selection driven evolution, with successive replacement of the viral population by novel dominant strains. We demonstrate that immune escape mutants evolve in a dependent manner with various mutations hitchhiking along with others. As a consequence of this clonal interference, selection coefficients have to be estimated for complete haplotypes and not for individual immune escapes. PMID:24996694

  13. Detection of viral sequence fragments of HIV-1 subfamilies yet unknown

    Stanke Mario

    2011-04-01

    Background: Methods of determining whether or not any particular HIV-1 sequence stems, completely or in part, from some unknown HIV-1 subtype are important for the design of vaccines and molecular detection systems, as well as for epidemiological monitoring. Nevertheless, only a single algorithm, the Branching Index (BI), has been developed for this task so far. Moving along the genome of a query sequence in a sliding window, the BI computes a ratio quantifying how closely the query sequence clusters with a subtype clade. In its current version, however, the BI does not provide predicted boundaries of unknown fragments. Results: We have developed Unknown Subtype Finder (USF), an algorithm based on a probabilistic model, which automatically determines which parts of an input sequence originate from a subtype yet unknown. The underlying model is based on a simple profile hidden Markov model (pHMM) for each known subtype and an additional pHMM for an unknown subtype. The emission probabilities of the latter are estimated using the emission frequencies of the known subtypes by means of a (position-wise) probabilistic model for the emergence of new subtypes. We have applied USF to SIV and HIV-1 sequences formerly classified as having emerged from an unknown subtype. Moreover, we have evaluated its performance on artificial HIV-1 recombinants and non-recombinant HIV-1 sequences. The results have been compared with the corresponding results of the BI. Conclusions: Our results demonstrate that USF is suitable for detecting segments in HIV-1 sequences stemming from yet unknown subtypes. Comparing USF with the BI shows that our algorithm performs as well as the BI or better.

  14. Genome Sequencing

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  15. Variability of HIV-1 genomes among children and adolescents from Sao Paulo, Brazil.

    Sabri Saeed Sanabani

    BACKGROUND: Genetic variability is a major feature of the human immunodeficiency virus type 1 (HIV-1) and is considered the key factor frustrating efforts to halt the virus epidemic. In this study, we aimed to investigate the genetic variability of HIV-1 strains among children and adolescents born from 1992 to 2009 in the state of Sao Paulo, Brazil. METHODOLOGY: Plasma and peripheral blood mononuclear cells (PBMC) were collected from 51 HIV-1-positive children and adolescents on ART followed between September 1992 and July 2009. After extraction, the genetic materials were used in a polymerase chain reaction (PCR) to amplify the viral near full length genomes (NFLGs) from 5 overlapped fragments. NFLGs and partial amplicons were directly sequenced and data were phylogenetically inferred. RESULTS: Of the 51 samples studied, the NFLGs and partial fragments of HIV-1 from 42 PBMCs and 25 plasma were successfully subtyped. Results based on proviral DNA revealed that 22 (52.4%) patients were infected with subtype B, 16 (38.1%) were infected with BF1 mosaic variants and 4 (9.5%) were infected with sub-subtype F1. All the BF1 recombinants were unique and distinct from any previously identified unique or circulating recombinant forms in South America. Evidence of dual infections was detected in 3 patients coinfected with the same or distinct HIV-1 subtypes. Ten of the 31 (32.2%) and 12 of the 21 (57.1%) subjects with recovered proviral and plasma, respectively, protease sequences were infected with major mutants resistant to protease inhibitors. The V3 sequences of 14 patients with available sequences from PBMC and/or plasma were predicted to be R5-tropic virus except for two patients who harbored an X4 strain. CONCLUSIONS: The high proportion of HIV-1 BF1 recombinant, coinfection rate and vertical transmission in Brazil merits urgent attention and effective measures to reduce the transmission of HIV among spouses and sex partners.

  16. SL1 revisited: functional analysis of the structure and conformation of HIV-1 genome RNA.

    Sakuragi, Sayuri; Yokoyama, Masaru; Shioda, Tatsuo; Sato, Hironori; Sakuragi, Jun-Ichi

    2016-11-11

    The dimer initiation site/dimer linkage sequence (DIS/DLS) region of HIV is located at the 5' end of the viral genome and is suggested to form complex secondary/tertiary structures. Within this structure, stem-loop 1 (SL1) is believed to be the most important element and essential to dimerization, since the sequence and predicted secondary structure of SL1 are highly stable and conserved among various virus subtypes. In particular, a six-base palindromic sequence is always present at the hairpin loop of SL1, and the formation of a kissing-loop structure at this position between the two strands of genomic RNA is suggested to trigger dimerization. Although the higher-order structure model of SL1 is well accepted and perhaps even undoubted lately, there could still be room to reconsider the functional SL1 structure in vivo (in the virion or cell). In this study, we performed several analyses to identify the nucleotides and/or base pairing within SL1 that are necessary for HIV-1 genome dimerization, encapsidation, recombination and infectivity. We unexpectedly found that some nucleotides that are believed to contribute to the formation of the stem do not impact dimerization or infectivity. On the other hand, we found that one G-C base pair involved in stem formation may serve as an alternative dimer interactive site. We also report on our further investigation of the roles of the palindromic sequences in viral replication. Collectively, we aim to assemble a more comprehensive functional map of SL1 in the HIV-1 viral life cycle. We discovered several possibilities for a novel structure of SL1 in the HIV-1 DLS. The newly proposed structure model suggested that the hairpin loop of SL1 is larger, and that the genome dimerization process might involve a more complicated mechanism than previously understood. Further investigations are still required to fully understand the genome packaging and dimerization of HIV.
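
    The "six-base palindromic sequence" in the record above is palindromic in the nucleic-acid sense: the sequence equals its own reverse complement, which is what lets two identical loops base-pair with each other. The following is a minimal sketch of that check, not code from the study; the function names are hypothetical and the input strings are illustrative.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (T is read as U)."""
    return rna.upper().replace("T", "U").translate(RNA_COMPLEMENT)[::-1]


def is_palindromic(rna: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    rna = rna.upper().replace("T", "U")
    return rna == reverse_complement(rna)


def find_palindromes(rna: str, length: int = 6) -> list[tuple[int, str]]:
    """Return (0-based start, window) for every palindromic window of the given length."""
    rna = rna.upper().replace("T", "U")
    return [
        (i, rna[i:i + length])
        for i in range(len(rna) - length + 1)
        if is_palindromic(rna[i:i + length])
    ]


if __name__ == "__main__":
    for loop in ("GCGCGC", "GUGCAC", "GAGCAC"):  # illustrative inputs
        print(loop, is_palindromic(loop))
    print(find_palindromes("AAGCGCGCAA"))
```

    A scan like this only finds candidate self-complementary windows; whether a given window sits in a hairpin loop and mediates dimerization is a structural question the code does not address.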

  17. Clonal expansion of genome-intact HIV-1 in functionally polarized Th1 CD4+ T cells.

    Lee, Guinevere Q; Orlova-Fink, Nina; Einkauf, Kevin; Chowdhury, Fatema Z; Sun, Xiaoming; Harrington, Sean; Kuo, Hsiao-Hsuan; Hua, Stephane; Chen, Hsiao-Rong; Ouyang, Zhengyu; Reddy, Kavidha; Dong, Krista; Ndung'u, Thumbi; Walker, Bruce D; Rosenberg, Eric S; Yu, Xu G; Lichterfeld, Mathias

    2017-06-30

    HIV-1 causes a chronic, incurable disease due to its persistence in CD4+ T cells that contain replication-competent provirus, but exhibit little or no active viral gene expression and effectively resist combination antiretroviral therapy (cART). These latently infected T cells represent an extremely small proportion of all circulating CD4+ T cells but possess a remarkable long-term stability and typically persist throughout life, for reasons that are not fully understood. Here we performed massive single-genome, near-full-length next-generation sequencing of HIV-1 DNA derived from unfractionated peripheral blood mononuclear cells, ex vivo-isolated CD4+ T cells, and subsets of functionally polarized memory CD4+ T cells. This approach identified multiple sets of independent, near-full-length proviral sequences from cART-treated individuals that were completely identical, consistent with clonal expansion of CD4+ T cells harboring intact HIV-1. Intact, near-full-genome HIV-1 DNA sequences that were derived from such clonally expanded CD4+ T cells constituted 62% of all analyzed genome-intact sequences in memory CD4 T cells, were preferentially observed in Th1-polarized cells, were longitudinally detected over a duration of up to 5 years, and were fully replication- and infection-competent. Together, these data suggest that clonal proliferation of Th1-polarized CD4+ T cells encoding for intact HIV-1 represents a driving force for stabilizing the pool of latently infected CD4+ T cells.

  18. Use of Dried Blood Spots to Elucidate Full-Length Transmitted/Founder HIV-1 Genomes

    Jesus F. Salazar-Gonzalez

    2016-07-01

    Background: Identification of HIV-1 genomes responsible for establishing clinical infection in newly infected individuals is fundamental to prevention and pathogenesis research. Processing, storage, and transportation of the clinical samples required to perform these virologic assays in resource-limited settings require challenging venipuncture and cold chain logistics. Here, we validate the use of dried-blood spots (DBS) as a simple and convenient alternative to collecting and storing frozen plasma. Methods: We performed parallel nucleic acid extraction, single genome amplification (SGA), next generation sequencing (NGS), and phylogenetic analyses on plasma and DBS. Results: We demonstrated the capacity to extract viral RNA from DBS and perform SGA to infer the complete nucleotide sequence of the transmitted/founder (TF) HIV-1 envelope gene and full-length genome in two acutely infected individuals. Using both SGA and NGS methodologies, we showed that sequences generated from DBS and plasma display comparable phylogenetic patterns in both acute and chronic infection. SGA was successful on samples with a range of plasma viremia, including samples as low as 1,700 copies/ml and an estimated ~50 viral copies per blood spot. Further, we demonstrated reproducible efficiency in gp160 env sequencing in DBS stored at ambient temperature for up to three weeks or at -20°C for up to five months. Conclusions: These findings support the use of DBS as a practical and cost-effective alternative to frozen plasma for clinical trials and translational research conducted in resource-limited settings.

  19. Discovery of novel targets for multi-epitope vaccines: Screening of HIV-1 genomes using association rule mining

    Piontkivska Helen

    2009-07-01

    Background: Studies have shown that in the genome of human immunodeficiency virus (HIV-1), regions responsible for interactions with the host's immune system, namely, cytotoxic T-lymphocyte (CTL) epitopes, tend to cluster together in relatively conserved regions. On the other hand, "epitope-less" regions or regions with relatively low density of epitopes tend to be more variable. However, very little is known about relationships among epitopes from different genes, in other words, whether particular epitopes from different genes would occur together in the same viral genome. To identify CTL epitopes in different genes that co-occur in HIV genomes, association rule mining was used. Results: Using a set of 189 best-defined HIV-1 CTL/CD8+ epitopes from 9 different protein-coding genes, as described by Frahm, Linde & Brander (2007), we examined the complete genomic sequences of 62 reference HIV sequences (including 13 subtypes and sub-subtypes with approximately 4 representative sequences for each subtype or sub-subtype, and 18 circulating recombinant forms). The results showed that despite inclusion of recombinant sequences that would be expected to break up associations of epitopes in different genes when two different genomes are recombined, there exist particular combinations of epitopes (epitope associations) that occur repeatedly across the world-wide population of HIV-1. For example, Pol epitope LFLDGIDKA is found to be significantly associated with epitopes GHQAAMQML and FLKEKGGL from Gag and Nef, respectively, and this association rule is observed even among circulating recombinant forms. Conclusion: We have identified CTL epitope combinations co-occurring in HIV-1 genomes including different subtypes and recombinant forms. Such co-occurrence has important implications for design of complex vaccines (multi-epitope vaccines) and/or drugs that would target multiple HIV-1 regions at once and, thus, may be expected to overcome challenges
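
    Association rule mining, as named in the record above, scores a rule "A implies C" by how often A and C occur together (support), how often C is present when A is present (confidence), and how much more often than expected by chance they co-occur (lift). The sketch below computes these three standard measures for one rule. It is an illustration only: the toy "genomes" are invented, only the three epitope strings come from the abstract, the function name is hypothetical, and the study's actual software and thresholds are not stated here.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Each transaction is a set of items; here, the epitopes present in one genome.
    """
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n = len(transactions)

    n_a = sum(1 for t in transactions if a <= t)          # genomes containing A
    n_c = sum(1 for t in transactions if c <= t)          # genomes containing C
    n_ac = sum(1 for t in transactions if (a | c) <= t)   # genomes containing both

    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


if __name__ == "__main__":
    POL, GAG, NEF = "LFLDGIDKA", "GHQAAMQML", "FLKEKGGL"
    # Hypothetical presence/absence data for five genomes (not real results).
    genomes = [
        frozenset({POL, GAG, NEF}),
        frozenset({POL, GAG, NEF}),
        frozenset({POL, GAG}),
        frozenset({GAG}),
        frozenset({NEF}),
    ]
    print(rule_metrics(genomes, {POL}, {GAG, NEF}))
```

    In practice an epitope would first have to be scored as present or absent in each translated genome, for example by exact string matching, and that choice affects every downstream rule.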

  20. Generation and Characterization of HIV-1 Transmitted and Founder Virus Consensus Sequence from Intravenous Drug Users in Xinjiang, China.

    Li, Fan; Ma, Liying; Feng, Yi; Hu, Jing; Ni, Na; Ruan, Yuhua; Shao, Yiming

    2017-06-01

    HIV-1 transmission in intravenous drug users (IDUs) has been characterized by high genetic multiplicity, which suggests a greater challenge for blocking HIV-1 infection. We investigated a total of 749 sequences of the full-length gp160 gene obtained by single genome sequencing (SGS) from 22 HIV-1 early infected IDUs in Xinjiang province, northwest China, and generated a transmitted and founder virus (T/F virus) consensus sequence (IDU.CON). The T/F virus was classified as subtype CRF07_BC and predicted to be CCR5-tropic virus. The variable region (V1, V2, and V4 loop) of IDU.CON showed length variation compared with the heterosexual T/F virus consensus sequence (HSX.CON) and homosexual T/F virus consensus sequence (MSM.CON). A total of 26 N-linked glycosylation sites were discovered in the IDU.CON sequence, fewer than in MSM.CON and HSX.CON. Characterization of T/F virus from IDUs highlights the genetic make-up and complexity of virus near the moment of transmission or in early infection preceding systemic dissemination and is important for the development of effective HIV-1 preventive methods, including vaccines.
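
    Potential N-linked glycosylation sites, such as those counted in the record above, are conventionally predicted from the sequon N-X-S/T, where X is any residue except proline. The sketch below counts such sequons in a protein string with a lookahead regular expression so that overlapping sites are all reported. It is an assumption that the study used an equivalent rule; the function name is hypothetical and the test peptide is a toy string, not part of gp160.

```python
import re

# Asparagine followed by any residue except proline, then serine or threonine.
# The lookahead keeps matches one residue wide so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 0-based positions of N in N-X-[S/T] sequons (X != P).

    Alignment gaps ('-') are removed first, so positions refer to the
    ungapped sequence.
    """
    ungapped = protein.upper().replace("-", "")
    return [m.start() for m in SEQUON.finditer(ungapped)]


if __name__ == "__main__":
    toy = "NGSNPTNNTS"  # toy peptide for illustration
    sites = find_sequons(toy)
    print(len(sites), sites)
```

    A sequon marks a potential site only; whether it is actually glycosylated depends on structural context that a pattern match cannot capture.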

  1. Phylogeny and resistance profiles of HIV-1 POL sequences from rectal biopsies and blood

    Katzenstein, Terese Lea; Petersen, A B; Storgaard, M

    2010-01-01

    The phylogeny and resistance profiles of human immunodeficiency virus type 1 (HIV-1) protease (PR) and reverse transcriptase (RT) sequences were compared among six patients with HIV-1 who had received numerous treatments. RNA and DNA fractions were obtained from concurrent blood and rectal biopsy...

  2. High-throughput SHAPE analysis reveals structures in HIV-1 genomic RNA strongly conserved across distinct biological states.

    Kevin A Wilkinson

    2008-04-01

    Full Text Available Replication and pathogenesis of the human immunodeficiency virus (HIV) is tightly linked to the structure of its RNA genome, but genome structure in infectious virions is poorly understood. We invent high-throughput SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) technology, which uses many of the same tools as DNA sequencing, to quantify RNA backbone flexibility at single-nucleotide resolution and from which robust structural information can be immediately derived. We analyze the structure of HIV-1 genomic RNA in four biologically instructive states, including the authentic viral genome inside native particles. Remarkably, given the large number of plausible local structures, the first 10% of the HIV-1 genome exists in a single, predominant conformation in all four states. We also discover that noncoding regions functioning in a regulatory role have significantly lower (p-value < 0.0001) SHAPE reactivities, and hence more structure, than do viral coding regions that function as the template for protein synthesis. By directly monitoring protein binding inside virions, we identify the RNA recognition motif for the viral nucleocapsid protein. Seven structurally homologous binding sites occur in a well-defined domain in the genome, consistent with a role in directing specific packaging of genomic RNA into nascent virions. In addition, we identify two distinct motifs that are targets for the duplex destabilizing activity of this same protein. The nucleocapsid protein destabilizes local HIV-1 RNA structure in ways likely to facilitate initial movement both of the retroviral reverse transcriptase from its tRNA primer and of the ribosome in coding regions. Each of the three nucleocapsid interaction motifs falls in a specific genome domain, indicating that local protein interactions can be organized by the long-range architecture of an RNA. High-throughput SHAPE reveals a comprehensive view of HIV-1 RNA genome structure, and further

  3. Use of four next-generation sequencing platforms to determine HIV-1 coreceptor tropism.

    Archer, John; Weber, Jan; Henry, Kenneth; Winner, Dane; Gibson, Richard; Lee, Lawrence; Paxinos, Ellen; Arts, Eric J; Robertson, David L; Mimms, Larry; Quiñones-Mateu, Miguel E

    2012-01-01

    HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5) viruses prior to treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences) and genotypic (e.g., population sequencing linked to bioinformatic algorithms) assays are the most widely used. Although several next-generation sequencing (NGS) platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 co-receptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences), Illumina®, and Ion Torrent™ (Life Technologies). Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels) and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used), compared to Trofile (80%) and population sequencing (70%). In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels, suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage.
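
    Of the three genotypic predictors named above, the 11/24/25 rule is simple enough to sketch directly: a V3 sequence is called non-R5 when a positively charged residue occupies position 11, 24 or 25 of the loop. The sketch below assumes the input is already aligned to the 35-residue V3 numbering and treats only lysine (K) and arginine (R) as positively charged; both points, along with the function name and example sequences, are assumptions and may differ from the implementation used in the study.

```python
# Residues treated as positively charged (assumption: K and R only).
BASIC_RESIDUES = frozenset("KR")
RULE_POSITIONS = (11, 24, 25)  # 1-based V3 loop positions


def predict_tropism_11_24_25(v3_aligned):
    """Apply the 11/24/25 charge rule to an aligned V3 amino-acid sequence.

    Returns (call, flagged_positions) where call is "non-R5" if any of
    positions 11, 24 or 25 carries a basic residue, otherwise "R5".
    """
    if len(v3_aligned) < max(RULE_POSITIONS):
        raise ValueError("V3 sequence too short for positions 11/24/25")
    sequence = v3_aligned.upper()
    flagged = [p for p in RULE_POSITIONS if sequence[p - 1] in BASIC_RESIDUES]
    return ("non-R5" if flagged else "R5"), flagged


# Illustrative 35-residue V3 sequences (the second differs only at position 11).
v3_neutral = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
v3_basic11 = "CTRPNNNTRKRIHIGPGRAFYTTGEIIGDIRQAHC"
print(predict_tropism_11_24_25(v3_neutral))
print(predict_tropism_11_24_25(v3_basic11))
```

    With deep sequencing data the rule would be applied read by read (or haplotype by haplotype), and the fraction of non-R5 calls compared against a reporting threshold; the abstract does not state the threshold used.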

  4. Use of four next-generation sequencing platforms to determine HIV-1 coreceptor tropism.

    John Archer

    Full Text Available HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5) viruses prior to treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences) and genotypic (e.g., population sequencing linked to bioinformatic algorithms) assays are the most widely used. Although several next-generation sequencing (NGS) platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 co-receptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences), Illumina®, and Ion Torrent™ (Life Technologies). Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels) and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used), compared to Trofile (80%) and population sequencing (70%). In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels, suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage.

  5. nef gene sequence variation among HIV-1-infected African children

    Chakraborty, R.; Reiniš, Milan; Rostron, T.; Philpott, S.; Dong, T.; D'Agostino, A.; Musoke, R.; de Silva, E.; Stumpf, M.; Weiser, B.; Burger, H.; Rowland-Jones, S.L.

    2006-01-01

    Vol. 7, No. 2 (2006), pp. 75-84 ISSN 1464-2662 Grant - others: Fogarty International Center, NIH(US) 3D43TW00915; NIH(US) RO1 AI 42555 Institutional research plan: CEZ:AV0Z50520514 Keywords: HIV-1 nef gene * non-clade B * Kenya Subject RIV: EB - Genetics; Molecular Biology Impact factor: 2.674, year: 2006

  6. Interactions Between HIV-1 Gag and Viral RNA Genome Enhance Virion Assembly

    Dilley, Kari A; Nikolaitchik, Olga A; Galli, Andrea

    2017-01-01

    Most HIV-1 virions contain two copies of full-length viral RNA, indicating that genome packaging is efficient and tightly regulated. However, the structural protein Gag is the only component required for the assembly of noninfectious virus-like particles and the viral RNA is dispensable in this process. The mechanism that allows HIV-1 to achieve such high efficiency of genome packaging when a packageable viral RNA is not required for virus assembly is currently unknown. In this report, we examined the role of HIV-1 RNA in virus assembly and found that packageable HIV-1 RNA enhances particle... between Gag and viral RNA are required for the enhancement of particle production. Taken together, these studies are consistent with our previous hypothesis that specific dimeric viral RNA:Gag interactions are the nucleation event of infectious virion assembly, ensuring that one RNA dimer is packaged...

  7. HIV-1 envelope sequence-based diversity measures for identifying recent infections.

    Alexis Kafando

    Full Text Available Identifying recent HIV-1 infections is crucial for monitoring HIV-1 incidence and optimizing public health prevention efforts. To identify recent HIV-1 infections, we evaluated and compared the performance of 4 sequence-based diversity measures including percent diversity, percent complexity, Shannon entropy and number of haplotypes targeting 13 genetic segments within the env gene of HIV-1. A total of 597 diagnostic samples obtained in 2013 and 2015 from recently and chronically HIV-1 infected individuals were selected. From the selected samples, 249 (134 from recent versus 115 from chronic infections) env coding regions, including V1-C5 of gp120 and the gp41 ectodomain of HIV-1, were successfully amplified and sequenced by next generation sequencing (NGS) using the Illumina MiSeq platform. The ability of the four sequence-based diversity measures to correctly identify recent HIV infections was evaluated using the frequency distribution curves, median and interquartile range and area under the curve (AUC) of the receiver operating characteristic (ROC). Comparing the median and interquartile range and evaluating the frequency distribution curves associated with the 4 sequence-based diversity measures, we observed that the percent diversity, number of haplotypes and Shannon entropy demonstrated significant potential to discriminate recent from chronic infections (p<0.0001). Using the AUC of ROC analysis, only the Shannon entropy measure within three HIV-1 env segments could accurately identify recent infections at a satisfactory level. The env segments were gp120 C2_1 (AUC = 0.806), gp120 C2_3 (AUC = 0.805) and gp120 V3 (AUC = 0.812). Our results clearly indicate that the Shannon entropy measure represents a useful tool for predicting HIV-1 infection recency.
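
    Two of the diversity measures compared in this record, Shannon entropy and number of haplotypes, can be computed from the reads covering one env segment as in the sketch below. It uses the textbook definition H = -Σ p_i ln p_i over haplotype frequencies p_i; the natural-log base, the absence of any normalization, and the toy reads are assumptions, since the abstract does not give the exact formula used.

```python
import math
from collections import Counter


def haplotype_diversity(reads):
    """Return (number_of_haplotypes, shannon_entropy) for a list of reads.

    Each distinct read sequence is treated as one haplotype. Entropy is
    H = -sum(p * ln p) over haplotype frequencies p (natural log).
    """
    counts = Counter(reads)
    total = sum(counts.values())
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log(p)
    return len(counts), entropy


# Toy reads for one segment (hypothetical, not study data).
recent_like = ["ACGT"] * 10                           # one dominant haplotype
chronic_like = ["ACGT", "ACGA", "ACCT", "TCGT"] * 5   # four equal haplotypes
print(haplotype_diversity(recent_like))
print(haplotype_diversity(chronic_like))
```

    A homogeneous population gives an entropy of zero, while a population split evenly across k haplotypes gives ln k, which is why low entropy is read as a signal of recent infection.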

  8. Deciphering the role of the Gag-Pol ribosomal frameshift signal in HIV-1 RNA genome packaging.

    Nikolaitchik, Olga A; Hu, Wei-Shau

    2014-04-01

    A key step of retroviral replication is packaging of the viral RNA genome during virus assembly. Specific packaging is mediated by interactions between the viral protein Gag and elements in the viral RNA genome. In HIV-1, similar to most retroviruses, the packaging signal is located within the 5' untranslated region and extends into the gag-coding region. A recent study reported that a region including the Gag-Pol ribosomal frameshift signal plays an important role in HIV-1 RNA packaging; deletions or mutations that affect the RNA structure of this signal lead to drastic decreases (10- to 50-fold) in viral RNA packaging and virus titer. We examined here the role of the ribosomal frameshift signal in HIV-1 RNA packaging by studying the RNA packaging and virus titer in the context of proviruses. Three mutants with altered ribosomal frameshift signal, either through direct deletion of the signal, mutation of the 6U slippery sequence, or alterations of the secondary structure were examined. We found that RNAs from all three mutants were packaged efficiently, and they generate titers similar to that of a virus containing the wild-type ribosomal frameshift signal. We conclude that although the ribosomal frameshift signal plays an important role in regulating the replication cycle, this RNA element is not directly involved in regulating RNA encapsidation. To generate infectious viruses, HIV-1 must package viral RNA genome during virus assembly. The specific HIV-1 genome packaging is mediated by interactions between the structural protein Gag and elements near the 5' end of the viral RNA known as packaging signal. In this study, we examined whether the Gag-Pol ribosomal frameshift signal is important for HIV-1 RNA packaging as recently reported. Our results demonstrated that when Gag/Gag-Pol is supplied in trans, none of the tested ribosomal frameshift signal mutants has defects in RNA packaging or virus titer. These studies provide important information on how HIV-1

  9. Genome-wide association scan in HIV-1-infected individuals identifying variants influencing disease course.

    Daniëlle van Manen

    Full Text Available BACKGROUND: AIDS develops typically after 7-11 years of untreated HIV-1 infection, with extremes of very rapid disease progression (<2 years) and long-term non-progression (>15 years). To reveal additional host genetic factors that may impact on the clinical course of HIV-1 infection, we designed a genome-wide association study (GWAS) in 404 participants of the Amsterdam Cohort Studies on HIV-1 infection and AIDS. METHODS: The association of SNP genotypes with the clinical course of HIV-1 infection was tested in Cox regression survival analyses using AIDS-diagnosis and AIDS-related death as endpoints. RESULTS: Multiple, not previously identified SNPs, were identified to be strongly associated with disease progression after HIV-1 infection, albeit not genome-wide significant. However, three independent SNPs in the top ten associations between SNP genotypes and time between seroconversion and AIDS-diagnosis, and one from the top ten associations between SNP genotypes and time between seroconversion and AIDS-related death, had P-values smaller than 0.05 in the French Genomics of Resistance to Immunodeficiency Virus cohort on disease progression. CONCLUSIONS: Our study emphasizes that the use of different phenotypes in GWAS may be useful to unravel the full spectrum of host genetic factors that may be associated with the clinical course of HIV-1 infection.

  10. Genome-Wide Association Scan in HIV-1-Infected Individuals Identifying Variants Influencing Disease Course

    van Manen, Daniëlle; Delaneau, Olivier; Kootstra, Neeltje A.; Boeser-Nunnink, Brigitte D.; Limou, Sophie; Bol, Sebastiaan M.; Burger, Judith A.; Zwinderman, Aeilko H.; Moerland, Perry D.; van 't Slot, Ruben; Zagury, Jean-François; van 't Wout, Angélique B.; Schuitemaker, Hanneke

    2011-01-01

    Background AIDS develops typically after 7–11 years of untreated HIV-1 infection, with extremes of very rapid disease progression (<2 years) and long-term non-progression (>15 years). To reveal additional host genetic factors that may impact on the clinical course of HIV-1 infection, we designed a genome-wide association study (GWAS) in 404 participants of the Amsterdam Cohort Studies on HIV-1 infection and AIDS. Methods The association of SNP genotypes with the clinical course of HIV-1 infection was tested in Cox regression survival analyses using AIDS-diagnosis and AIDS-related death as endpoints. Results Multiple, not previously identified SNPs, were identified to be strongly associated with disease progression after HIV-1 infection, albeit not genome-wide significant. However, three independent SNPs in the top ten associations between SNP genotypes and time between seroconversion and AIDS-diagnosis, and one from the top ten associations between SNP genotypes and time between seroconversion and AIDS-related death, had P-values smaller than 0.05 in the French Genomics of Resistance to Immunodeficiency Virus cohort on disease progression. Conclusions Our study emphasizes that the use of different phenotypes in GWAS may be useful to unravel the full spectrum of host genetic factors that may be associated with the clinical course of HIV-1 infection. PMID:21811574

  11. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain

    Sükösd, Zsuzsanna; Andersen, Ebbe Sloth; Seemann, Ernst Stefan

    2015-01-01

    of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interesting, a set of long distance interactions form a core organizing structure (COS) that organize the genome into three major structural domains. Despite overlapping...

  12. Phylogeny and resistance profiles of HIV-1 POL sequences from rectal biopsies and blood

    Katzenstein, T L; Petersen, A B; Storgaard, M

    2010-01-01

    The phylogeny and resistance profiles of human immunodeficiency virus type 1 (HIV-1) protease (PR) and reverse transcriptase (RT) sequences were compared among six patients with HIV-1 who had received numerous treatments. RNA and DNA fractions were obtained from concurrent blood and rectal biopsy samples. Phylogenetic trees and resistance profiles showed that the rectal mucosa and the peripheral blood mononuclear cells (PBMCs) harbored different HIV-1 strains. The resistance-associated mutations found in each strain corresponded to the treatment history of the patients. The resistance mutations acquired during earlier treatment regimens were detected in the sequences obtained from the rectal samples and in the PBMCs in several of the patients. Also, differences in the resistance profiles were observed between anatomical sites and between RNA and DNA fractions. Thus, a single sample probably...

  13. Structural determinants and mechanism of HIV-1 genome packaging.

    Lu, Kun; Heng, Xiao; Summers, Michael F

    2011-07-22

    Like all retroviruses, the human immunodeficiency virus selectively packages two copies of its unspliced RNA genome, both of which are utilized for strand-transfer-mediated recombination during reverse transcription-a process that enables rapid evolution under environmental and chemotherapeutic pressures. The viral RNA appears to be selected for packaging as a dimer, and there is evidence that dimerization and packaging are mechanistically coupled. Both processes are mediated by interactions between the nucleocapsid domains of a small number of assembling viral Gag polyproteins and RNA elements within the 5'-untranslated region of the genome. A number of secondary structures have been predicted for regions of the genome that are responsible for packaging, and high-resolution structures have been determined for a few small RNA fragments and protein-RNA complexes. However, major questions regarding the RNA structures (and potentially the structural changes) that are responsible for dimeric genome selection remain unanswered. Here, we review efforts that have been made to identify the molecular determinants and mechanism of human immunodeficiency virus type 1 genome packaging. Copyright © 2011 Elsevier Ltd. All rights reserved.

  14. Forced evolution of a regulatory RNA helix in the HIV-1 genome

    Berkhout, B.; Klaver, B.; Das, A. T.

    1997-01-01

    The 5' and 3' ends of the HIV-1 RNA genome form a repeat (R) element that encodes a double stem-loop structure (the TAR and polyA hairpins). Phylogenetic analysis of the polyA hairpin in different human and simian immunodeficiency viruses suggests that the thermodynamic stability of the helix is

  15. Distinct binding interactions of HIV-1 Gag to Psi and non-Psi RNAs: Implications for viral genomic RNA packaging

    Webb, Joseph A.; Jones, Christopher P.; Parent, Leslie J.; Rouzina, Ioulia; Musier-Forsyth, Karin

    2013-01-01

    The mechanism underlying the selective packaging of genomic RNA into HIV-1 virions is not known. This paper provides important new biophysical insights into the nature of protein–RNA interactions responsible for HIV-1 genome packaging by quantifying the electrostatic and hydrophobic contributions to specific and nonspecific RNA.

  16. Analysis of HIV-1 intersubtype recombination breakpoints suggests region with high pairing probability may be a more fundamental factor than sequence similarity affecting HIV-1 recombination.

    Jia, Lei; Li, Lin; Gui, Tao; Liu, Siyang; Li, Hanping; Han, Jingwan; Guo, Wei; Liu, Yongjian; Li, Jingyun

    2016-09-21

    With increasing data on HIV-1, a more relevant molecular model describing mechanism details of HIV-1 genetic recombination usually requires upgrades. Currently an incomplete structural understanding of the copy choice mechanism along with several other issues in the field that lack elucidation led us to perform an analysis of the correlation between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarity to further explore structural mechanisms. Near full length sequences of URFs from Asia, Europe, and Africa (one sequence/patient), and representative sequences of worldwide CRFs were retrieved from the Los Alamos HIV database. Their recombination patterns were analyzed by jpHMM in detail. Then the relationships between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarities were investigated. Pearson correlation test showed that all URF groups and the CRF group exhibit the same breakpoint distribution pattern. Additionally, the Wilcoxon two-sample test indicated a significant and inexplicable limitation of recombination in regions with high pairing probability. These regions have been found to be strongly conserved across distinct biological states (i.e., strong intersubtype similarity), and genetic similarity has been determined to be a very important factor promoting recombination. Thus, the results revealed an unexpected disagreement between intersubtype similarity and breakpoint distribution, which were further confirmed by genetic similarity analysis. Our analysis reveals a critical conflict between results from natural HIV-1 isolates and those from HIV-1-based assay vectors in which genetic similarity has been shown to be a very critical factor promoting recombination. These results indicate the region with high-pairing probabilities may be a more fundamental factor affecting HIV-1 recombination than sequence similarity in natural HIV-1 infections. Our

  17. Structure and possible function of a G-quadruplex in the long terminal repeat of the proviral HIV-1 genome.

    De Nicola, Beatrice; Lech, Christopher J; Heddi, Brahim; Regmi, Sagar; Frasson, Ilaria; Perrone, Rosalba; Richter, Sara N; Phan, Anh Tuân

    2016-07-27

    The long terminal repeat (LTR) of the proviral human immunodeficiency virus (HIV)-1 genome is integral to virus transcription and host cell infection. The guanine-rich U3 region within the LTR promoter, previously shown to form G-quadruplex structures, represents an attractive target to inhibit HIV transcription and replication. In this work, we report the structure of a biologically relevant G-quadruplex within the LTR promoter region of HIV-1. The guanine-rich sequence designated LTR-IV forms a well-defined structure in physiological cationic solution. The nuclear magnetic resonance (NMR) structure of this sequence reveals a parallel-stranded G-quadruplex containing a single-nucleotide thymine bulge, which participates in a conserved stacking interaction with a neighboring single-nucleotide adenine loop. Transcription analysis in a HIV-1 replication competent cell indicates that the LTR-IV region may act as a modulator of G-quadruplex formation in the LTR promoter. Consequently, the LTR-IV G-quadruplex structure presented within this work could represent a valuable target for the design of HIV therapeutics. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Sieve analysis of breakthrough HIV-1 sequences in HVTN 505 identifies vaccine pressure targeting the CD4 binding site of Env-gp120.

    deCamp, Allan C; Rolland, Morgane; Edlefsen, Paul T; Sanders-Buell, Eric; Hall, Breana; Magaret, Craig A; Fiore-Gartland, Andrew J; Juraska, Michal; Carpp, Lindsay N; Karuna, Shelly T; Bose, Meera; LePore, Steven; Miller, Shana; O'Sullivan, Annemarie; Poltavee, Kultida; Bai, Hongjun; Dommaraju, Kalpana; Zhao, Hong; Wong, Kim; Chen, Lennie; Ahmed, Hasan; Goodman, Derrick; Tay, Matthew Z; Gottardo, Raphael; Koup, Richard A; Bailer, Robert; Mascola, John R; Graham, Barney S; Roederer, Mario; O'Connell, Robert J; Michael, Nelson L; Robb, Merlin L; Adams, Elizabeth; D'Souza, Patricia; Kublin, James; Corey, Lawrence; Geraghty, Daniel E; Frahm, Nicole; Tomaras, Georgia D; McElrath, M Juliana; Frenkel, Lisa; Styrchak, Sheila; Tovanabutra, Sodsai; Sobieszczyk, Magdalena E; Hammer, Scott M; Kim, Jerome H; Mullins, James I; Gilbert, Peter B

    2017-01-01

    Although the HVTN 505 DNA/recombinant adenovirus type 5 vector HIV-1 vaccine trial showed no overall efficacy, analysis of breakthrough HIV-1 sequences in participants can help determine whether vaccine-induced immune responses impacted viruses that caused infection. We analyzed 480 HIV-1 genomes sampled from 27 vaccine and 20 placebo recipients and found that intra-host HIV-1 diversity was significantly lower in vaccine recipients (P ≤ 0.04, Q-values ≤ 0.09) in Gag, Pol, Vif and envelope glycoprotein gp120 (Env-gp120). Furthermore, Env-gp120 sequences from vaccine recipients were significantly more distant from the subtype B vaccine insert than sequences from placebo recipients (P = 0.01, Q-value = 0.12). These vaccine effects were associated with signatures mapping to CD4 binding site and CD4-induced monoclonal antibody footprints. These results suggest either (i) no vaccine efficacy to block acquisition of any viral genotype but vaccine-accelerated Env evolution post-acquisition; or (ii) vaccine efficacy against HIV-1s with Env sequences closest to the vaccine insert combined with increased acquisition due to other factors, potentially including the vaccine vector.

  19. Comprehensive sieve analysis of breakthrough HIV-1 sequences in the RV144 vaccine efficacy trial.

    Edlefsen, Paul T; Rolland, Morgane; Hertz, Tomer; Tovanabutra, Sodsai; Gartland, Andrew J; deCamp, Allan C; Magaret, Craig A; Ahmed, Hasan; Gottardo, Raphael; Juraska, Michal; McCoy, Connor; Larsen, Brendan B; Sanders-Buell, Eric; Carrico, Chris; Menis, Sergey; Kijak, Gustavo H; Bose, Meera; Arroyo, Miguel A; O'Connell, Robert J; Nitayaphan, Sorachai; Pitisuttithum, Punnee; Kaewkungwal, Jaranit; Rerks-Ngarm, Supachai; Robb, Merlin L; Kirys, Tatsiana; Georgiev, Ivelin S; Kwong, Peter D; Scheffler, Konrad; Pond, Sergei L Kosakovsky; Carlson, Jonathan M; Michael, Nelson L; Schief, William R; Mullins, James I; Kim, Jerome H; Gilbert, Peter B

    2015-02-01

    The RV144 clinical trial showed the partial efficacy of a vaccine regimen with an estimated vaccine efficacy (VE) of 31% for protecting low-risk Thai volunteers against acquisition of HIV-1. The impact of vaccine-induced immune responses can be investigated through sieve analysis of HIV-1 breakthrough infections (infected vaccine and placebo recipients). A V1/V2-targeted comparison of the genomes of HIV-1 breakthrough viruses identified two V2 amino acid sites that differed between the vaccine and placebo groups. Here we extended the V1/V2 analysis to the entire HIV-1 genome using an array of methods based on individual sites, k-mers and genes/proteins. We identified 56 amino acid sites or "signatures" and 119 k-mers that differed between the vaccine and placebo groups. Of those, 19 sites and 38 k-mers were located in the regions comprising the RV144 vaccine (Env-gp120, Gag, and Pro). The nine signature sites in Env-gp120 were significantly enriched for known antibody-associated sites (p = 0.0021). In particular, site 317 in the third variable loop (V3) overlapped with a hotspot of antibody recognition, and sites 369 and 424 were linked to CD4 binding site neutralization. The identified signature sites significantly covaried with other sites across the genome (mean = 32.1) more than did non-signature sites (mean = 0.9) (p ... analysis of the breakthrough infections in the RV144 trial, this work describes a set of statistical methods and tools applicable to analysis of breakthrough infection genomes in general vaccine efficacy trials for diverse pathogens.

  20. Particle infectivity of HIV-1 full-length genome infectious molecular clones in a subtype C heterosexual transmission pair following high fidelity amplification and unbiased cloning

    Deymier, Martin J., E-mail: mdeymie@emory.edu [Emory Vaccine Center, Yerkes National Primate Research Center, 954 Gatewood Road NE, Atlanta, GA 30329 (United States); Claiborne, Daniel T., E-mail: dclaibo@emory.edu [Emory Vaccine Center, Yerkes National Primate Research Center, 954 Gatewood Road NE, Atlanta, GA 30329 (United States); Ende, Zachary, E-mail: zende@emory.edu [Emory Vaccine Center, Yerkes National Primate Research Center, 954 Gatewood Road NE, Atlanta, GA 30329 (United States); Ratner, Hannah K., E-mail: hannah.ratner@emory.edu [Emory Vaccine Center, Yerkes National Primate Research Center, 954 Gatewood Road NE, Atlanta, GA 30329 (United States); Kilembe, William, E-mail: wkilembe@rzhrg-mail.org [Zambia-Emory HIV Research Project (ZEHRP), B22/737 Mwembelelo, Emmasdale Post Net 412, P/BagE891, Lusaka (Zambia); Allen, Susan, E-mail: sallen5@emory.edu [Zambia-Emory HIV Research Project (ZEHRP), B22/737 Mwembelelo, Emmasdale Post Net 412, P/BagE891, Lusaka (Zambia); Department of Pathology and Laboratory Medicine, Emory University, Atlanta, GA (United States); Hunter, Eric, E-mail: eric.hunter2@emory.edu [Emory Vaccine Center, Yerkes National Primate Research Center, 954 Gatewood Road NE, Atlanta, GA 30329 (United States); Department of Pathology and Laboratory Medicine, Emory University, Atlanta, GA (United States)

    2014-11-15

    The high genetic diversity of HIV-1 impedes high throughput, large-scale sequencing and full-length genome cloning by common restriction enzyme based methods. Applying novel methods that employ a high-fidelity polymerase for amplification and an unbiased fusion-based cloning strategy, we have generated several HIV-1 full-length genome infectious molecular clones from an epidemiologically linked transmission pair. These clones represent the transmitted/founder virus and phylogenetically diverse non-transmitted variants from the chronically infected individual's diverse quasispecies near the time of transmission. We demonstrate that, using this approach, PCR-induced mutations in full-length clones derived from their cognate single genome amplicons are rare. Furthermore, all eight non-transmitted genomes tested produced functional virus with a range of infectivities, belying the previous assumption that a majority of circulating viruses in chronic HIV-1 infection are defective. Thus, these methods provide important tools to update protocols in molecular biology that can be universally applied to the study of human viral pathogens. - Highlights: • Our novel methodology demonstrates accurate amplification and cloning of full-length HIV-1 genomes. • A majority of plasma derived HIV variants from a chronically infected individual are infectious. • The transmitted/founder was more infectious than the majority of the variants from the chronically infected donor.

  1. Particle infectivity of HIV-1 full-length genome infectious molecular clones in a subtype C heterosexual transmission pair following high fidelity amplification and unbiased cloning

    Deymier, Martin J.; Claiborne, Daniel T.; Ende, Zachary; Ratner, Hannah K.; Kilembe, William; Allen, Susan; Hunter, Eric

    2014-01-01

    The high genetic diversity of HIV-1 impedes high throughput, large-scale sequencing and full-length genome cloning by common restriction enzyme based methods. Applying novel methods that employ a high-fidelity polymerase for amplification and an unbiased fusion-based cloning strategy, we have generated several HIV-1 full-length genome infectious molecular clones from an epidemiologically linked transmission pair. These clones represent the transmitted/founder virus and phylogenetically diverse non-transmitted variants from the chronically infected individual's diverse quasispecies near the time of transmission. We demonstrate that, using this approach, PCR-induced mutations in full-length clones derived from their cognate single genome amplicons are rare. Furthermore, all eight non-transmitted genomes tested produced functional virus with a range of infectivities, belying the previous assumption that a majority of circulating viruses in chronic HIV-1 infection are defective. Thus, these methods provide important tools to update protocols in molecular biology that can be universally applied to the study of human viral pathogens. - Highlights: • Our novel methodology demonstrates accurate amplification and cloning of full-length HIV-1 genomes. • A majority of plasma derived HIV variants from a chronically infected individual are infectious. • The transmitted/founder was more infectious than the majority of the variants from the chronically infected donor

  2. Use of Four Next-Generation Sequencing Platforms to Determine HIV-1 Coreceptor Tropism

    Archer, J.; Weber, Jan; Henry, K.; Winner, D.; Gibson, R.; Lee, L.; Paxinos, E.; Arts, E. J.; Robertson, D. L.; Mimms, L.; Quinones-Mateu, M. E.

    2012-01-01

    Vol. 7, No. 11 (2012), e49602/1-e49602/17. E-ISSN 1932-6203. R&D Projects: GA MŠk(CZ) LK11207. Institutional research plan: CEZ:AV0Z40550506. Keywords: HIV-1 tropism; V3 region; deep sequencing. Subject RIV: EE - Microbiology, Virology. Impact factor: 3.730, year: 2012. http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0049602

  3. Opening of the TAR hairpin in the HIV-1 genome causes aberrant RNA dimerization and packaging

    Das Atze T

    2012-07-01

    Full Text Available Abstract Background: The TAR hairpin is present at both the 5′ and 3′ end of the HIV-1 RNA genome. The 5′ element binds the viral Tat protein and is essential for Tat-mediated activation of transcription. We recently observed that complete TAR deletion is allowed in the context of an HIV-1 variant that does not depend on this Tat-TAR axis for transcription. Mutations that open the 5′ stem-loop structure did however affect the leader RNA conformation and resulted in a severe replication defect. In this study, we set out to analyze which step of the HIV-1 replication cycle is affected by this conformational change of the leader RNA. Results: We demonstrate that opening the 5′ TAR structure through a deletion in either side of the stem region caused aberrant dimerization and reduced packaging of the unspliced viral RNA genome. In contrast, truncation of the TAR hairpin through deletions in both sides of the stem did not affect RNA dimer formation and packaging. Conclusions: These results demonstrate that, although the TAR hairpin is not essential for RNA dimerization and packaging, mutations in TAR can significantly affect these processes through misfolding of the relevant RNA signals.

  4. Comprehensive sieve analysis of breakthrough HIV-1 sequences in the RV144 vaccine efficacy trial.

    Paul T Edlefsen

    2015-02-01

    Full Text Available The RV144 clinical trial showed the partial efficacy of a vaccine regimen with an estimated vaccine efficacy (VE) of 31% for protecting low-risk Thai volunteers against acquisition of HIV-1. The impact of vaccine-induced immune responses can be investigated through sieve analysis of HIV-1 breakthrough infections (infected vaccine and placebo recipients). A V1/V2-targeted comparison of the genomes of HIV-1 breakthrough viruses identified two V2 amino acid sites that differed between the vaccine and placebo groups. Here we extended the V1/V2 analysis to the entire HIV-1 genome using an array of methods based on individual sites, k-mers and genes/proteins. We identified 56 amino acid sites or "signatures" and 119 k-mers that differed between the vaccine and placebo groups. Of those, 19 sites and 38 k-mers were located in the regions comprising the RV144 vaccine (Env-gp120, Gag, and Pro). The nine signature sites in Env-gp120 were significantly enriched for known antibody-associated sites (p = 0.0021). In particular, site 317 in the third variable loop (V3) overlapped with a hotspot of antibody recognition, and sites 369 and 424 were linked to CD4 binding site neutralization. The identified signature sites significantly covaried with other sites across the genome (mean = 32.1) more than did non-signature sites (mean = 0.9) (p < 0.0001), suggesting functional and/or structural relevance of the signature sites. Since signature sites were not preferentially restricted to the vaccine immunogens and because most of the associations were insignificant following correction for multiple testing, we predict that few of the genetic differences are strongly linked to the RV144 vaccine-induced immune pressure. In addition to presenting results of the first complete-genome analysis of the breakthrough infections in the RV144 trial, this work describes a set of statistical methods and tools applicable to analysis of breakthrough infection genomes in general vaccine
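    The k-mer comparison mentioned in this abstract can be illustrated with a minimal Python sketch. It only tabulates, for each k-mer, the fraction of sequences in each group that carry it; the statistical tests and multiple-testing correction used in the study are not reproduced here. The function names and the toy sequences are hypothetical, and ignoring the position of a k-mer within the protein is a simplifying assumption.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of an amino acid sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_carrier_counts(sequences, k):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        counts.update(set(kmers(seq, k)))
    return counts


def differing_kmers(vaccine, placebo, k):
    """List (k-mer, vaccine frequency, placebo frequency) for k-mers whose
    carrier frequency differs between groups, largest difference first."""
    v = kmer_carrier_counts(vaccine, k)
    p = kmer_carrier_counts(placebo, k)
    rows = []
    for kmer in set(v) | set(p):
        fv = v[kmer] / len(vaccine)
        fp = p[kmer] / len(placebo)
        if fv != fp:
            rows.append((kmer, fv, fp))
    return sorted(rows, key=lambda r: (-abs(r[1] - r[2]), r[0]))


# Toy input, not data from the trial.
vaccine_group = ["ACDE", "ACDF"]
placebo_group = ["ACDE", "ACDE"]
print(differing_kmers(vaccine_group, placebo_group, k=3))
```

    A real sieve analysis would follow this tabulation with a per-k-mer test and a correction for multiple comparisons, which the abstract notes removed most associations.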

  5. Mutation of HIV-1 Genomes in a Clinical Population Treated with the Mutagenic Nucleoside KP1461

    Mullins, James I.; Heath, Laura; Hughes, James P.; Kicha, Jessica; Styrchak, Sheila; Wong, Kim G.; Rao, Ushnal; Hansen, Alexis; Harris, Kevin S.; Laurent, Jean-Pierre; Li, Deyu; Simpson, Jeffrey H.; Essigmann, John M.; Loeb, Lawrence A.; Parkins, Jeffrey

    2011-01-01

    The deoxycytidine analog KP1212, and its prodrug KP1461, are prototypes of a new class of antiretroviral drugs designed to increase viral mutation rates, with the goal of eventually causing the collapse of the viral population. Here we present an extensive analysis of viral sequences from HIV-1 infected volunteers from the first "mechanism validation" phase II clinical trial of a mutagenic base analog in which individuals previously treated with antiviral drugs received 1600 mg of KP1461 twic...

  6. Yeast genome sequencing:

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  7. Identification of a novel splice acceptor in the HIV-1 genome: independent expression of the cytoplasmic tail of the envelope protein

    Berkhout, B.; van Wamel, J. L.

    1996-01-01

    Multiple splicing sites exist in the RNA genome of the human immunodeficiency virus type 1 (HIV-1). In a screen for subgenomic forms of the HIV-1 genome that could be transferred to fresh cells by virus infection, we identified a novel spliced variant of HIV-1 RNA that uses a hitherto unknown splice

  8. Natural selection among Eurasians at genomic regions associated with HIV-1 control

    Allison David B

    2011-06-01

    Full Text Available Abstract Background: HIV susceptibility and pathogenicity exhibit both interindividual and intergroup variability. The etiology of intergroup variability is still poorly understood, and could be partly linked to genetic differences among racial/ethnic groups. These genetic differences may be traceable to different regimes of natural selection in the 60,000 years since the human radiation out of Africa. Here, we examine population differentiation and haplotype patterns at several loci identified through genome-wide association studies on HIV-1 control, as determined by viral-load setpoint, in European and African-American populations. We use genome-wide data from the Human Genome Diversity Project, consisting of 53 world-wide populations, to compare measures of FST and relative extended haplotype homozygosity (REHH) at these candidate loci to the rest of the respective chromosome. Results: We find that the Europe-Middle East and Europe-South Asia pairwise FST in the most strongly associated region are elevated compared to most pairwise comparisons with the sub-Saharan African group, which exhibit very low FST. We also find genetic signatures of recent positive selection (higher REHH) at these associated regions among all groups except for sub-Saharan Africans and Native Americans. This pattern is consistent with one in which genetic differentiation, possibly due to diversifying/positive selection, occurred at these loci among Eurasians. Conclusions: These findings are concordant with those from earlier studies suggesting recent evolutionary change at immunity-related genomic regions among Europeans, and shed light on the potential genetic and evolutionary origin of population differences in HIV-1 control.
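    For readers unfamiliar with pairwise FST, the following Python sketch shows one textbook form of the statistic for a single biallelic locus, computed from the allele frequencies of two populations as (H_T - H_S) / H_T. This is an illustration only: the abstract does not say which FST estimator the authors used, and the equal weighting of the two populations is an assumption made here.

```python
def pairwise_fst(p1, p2):
    """Wright's FST for one biallelic locus and two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    H_S is the mean within-population expected heterozygosity and H_T the
    expected heterozygosity of the pooled population.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # Locus is fixed for the same allele in both populations.
        return 0.0
    return (h_t - h_s) / h_t


print(pairwise_fst(0.5, 0.5))  # identical frequencies, no differentiation
print(pairwise_fst(0.0, 1.0))  # fixed for different alleles, complete differentiation
```

    Values near 0 correspond to the "very low FST" reported for comparisons with the sub-Saharan African group, and larger values to the elevated Eurasian comparisons.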

  9. Improved therapy-success prediction with GSS estimated from clinical HIV-1 sequences.

    Pironti, Alejandro; Pfeifer, Nico; Kaiser, Rolf; Walter, Hauke; Lengauer, Thomas

    2014-01-01

    Rules-based HIV-1 drug-resistance interpretation (DRI) systems disregard many amino-acid positions of the drug's target protein. The aims of this study are (1) the development of a drug-resistance interpretation system that is based on HIV-1 sequences from clinical practice rather than hard-to-get phenotypes, and (2) the assessment of the benefit of taking all available amino-acid positions into account for DRI. A dataset containing 34,934 therapy-naïve and 30,520 drug-exposed HIV-1 pol sequences with treatment history was extracted from the EuResist database and the Los Alamos National Laboratory database. 2,550 therapy-change-episode baseline sequences (TCEB) were assigned to test set A. Test set B contains 1,084 TCEB from the HIVdb TCE repository. Sequences from patients absent in the test sets were used to train three linear support vector machines to produce scores that predict drug exposure pertaining to each of 20 antiretrovirals: the first one uses the full amino-acid sequences (DEfull), the second one only considers IAS drug-resistance positions (DEonlyIAS), and the third one disregards IAS drug-resistance positions (DEnoIAS). For performance comparison, test sets A and B were evaluated with DEfull, DEnoIAS, DEonlyIAS, geno2pheno[resistance], HIVdb, ANRS, HIV-GRADE, and REGA. Clinically-validated cut-offs were used to convert the continuous output of the first four methods into susceptible-intermediate-resistant (SIR) predictions. With each method, a genetic susceptibility score (GSS) was calculated for each therapy episode in each test set by converting the SIR prediction for its compounds to integer: S=2, I=1, and R=0. The GSS were used to predict therapy success as defined by the EuResist standard datum definition. Statistical significance was assessed using a Wilcoxon signed-rank test. A comparison of the therapy-success prediction performances among the different interpretation systems for test set A can be found in Table 1, while those for test set
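    The GSS conversion described in this abstract (S=2, I=1, R=0) can be sketched in a few lines of Python. The abstract does not state how the per-compound integers are combined, so summing them over the regimen is an assumption here, and the regimen shown is an illustrative example rather than data from the study.

```python
# Integer coding taken from the abstract: S=2, I=1, R=0.
SIR_TO_SCORE = {"S": 2, "I": 1, "R": 0}


def genetic_susceptibility_score(sir_predictions):
    """Sum the integer-coded SIR predictions over the compounds of a regimen.

    sir_predictions: mapping of compound name -> "S", "I" or "R".
    Summation across compounds is an assumption; the abstract only gives
    the per-compound coding.
    """
    try:
        return sum(SIR_TO_SCORE[call] for call in sir_predictions.values())
    except KeyError as exc:
        raise ValueError(f"unknown SIR call: {exc.args[0]!r}") from None


# Illustrative three-drug regimen with one call of each kind.
regimen = {"AZT": "S", "3TC": "R", "EFV": "I"}
print(genetic_susceptibility_score(regimen))  # 3
```

    Under this coding a fully susceptible three-drug regimen would score 6 and a fully resistant one 0.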

  10. High-resolution deep sequencing reveals biodiversity, population structure, and persistence of HIV-1 quasispecies within host ecosystems

    Yin Li

    2012-12-01

    Full Text Available Abstract Background: Deep sequencing provides the basis for analysis of biodiversity of taxonomically similar organisms in an environment. While extensively applied to microbiome studies, population genetics studies of viruses are limited. To define the scope of HIV-1 population biodiversity within infected individuals, a suite of phylogenetic and population genetic algorithms was applied to HIV-1 envelope hypervariable domain 3 (Env V3) within peripheral blood mononuclear cells from a group of perinatally HIV-1 subtype B infected, therapy-naïve children. Results: Biodiversity of HIV-1 Env V3 quasispecies ranged from about 70 to 270 unique sequence clusters across individuals. Viral population structure was organized into a limited number of clusters that included the dominant variants combined with multiple clusters of low frequency variants. Next generation viral quasispecies evolved from low frequency variants at earlier time points through multiple non-synonymous changes in lineages within the evolutionary landscape. Minor V3 variants detected as long as four years after infection co-localized in phylogenetic reconstructions with early transmitting viruses or with subsequent plasma virus circulating two years later. Conclusions: Deep sequencing defines HIV-1 population complexity and structure, reveals the ebb and flow of dominant and rare viral variants in the host ecosystem, and identifies an evolutionary record of low-frequency cell-associated viral V3 variants that persist for years. The bioinformatics pipeline developed for HIV-1 can be applied for biodiversity studies of virome populations in human, animal, or plant ecosystems.

  11. Cellular specificity of HIV-1 replication can be controlled by LTR sequences

    Reed-Inderbitzin, Edward; Maury, Wendy

    2003-01-01

    Two well-established determinants of retroviral tropism are envelope sequences that regulate entry and LTR sequences that can regulate viral expression in a cell-specific manner. Studies with human immunodeficiency virus-1 (HIV-1) have demonstrated that tropism of this virus maps primarily to variable envelope sequences. Studies have demonstrated that T cell and macrophage-specific transcription factor binding motifs exist in the upstream region of the LTR U3; however, the ability of the core enhancer/promoter proximal elements (two NF-κB and three Sp1 sites) to function well in macrophages and T cells has led many to conclude that HIV LTR sequences are not primary determinants of HIV tropism. To determine if cellular specificity could be imparted to HIV by the core enhancer elements, the enhancer/promoter proximal region of the HIV LTR was substituted with motifs that control gene expression in a myeloid-specific manner. The enhancer region from equine infectious anemia virus (EIAV), when substituted for the HIV enhancer/promoter proximal region, was found to drive expression in a macrophage-specific manner and was responsive to HIV Tat. The addition of a 5' methylation-dependent binding site (MDBP) and a promoter proximal Sp1 motif increased expression without altering cellular specificity. Spacing between the promoter proximal region and the TATA box was also found to influence LTR activity. Infectivity studies using chimeric LTRs within the context of a dual-tropic infectious molecular clone established that these LTRs directed HIV replication and production of infectious virions in macrophages but not primary T cells or T cell lines. This investigation demonstrates that cellular specificity can be imparted onto HIV-1 replication at the level of viral transcription and not entry

  12. Evaluation of sequence ambiguities of the HIV-1 pol gene as a method to identify recent HIV-1 infection in transmitted drug resistance surveys.

    Andersson, Emmi; Shao, Wei; Bontell, Irene; Cham, Fatim; Cuong, Do Duy; Wondwossen, Amogne; Morris, Lynn; Hunt, Gillian; Sönnerborg, Anders; Bertagnolio, Silvia; Maldarelli, Frank; Jordan, Michael R

    2013-08-01

    Identification of recent HIV infection within populations is a public health priority for accurate estimation of HIV incidence rates and transmitted drug resistance at population level. Determining HIV incidence rates by prospective follow-up of HIV-uninfected individuals is challenging and serological assays have important limitations. HIV diversity within an infected host increases with duration of infection. We explore a simple bioinformatics approach to assess viral diversity by determining the percentage of ambiguous base calls in sequences derived from standard genotyping of HIV-1 protease and reverse transcriptase. Sequences from 691 recently infected (≤1 year) and chronically infected (>1 year) individuals from Sweden, Vietnam and Ethiopia were analyzed for ambiguity. A significant difference (p<0.0001) in the proportion of ambiguous bases was observed between sequences from individuals with recent and chronic infection in both HIV-1 subtype B and non-B infection, consistent with previous studies. In our analysis, a cutoff of <0.47% ambiguous base calls identified recent infection with a sensitivity and specificity of 88.8% and 74.6% respectively. 1,728 protease and reverse transcriptase sequences from 36 surveys of transmitted HIV drug resistance performed following World Health Organization guidance were analyzed for ambiguity. The 0.47% ambiguity cutoff was applied and survey sequences were classified as likely derived from recently or chronically infected individuals. 71% of patients were classified as likely to have been infected within one year of genotyping but results varied considerably amongst surveys. This bioinformatics approach may provide supporting population-level information to identify recent infection but its application is limited by infection with more than one viral variant, decreasing viral diversity in advanced disease and technical aspects of population based sequencing. Standardization of sequencing techniques and base calling
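    The ambiguity-based classification described in this abstract can be sketched as follows. The 0.47% cutoff comes from the abstract; everything else is an assumption made for illustration, including the function names, the decision to skip gap characters, and the treatment of every non-ACGT symbol (including N) as an ambiguous call.

```python
UNAMBIGUOUS_BASES = set("ACGT")
IGNORED_CHARACTERS = set("-. \n")  # alignment gaps and whitespace (assumption)


def percent_ambiguous(sequence):
    """Percentage of base calls in a sequence that are not plain A, C, G or T."""
    bases = [b for b in sequence.upper() if b not in IGNORED_CHARACTERS]
    if not bases:
        raise ValueError("sequence contains no base calls")
    ambiguous = sum(1 for b in bases if b not in UNAMBIGUOUS_BASES)
    return 100.0 * ambiguous / len(bases)


def classify_infection(sequence, cutoff_percent=0.47):
    """Label a sequence 'recent' if its ambiguity is below the cutoff, else 'chronic'."""
    return "recent" if percent_ambiguous(sequence) < cutoff_percent else "chronic"


# Synthetic 1000-base examples, not real genotyping data.
low_ambiguity = "A" * 996 + "R" * 4    # 0.4% ambiguous
high_ambiguity = "A" * 995 + "R" * 5   # 0.5% ambiguous
print(classify_infection(low_ambiguity))   # recent
print(classify_infection(high_ambiguity))  # chronic
```

    As the abstract cautions, such a label is only supporting population-level information; dual infection, advanced disease and base-calling practice all affect the ambiguity percentage.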

  13. Mutation of HIV-1 genomes in a clinical population treated with the mutagenic nucleoside KP1461.

    Mullins, James I; Heath, Laura; Hughes, James P; Kicha, Jessica; Styrchak, Sheila; Wong, Kim G; Rao, Ushnal; Hansen, Alexis; Harris, Kevin S; Laurent, Jean-Pierre; Li, Deyu; Simpson, Jeffrey H; Essigmann, John M; Loeb, Lawrence A; Parkins, Jeffrey

    2011-01-14

    The deoxycytidine analog KP1212, and its prodrug KP1461, are prototypes of a new class of antiretroviral drugs designed to increase viral mutation rates, with the goal of eventually causing the collapse of the viral population. Here we present an extensive analysis of viral sequences from HIV-1 infected volunteers from the first "mechanism validation" phase II clinical trial of a mutagenic base analog in which individuals previously treated with antiviral drugs received 1600 mg of KP1461 twice per day for 124 days. Plasma viral loads were not reduced, and overall levels of viral mutation were not increased during this short-term study, however, the mutation spectrum of HIV was altered. A large number (N = 105 per sample) of sequences were analyzed, each derived from individual HIV-1 RNA templates, after 0, 56 and 124 days of therapy from 10 treated and 10 untreated control individuals (>7.1 million base pairs of unique viral templates were sequenced). We found that private mutations, those not found in more than one viral sequence and likely to have occurred in the most recent rounds of replication, increased in treated individuals relative to controls after 56 (p = 0.038) and 124 (p = 0.002) days of drug treatment. The spectrum of mutations observed in the treated group showed an excess of A to G and G to A mutations (p = 0.01), and to a lesser extent T to C and C to T mutations (p = 0.09), as predicted by the mechanism of action of the drug. These results validate the proposed mechanism of action in humans and should spur development of this novel antiretroviral approach.
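    The notion of a private mutation used in this abstract (a mutation not found in more than one viral sequence) and the tally of the mutation spectrum can be illustrated with a short Python sketch. It assumes pre-aligned sequences of equal length and a hypothetical reference sequence against which substitutions are called; the abstract does not describe the authors' actual pipeline, so none of these choices should be read as theirs.

```python
from collections import Counter


def private_mutations(reference, sequences):
    """Return sorted (position, ref_base, alt_base) substitutions that occur
    in exactly one of the aligned sequences."""
    seen = Counter()
    for seq in sequences:
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the reference length")
        for pos, (ref, alt) in enumerate(zip(reference, seq)):
            # Only count clean base substitutions; gaps and ambiguity codes are skipped.
            if alt != ref and ref in "ACGT" and alt in "ACGT":
                seen[(pos, ref, alt)] += 1
    return sorted(m for m, n in seen.items() if n == 1)


def mutation_spectrum(mutations):
    """Count substitutions by type, e.g. 'A>G' or 'C>T'."""
    return Counter(f"{ref}>{alt}" for _, ref, alt in mutations)


# Toy alignment, not data from the trial.
reference = "AAGGTC"
sample = ["GAGGTC", "AAAGTC", "AAAGTT"]
private = private_mutations(reference, sample)
print(private)                     # [(0, 'A', 'G'), (5, 'C', 'T')]
print(mutation_spectrum(private))
```

    In this toy input the G>A change at position 2 appears in two sequences and is therefore not private, which mirrors the distinction drawn in the abstract.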

  14. Mutation of HIV-1 genomes in a clinical population treated with the mutagenic nucleoside KP1461.

    James I Mullins

    2011-01-01

    Full Text Available The deoxycytidine analog KP1212, and its prodrug KP1461, are prototypes of a new class of antiretroviral drugs designed to increase viral mutation rates, with the goal of eventually causing the collapse of the viral population. Here we present an extensive analysis of viral sequences from HIV-1 infected volunteers from the first "mechanism validation" phase II clinical trial of a mutagenic base analog in which individuals previously treated with antiviral drugs received 1600 mg of KP1461 twice per day for 124 days. Plasma viral loads were not reduced, and overall levels of viral mutation were not increased during this short-term study, however, the mutation spectrum of HIV was altered. A large number (N = 105 per sample) of sequences were analyzed, each derived from individual HIV-1 RNA templates, after 0, 56 and 124 days of therapy from 10 treated and 10 untreated control individuals (>7.1 million base pairs of unique viral templates were sequenced). We found that private mutations, those not found in more than one viral sequence and likely to have occurred in the most recent rounds of replication, increased in treated individuals relative to controls after 56 (p = 0.038) and 124 (p = 0.002) days of drug treatment. The spectrum of mutations observed in the treated group showed an excess of A to G and G to A mutations (p = 0.01), and to a lesser extent T to C and C to T mutations (p = 0.09), as predicted by the mechanism of action of the drug. These results validate the proposed mechanism of action in humans and should spur development of this novel antiretroviral approach.

  15. Production of HIV-1 vif mRNA Is Modulated by Natural Nucleotide Variations and SLSA1 RNA Structure in SA1D2prox Genomic Region

    Masako Nomaguchi

    2017-12-01

    Full Text Available Genomic RNA of HIV-1 contains localized structures critical for viral replication. Its structural analysis has demonstrated a stem-loop structure, SLSA1, in a nearby region of HIV-1 genomic splicing acceptor 1 (SA1). We have previously shown that the expression level of vif mRNA is considerably altered by some natural single-nucleotide variations (nSNVs) clustering in the SLSA1 structure. In this study, besides eleven nSNVs previously identified by us, we found a total of nine new nSNVs in the SLSA1-containing sequence from SA1, splicing donor 2, and through to the start codon of Vif that significantly affect the vif mRNA level, and designated the sequence SA1D2prox (142 nucleotides for HIV-1 NL4-3). We then examined by extensive variant and mutagenesis analyses how SA1D2prox sequence and SLSA1 secondary structure are related to vif mRNA level. While the secondary structure and stability of SLSA1 were largely changed by nSNVs and artificial mutations introduced to restore the original NL4-3 form from altered ones by nSNVs, no clear association of the two SLSA1 properties with vif mRNA level was observed. In contrast, when naturally occurring SA1D2prox sequences that contain multiple nSNVs were examined, we attained significant inverse correlation between the vif level and SLSA1 stability. These results may suggest that SA1D2prox sequence adapts over time, and also that the altered SA1D2prox sequence, SLSA1 stability, and vif level are mutually related. In total, we show here that the entire SA1D2prox sequence and SLSA1 stability critically contribute to the modulation of vif mRNA level.

  16. The connection domain in reverse transcriptase facilitates the in vivo annealing of tRNALys3 to HIV-1 genomic RNA

    Niu Meijuan

    2004-10-01

    Full Text Available Abstract The primer tRNA for reverse transcription in HIV-1, tRNALys3, is selectively packaged into the virus during its assembly, and annealed to the viral genomic RNA. The ribonucleoprotein complex that is involved in the packaging and annealing of tRNALys into HIV-1 consists of Gag, GagPol, tRNALys, lysyl-tRNA synthetase (LysRS), and viral genomic RNA. Gag targets tRNALys for viral packaging through Gag's interaction with LysRS, a tRNALys-binding protein, while reverse transcriptase (RT) sequences within GagPol (the thumb domain) bind to tRNALys. The further annealing of tRNALys3 to viral RNA requires nucleocapsid (NC) sequences in Gag, but not the NC sequences in GagPol. In this report, we further show that while the RT connection domain in GagPol is not required for tRNALys3 packaging into the virus, it is required for tRNALys3 annealing to the viral RNA genome.

  17. Analysis of the initiating events in HIV-1 particle assembly and genome packaging.

    Sebla B Kutluay

    2010-11-01

    Full Text Available HIV-1 Gag drives a number of events during the genesis of virions and is the only viral protein required for the assembly of virus-like particles in vitro and in cells. Although a reasonable understanding of the processes that accompany the later stages of HIV-1 assembly has accrued, events that occur at the initiation of assembly are less well defined. In this regard, important uncertainties include where in the cell Gag first multimerizes and interacts with the viral RNA, and whether Gag-RNA interaction requires or induces Gag multimerization in a living cell. To address these questions, we developed assays in which protein crosslinking and RNA/protein co-immunoprecipitation were coupled with membrane flotation analyses in transfected or infected cells. We found that interaction between Gag and viral RNA occurred in the cytoplasm and was independent of the ability of Gag to localize to the plasma membrane. However, Gag:RNA binding was stabilized by the C-terminal domain (CTD) of capsid (CA), which participates in Gag-Gag interactions. We also found that Gag was present as monomers and low-order multimers (e.g. dimers) but did not form higher-order multimers in the cytoplasm. Rather, high-order multimers formed only at the plasma membrane and required the presence of a membrane-binding signal, but not a Gag domain (the CA-CTD) that is essential for complete particle assembly. Finally, sequential RNA-immunoprecipitation assays indicated that at least a fraction of Gag molecules can form multimers on viral genomes in the cytoplasm. Taken together, our results suggest that HIV-1 particle assembly is initiated by the interaction between Gag and viral RNA in the cytoplasm and that this initial Gag-RNA encounter involves Gag monomers or low order multimers. These interactions per se do not induce or require high-order Gag multimerization in the cytoplasm. Instead, membrane interactions are necessary for higher order Gag multimerization and subsequent

  18. Genomic sequencing in clinical trials

    Mestan, Karen K; Ilkhanoff, Leonard; Mouli, Samdeep; Lin, Simon

    2011-01-01

    Abstract Human genome sequencing is the process by which the exact order of nucleic acid base pairs in the 24 human chromosomes is determined. Since the completion of the Human Genome Project in 2003, genomic sequencing is rapidly becoming a major part of our translational research efforts to understand and improve human health and disease. This article reviews the current and future directions of clinical research with respect to genomic sequencing, a technology that is just beginning to fin...

  19. Production of Mucosally Transmissible SHIV Challenge Stocks from HIV-1 Circulating Recombinant Form 01_AE env Sequences.

    Lawrence J Tartaglia

    2016-02-01

    Full Text Available Simian-human immunodeficiency virus (SHIV) challenge stocks are critical for preclinical testing of vaccines, antibodies, and other interventions aimed to prevent HIV-1. A major unmet need for the field has been the lack of a SHIV challenge stock expressing circulating recombinant form 01_AE (CRF01_AE) env sequences. We therefore sought to develop mucosally transmissible SHIV challenge stocks containing HIV-1 CRF01_AE env derived from acutely HIV-1 infected individuals from Thailand. SHIV-AE6, SHIV-AE6RM, and SHIV-AE16 contained env sequences that were >99% identical to the original HIV-1 isolate and did not require in vivo passaging. These viruses exhibited CCR5 tropism and displayed a tier 2 neutralization phenotype. These challenge stocks efficiently infected rhesus monkeys by the intrarectal route, replicated to high levels during acute infection, and established chronic viremia in a subset of animals. SHIV-AE16 was titrated for use in single, high dose as well as repetitive, low dose intrarectal challenge studies. These SHIV challenge stocks should facilitate the preclinical evaluation of vaccines, monoclonal antibodies, and other interventions targeted at preventing HIV-1 CRF01_AE infection.

  20. Study of HIV-1 subtypes in serodiscordant couples attending an integrated counselling and testing centre in Mumbai using heteroduplex mobility analysis and DNA sequencing

    Mehta P

    2010-01-01

    Full Text Available Aims: To determine the prevalent subtypes of HIV-1 in serodiscordant couples. Setting: Integrated Counselling and Testing Centre (ICTC), Department of Microbiology. Study Design: Prospective pilot study. Participants: Thirty HIV-1 serodiscordant couples. Inclusion Criteria: a) Documentation of HIV-1 infection in one partner and seronegative status in the other, current history of continued unprotected sexual activity within the partnership, demonstration that they have been in a partnership for at least 1 year and are not currently on highly active antiretroviral therapy (HAART); b) willingness of both partners to provide written informed consent including consent to continued couple counselling for 3 months. Materials and Methods: HIV-1 subtyping was carried out by heteroduplex mobility analysis (HMA) by amplifying env region; and DNA sequencing by amplifying gag region. Results: HIV-1 env gene was amplified successfully in 10/30 samples; gag gene, in 25/30 samples; and both env and gag gene were amplified successfully in 5/30 samples. HIV-1 subtype C was detected from 21 samples; subtype B, from 7; and subtype A, from 2. Sample from 1 positive partner was detected as subtype C by env HMA and subtype B by gag sequencing. Conclusion: HIV-1 subtype C was found to be the predominant subtype of HIV-1 in serodiscordant couples attending our ICTC, followed by HIV-1 subtype B and HIV-1 subtype A, respectively. DNA sequencing was found to be the most reliable method for determining the subtypes of HIV-1.

  1. Characterization of partial and near full-length genomes of HIV-1 strains sampled from recently infected individuals in São Paulo, Brazil.

    Sabri Saeed Sanabani

    Full Text Available BACKGROUND: Genetic variability is a major feature of human immunodeficiency virus type 1 (HIV-1) and is considered the key factor frustrating efforts to halt the HIV epidemic. A proper understanding of HIV-1 genomic diversity is a fundamental prerequisite for proper epidemiology, genetic diagnosis, and successful drug and vaccine design. Here, we report on the partial and near full-length genomic (NFLG) variability of HIV-1 isolates from a well-characterized cohort of recently infected patients in São Paulo, Brazil. METHODOLOGY: HIV-1 proviral DNA was extracted from the peripheral blood mononuclear cells of 113 participants. The NFLG and partial fragments were determined by overlapping nested PCR and direct sequencing. The data were phylogenetically analyzed. RESULTS: Of the 113 samples (90.3% male; median age 31 years; 79.6% homosexual men) studied, 77 (68.1%) NFLGs and 32 (29.3%) partial fragments were successfully subtyped. Of the successfully subtyped sequences, 88 (80.7%) were subtype B sequences, 12 (11%) BF1 recombinants, 3 (2.8%) subtype C sequences, 2 (1.8%) BC recombinants and subclade F1 each, 1 (0.9%) CRF02_AG, and 1 (0.9%) CRF31_BC. Primary drug resistance mutations were observed in 14/101 (13.9%) of samples, with 5.9% being resistant to protease inhibitors and nucleoside reverse transcriptase inhibitors (NRTIs) and 4.9% resistant to non-NRTIs. Predictions of viral tropism were determined for 86 individuals. X4 or X4 dual or mixed-tropic viruses (X4/DM) were seen in 26 (30.2%) of subjects. The proportion of X4 viruses in homosexuals was detected in 19/69 (27.5%). CONCLUSIONS: Our results confirm the existence of various HIV-1 subtypes circulating in São Paulo, and indicate that subtype B accounts for the majority of infections. Antiretroviral (ARV) drug resistance is relatively common among recently infected patients. The proportion of X4 viruses in homosexuals was significantly higher than the proportion seen in other study populations.

  2. Genome Sequence Databases (Overview): Sequencing and Assembly

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
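    To put the two error rates quoted in this abstract side by side, the short Python calculation below estimates the expected number of consensus errors for a genome of a given size. The 5 Mb genome length is a hypothetical value chosen for illustration, and treating errors as uniformly distributed along the genome is a simplifying assumption.

```python
DRAFT_BASES_PER_ERROR = 2_000       # "about 1/2000 bp" for a draft assembly
FINISHED_BASES_PER_ERROR = 10_000   # "~1 error/10,000 bp" for a finished genome


def expected_errors(genome_length_bp, bases_per_error):
    """Expected number of consensus errors at one error per `bases_per_error` bases."""
    if genome_length_bp < 0 or bases_per_error <= 0:
        raise ValueError("length must be non-negative and rate denominator positive")
    return genome_length_bp / bases_per_error


genome_length = 5_000_000  # hypothetical 5 Mb microbial genome
print(expected_errors(genome_length, DRAFT_BASES_PER_ERROR))     # 2500.0
print(expected_errors(genome_length, FINISHED_BASES_PER_ERROR))  # 500.0
```

    Under these assumptions, finishing reduces the expected error count fivefold, in addition to closing gaps and correcting misassembled regions.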

  3. Genome Sequences of Oryza Species

    Kumagai, Masahiko; Tanaka, Tsuyoshi; Ohyanagi, Hajime; Hsing, Yue-Ie C.; Itoh, Takeshi

    2018-01-01

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  4. Genome Sequences of Oryza Species

    Kumagai, Masahiko

    2018-02-14

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  5. The C-terminal sequence of IFITM1 regulates its anti-HIV-1 activity.

    Rui Jia

    The interferon-inducible transmembrane (IFITM) proteins inhibit a wide range of viruses. We previously reported the inhibition of human immunodeficiency virus type 1 (HIV-1) strain BH10 by human IFITM1, 2 and 3. It is unknown whether other HIV-1 strains are similarly inhibited by IFITMs and whether there exists a viral countermeasure to overcome IFITM inhibition. We report here that the HIV-1 NL4-3 strain (HIV-1NL4-3) is not restricted by IFITM1 and its viral envelope glycoprotein is partly responsible for this insensitivity. However, HIV-1NL4-3 is profoundly inhibited by an IFITM1 mutant, known as Δ(117-125), which has 9 amino acids deleted at the C-terminus. In contrast to the wild type IFITM1, which does not affect HIV-1 entry, the Δ(117-125) mutant diminishes HIV-1NL4-3 entry by 3-fold. This inhibition correlates with the predominant localization of Δ(117-125) to the plasma membrane where HIV-1 entry occurs. In spite of strong conservation of IFITM1 among most species, mouse IFITM1 is 19 amino acids shorter at its C-terminus as compared to human IFITM1 and, like the human IFITM1 mutant Δ(117-125), mouse IFITM1 also inhibits HIV-1 entry. This is the first report illustrating the role of viral envelope protein in overcoming IFITM1 restriction. The results also demonstrate the importance of the C-terminal region of IFITM1 in modulating the antiviral function through controlling protein subcellular localization.

  6. High prevalence of HIV-1 transmitted drug-resistance mutations from proviral DNA massively parallel sequencing data of therapy-naïve chronically infected Brazilian blood donors.

    Rodrigo Pessôa

    An improved understanding of the prevalence of low-abundance transmitted drug-resistance mutations (TDRM) in therapy-naïve HIV-1-infected patients may help determine which patients are the best candidates for therapy. In this study, we aimed to obtain a comprehensive picture of the evolving HIV-1 TDRM across the massive parallel sequences (MPS) of the entire proviral genome in well-characterized Brazilian blood donors naïve to antiretroviral drugs. The MPS data from 128 samples used in the analysis were sourced from Brazilian blood donors and were previously classified by less-sensitive (LS) or "detuned" enzyme immunoassay as non-recent or longstanding HIV-1 infections. The Stanford HIV Resistance Database (HIVDB v6.2) and IAS-USA mutation lists were used to interpret the pattern of drug resistance. The minority variants with TDRM were identified using a threshold of ≥ 1.0% and ≤ 20% of the reads sequenced. The rate of TDRM in the MPS data of the proviral genome was compared with the corresponding published consensus sequences of their plasma viruses. No TDRM were detected in the integrase or envelope regions. The overall prevalence of TDRM in the protease (PR) and reverse transcriptase (RT) regions of the HIV-1 pol gene was 44.5% (57/128), including any mutations to the nucleoside analogue reverse transcriptase inhibitors (NRTI) and non-nucleoside analogue reverse transcriptase inhibitors (NNRTI). Of the 57 subjects, 43 (75.4%) harbored a minority variant containing at least one clinically relevant TDRM. Among the 43 subjects, 33 (76.7%) had detectable minority resistant variants to NRTIs, 6 (13.9%) to NNRTIs, and 16 (37.2%) to PR inhibitors. Comparing viral sequences from both sources, plasma and cells, showed that MPS of proviral DNA disclosed 48 TDRM previously missed by plasma bulk analysis. Our findings revealed a high prevalence of TDRM found in this group, as the use of MPS drastically increased the detection of these
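
    The minority-variant rule stated in this abstract (a variant counts as a minority variant when it makes up at least 1.0% and at most 20% of the reads) can be sketched as a simple frequency check. The function name, its parameters, and the read counts in the example are hypothetical and chosen only to illustrate the threshold; they are not data from the study.

```python
def is_minority_variant(variant_reads: int, total_reads: int,
                        low: float = 0.01, high: float = 0.20) -> bool:
    """True if the variant frequency lies within [low, high] of the reads at a site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    frequency = variant_reads / total_reads
    return low <= frequency <= high


# Hypothetical read counts at one position (illustrative only).
print(is_minority_variant(50, 1000))   # 5% of reads: within the window
print(is_minority_variant(5, 1000))    # 0.5% of reads: below the lower cutoff
print(is_minority_variant(300, 1000))  # 30% of reads: above the upper cutoff
```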

  7. Targeted sequencing of plant genomes

    Mark D. Huynh

    2014-01-01

    Next-generation sequencing (NGS) has revolutionized the field of genetics by providing a means for fast and relatively affordable sequencing. With the advancement of NGS, whole-genome sequencing (WGS) has become more commonplace. However, sequencing an entire genome is still not cost effective or even beneficial in all cases. In studies that do not require a whole-...

  8. Molecular mimicry of human tRNALys anti-codon domain by HIV-1 RNA genome facilitates tRNA primer annealing.

    Jones, Christopher P; Saadatmand, Jenan; Kleiman, Lawrence; Musier-Forsyth, Karin

    2013-02-01

    The primer for initiating reverse transcription in human immunodeficiency virus type 1 (HIV-1) is tRNA(Lys3). Host cell tRNA(Lys) is selectively packaged into HIV-1 through a specific interaction between the major tRNA(Lys)-binding protein, human lysyl-tRNA synthetase (hLysRS), and the viral proteins Gag and GagPol. Annealing of the tRNA primer onto the complementary primer-binding site (PBS) in viral RNA is mediated by the nucleocapsid domain of Gag. The mechanism by which tRNA(Lys3) is targeted to the PBS and released from hLysRS prior to annealing is unknown. Here, we show that hLysRS specifically binds to a tRNA anti-codon-like element (TLE) in the HIV-1 genome, which mimics the anti-codon loop of tRNA(Lys) and is located proximal to the PBS. Mutation of the U-rich sequence within the TLE attenuates binding of hLysRS in vitro and reduces the amount of annealed tRNA(Lys3) in virions. Thus, LysRS binds specifically to the TLE, which is part of a larger LysRS binding domain in the viral RNA that includes elements of the Psi packaging signal. Our results suggest that HIV-1 uses molecular mimicry of the anti-codon of tRNA(Lys) to increase the efficiency of tRNA(Lys3) annealing to viral RNA.

  9. Probing the HIV-1 genomic RNA trafficking pathway and dimerization by genetic recombination and single virion analyses.

    Michael D Moore

    2009-10-01

    Once transcribed, the nascent full-length RNA of HIV-1 must travel to the appropriate host cell sites to be translated or to find a partner RNA for copackaging to form newly generated viruses. In this report, we sought to delineate the location where HIV-1 RNA initiates dimerization and the influence of the RNA transport pathway used by the virus on downstream events essential to viral replication. Using a cell-fusion-dependent recombination assay, we demonstrate that the two RNAs destined for copackaging into the same virion select each other mostly within the cytoplasm. Moreover, by manipulating the RNA export element in the viral genome, we show that the export pathway taken is important for the ability of RNA molecules derived from two viruses to interact and be copackaged. These results further illustrate that at the point of dimerization the two main cellular export pathways are partially distinct. Lastly, by providing Gag in trans, we have demonstrated that Gag is able to package RNA from either export pathway, irrespective of the transport pathway used by the gag mRNA. These findings provide unique insights into the process of RNA export in general, and more specifically, of HIV-1 genomic RNA trafficking.

  10. The highly conserved codon following the slippery sequence supports -1 frameshift efficiency at the HIV-1 frameshift site.

    Suneeth F Mathew

    HIV-1 utilises -1 programmed ribosomal frameshifting to translate structural and enzymatic domains in a defined proportion required for replication. A slippery sequence, U UUU UUA, and a stem-loop are well-defined RNA features modulating -1 frameshifting in HIV-1. The GGG glycine codon immediately following the slippery sequence (the 'intercodon') contributes structurally to the start of the stem-loop but has no defined role in current models of the frameshift mechanism, as slippage is inferred to occur before the intercodon has reached the ribosomal decoding site. This GGG codon is highly conserved in natural isolates of HIV. When the natural intercodon was replaced with a stop codon, two different decoding molecules (eRF1 protein or a cognate suppressor tRNA) were able to access and decode the intercodon prior to -1 frameshifting. This implies significant slippage occurs when the intercodon is in the (perhaps distorted) ribosomal A site. We accommodate the influence of the intercodon in a model of frame maintenance versus frameshifting in HIV-1.
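
    To make the sequence features in this abstract concrete, the sketch below scans an RNA string for the slippery heptamer U UUU UUA immediately followed by the GGG intercodon. It is a minimal pattern search, not the authors' method; the function name and the short example fragment are assumptions for illustration.

```python
import re

# Slippery heptamer (U UUU UUA) and the GGG "intercodon" that follows it,
# both as named in the abstract.
SLIPPERY = "UUUUUUA"
INTERCODON = "GGG"


def find_frameshift_sites(rna: str) -> list[int]:
    """Return 0-based start positions of UUUUUUA immediately followed by GGG."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    pattern = re.compile(f"(?={SLIPPERY}{INTERCODON})")  # lookahead: zero-width match
    return [match.start() for match in pattern.finditer(rna)]


# Short illustrative fragment (hypothetical input, not a curated isolate sequence).
fragment = "AGGCUAAUUUUUUAGGGAAGAUCUGG"
print(find_frameshift_sites(fragment))
```

    The search only locates the motif; it says nothing about the downstream stem-loop, which the abstract names as the other feature modulating frameshifting.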

  11. The Sequenced Angiosperm Genomes and Genome Databases.

    Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

    2018-01-01

    Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of humans, animals, and the planet Earth. Despite the numerous advances in genome reports or sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in database reconstruction in the last few years, here we provide a comprehensive review of three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular with specific groups of researchers, while a timely-updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote collaborative studies of important questions in plant biology.

  12. Genome-wide association study on the development of cross-reactive neutralizing antibodies in HIV-1 infected individuals.

    Zelda Euler

    Broadly neutralizing antibodies may protect against HIV-1 acquisition. In natural infection, only 10-30% of patients have cross-reactive neutralizing humoral immunity, which may relate to viral and/or host factors. To explore the role of host genetic markers in the formation of cross-reactive neutralizing activity (CrNA) in HIV-1 infected individuals, we performed a genome-wide association study (GWAS) in participants of the Amsterdam Cohort Studies with known CrNA in their sera. Single-nucleotide polymorphisms (SNPs) with the strongest P-values are located in the major histocompatibility complex (MHC) region, close to MICA (P = 7.68 × 10^-7), HLA-B (P = 6.96 × 10^-6) and in the coding region of HCP5 (P = 1.34 × 10^-5). However, none of the signals reached genome-wide significance. Our findings underline the potential involvement of genes close to or within the MHC region with the development of CrNA.

  13. Genome-Wide Association Study on the Development of Cross-Reactive Neutralizing Antibodies in HIV-1 Infected Individuals

    Euler, Zelda; van Gils, Marit J.; Boeser-Nunnink, Brigitte D.; Schuitemaker, Hanneke; van Manen, Daniëlle

    2013-01-01

    Broadly neutralizing antibodies may protect against HIV-1 acquisition. In natural infection, only 10–30% of patients have cross-reactive neutralizing humoral immunity, which may relate to viral and/or host factors. To explore the role of host genetic markers in the formation of cross-reactive neutralizing activity (CrNA) in HIV-1 infected individuals, we performed a genome-wide association study (GWAS) in participants of the Amsterdam Cohort Studies with known CrNA in their sera. Single-nucleotide polymorphisms (SNPs) with the strongest P-values are located in the major histocompatibility complex (MHC) region, close to MICA (P = 7.68 × 10^-7), HLA-B (P = 6.96 × 10^-6) and in the coding region of HCP5 (P = 1.34 × 10^-5). However, none of the signals reached genome-wide significance. Our findings underline the potential involvement of genes close to or within the MHC region with the development of CrNA. PMID:23372753
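
    The statement that none of the signals reached genome-wide significance can be checked against the three P-values reported in this abstract. The sketch below assumes the conventional cutoff of 5 × 10^-8; the abstract does not state which threshold the authors used, so that value and the function name are assumptions.

```python
# P-values as reported in the abstract.
reported = {"MICA": 7.68e-7, "HLA-B": 6.96e-6, "HCP5": 1.34e-5}

# Assumed threshold: the conventional genome-wide cutoff; not stated in the abstract.
GENOME_WIDE_ALPHA = 5e-8


def genome_wide_hits(p_values: dict[str, float],
                     alpha: float = GENOME_WIDE_ALPHA) -> list[str]:
    """Return the loci whose P-value falls below `alpha`, sorted by name."""
    return sorted(name for name, p in p_values.items() if p < alpha)


print(genome_wide_hits(reported))  # an empty list means no locus passes the cutoff
```

    With the assumed cutoff, the strongest signal (MICA) is still more than an order of magnitude above the threshold, which agrees with the abstract's conclusion.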

  14. A comparison of parallel pyrosequencing and Sanger clone-based sequencing and its impact on the characterization of the genetic diversity of HIV-1.

    Binhua Liang

    BACKGROUND: Pyrosequencing technology has the potential to rapidly sequence HIV-1 viral quasispecies without requiring the traditional approach of cloning. In this study, we investigated the utility of ultra-deep pyrosequencing to characterize genetic diversity of the HIV-1 gag quasispecies and assessed the possible contribution of pyrosequencing technology in studying HIV-1 biology and evolution. METHODOLOGY/PRINCIPAL FINDINGS: HIV-1 gag gene was amplified from 96 patients using nested PCR. The PCR products were cloned and sequenced using capillary based Sanger fluorescent dideoxy termination sequencing. The same PCR products were also directly sequenced using the 454 pyrosequencing technology. The two sequencing methods were evaluated for their ability to characterize quasispecies variation, and to reveal sites under host immune pressure for their putative functional significance. A total of 14,034 variations were identified by 454 pyrosequencing versus 3,632 variations by Sanger clone-based (SCB) sequencing. 11,050 of these variations were detected only by pyrosequencing. These undetected variations were located in the HIV-1 Gag region which is known to contain putative cytotoxic T lymphocyte (CTL) and neutralizing antibody epitopes, and sites related to virus assembly and packaging. Analysis of the positively selected sites derived by the two sequencing methods identified several differences. All of them were located within the CTL epitope regions. CONCLUSIONS/SIGNIFICANCE: Ultra-deep pyrosequencing has proven to be a powerful tool for characterization of HIV-1 genetic diversity with enhanced sensitivity, efficiency, and accuracy. It also improved reliability of downstream evolutionary and functional analysis of HIV-1 quasispecies.

  15. Deep sequencing analysis of HIV-1 reverse transcriptase at baseline and time of failure in patients receiving rilpivirine in the phase III studies ECHO and THRIVE.

    Van Eygen, Veerle; Thys, Kim; Van Hove, Carl; Rimsky, Laurence T; De Meyer, Sandra; Aerssens, Jeroen; Picchio, Gaston; Vingerhoets, Johan

    2016-05-01

    Minority variants (1.0-25.0%) were evaluated by deep sequencing (DS) at baseline and virological failure (VF) in a selection of antiretroviral treatment-naïve, HIV-1-infected patients from the rilpivirine ECHO/THRIVE phase III studies. Linkage between frequently emerging resistance-associated mutations (RAMs) was determined. DS (Illumina®) and population sequencing (PS) results were available at baseline for 47 VFs and time of failure for 48 VFs; and at baseline for 49 responders matched for baseline characteristics. Minority mutations were accurately detected at frequencies down to 1.2% of the HIV-1 quasispecies. No baseline minority rilpivirine RAMs were detected in VFs; one responder carried 1.9% F227C. Baseline minority mutations associated with resistance to other non-nucleoside reverse transcriptase inhibitors (NNRTIs) were detected in 8/47 VFs (17.0%) and 7/49 responders (14.3%). Baseline minority nucleoside/nucleotide reverse transcriptase inhibitor (NRTI) RAMs M184V and L210W were each detected in one VF (none in responders). At failure, two patients without NNRTI RAMs by PS carried minority rilpivirine RAMs K101E and/or E138K; and five additional patients carried other minority NNRTI RAMs V90I, V106I, V179I, V189I, and Y188H. Overall at failure, minority NNRTI RAMs and NRTI RAMs were found in 29/48 (60.4%) and 16/48 VFs (33.3%), respectively. Linkage analysis showed that E138K and K101E were usually not observed on the same viral genome. In conclusion, baseline minority rilpivirine RAMs and other NNRTI/NRTI RAMs were uncommon in the rilpivirine arm of the ECHO and THRIVE studies. DS at failure showed emerging NNRTI resistant minority variants in seven rilpivirine VFs who had no detectable NNRTI RAMs by PS. © 2015 Wiley Periodicals, Inc.

  16. Sequence requirements of the HIV-1 protease flap region determined by saturation mutagenesis and kinetic analysis of flap mutants

    Shao, Wei; Everitt, Lorraine; Manchester, Marianne; Loeb, Daniel D.; Hutchison, Clyde A.; Swanstrom, Ronald

    1997-01-01

    The retroviral proteases (PRs) have a structural feature called the flap, which consists of a short antiparallel β-sheet with a turn. The flap extends over the substrate binding cleft and must be flexible to allow entry and exit of the polypeptide substrates and products. We analyzed the sequence requirements of the amino acids within the flap region (positions 46–56) of the HIV-1 PR. The phenotypes of 131 substitution mutants were determined using a bacterial expression system. Four of the mutant PRs with mutations in different regions of the flap were selected for kinetic analysis. Our phenotypic analysis, considered in the context of published structures of the HIV-1 PR with bound substrate analogs, shows that: (i) Met-46 and Phe-53 participate in hydrophobic interactions on the solvent-exposed face of the flap; (ii) Ile-47, Ile-54, and Val-56 participate in hydrophobic interactions on the inner face of the flap; (iii) Ile-50 has hydrophobic interactions at the distance of both the δ and γ carbons; (iv) the three glycine residues in the β-turn of the flap are virtually intolerant of substitutions. Among these mutant PRs, we have identified changes in both kcat and Km. These results establish the nature of the side chain requirements at each position in the flap and document a role for the flap in both substrate binding and catalysis. PMID:9122179

  17. Phylogenetic analysis of HIV-1 reverse transcriptase sequences from 382 patients recruited in JJ Hospital of Mumbai, India, between 2002 and 2008.

    Deshpande, Alaka; Jauvin, Valerie; Pinson, Patricia; Jeannot, Anne Cecile; Fleury, Herve J

    2009-06-01

    Analysis of reverse transcriptase (RT) sequences of 382 HIV-1 isolates from untreated and treated patients recruited in JJ Hospital (Mumbai, India) between 2002 and 2008 shows that subtype C is largely predominant (98%) and that non-C sequences cluster with A1, B, CRF01_AE, and CRF06_cpx.

  18. Genotypic Resistance Tests Sequences Reveal the Role of Marginalized Populations in HIV-1 Transmission in Switzerland.

    Shilaih, Mohaned; Marzel, Alex; Yang, Wan Lin; Scherrer, Alexandra U; Schüpbach, Jörg; Böni, Jürg; Yerly, Sabine; Hirsch, Hans H; Aubert, Vincent; Cavassini, Matthias; Klimkait, Thomas; Vernazza, Pietro L; Bernasconi, Enos; Furrer, Hansjakob; Günthard, Huldrych F; Kouyos, Roger

    2016-06-14

    Targeting hard-to-reach/marginalized populations is essential for preventing HIV-transmission. A unique opportunity to identify such populations in Switzerland is provided by a database of all genotypic-resistance-tests from Switzerland, including both sequences from the Swiss HIV Cohort Study (SHCS) and non-cohort sequences. A phylogenetic tree was built using 11,127 SHCS and 2,875 Swiss non-SHCS sequences. Demographics were imputed for non-SHCS patients using a phylogenetic proximity approach. Factors associated with non-cohort outbreaks were determined using logistic regression. Non-B subtype (univariable odds-ratio (OR): 1.9; 95% confidence interval (CI): 1.8-2.1), female gender (OR: 1.6; 95% CI: 1.4-1.7), black ethnicity (OR: 1.9; 95% CI: 1.7-2.1) and heterosexual transmission group (OR: 1.8; 95% CI: 1.6-2.0) were all associated with underrepresentation in the SHCS. We found 344 purely non-SHCS transmission clusters; however, these outbreaks were small (median 2, maximum 7 patients) with a strong overlap with the SHCS. 65% of non-SHCS sequences were part of clusters composed of >= 50% SHCS sequences. Our data suggest that marginalized populations are underrepresented in the SHCS. However, the limited size of outbreaks among non-SHCS patients in care implies that no major HIV outbreak in Switzerland was missed by the SHCS surveillance. This study demonstrates the potential of sequence data to assess and extend the scope of infectious-disease surveillance.

  19. Sensitive non-radioactive detection of HIV-1

    Teglbjærg, Lars Stubbe; Nielsen, C; Hansen, J E

    1992-01-01

    This report describes the use of the polymerase chain reaction (PCR) for the non-radioactive detection of HIV-1 proviral genomic sequences in HIV-1 infected cells. We have developed a sensitive assay, using three different sets of nested primers and our results show that this method is superior...... to standard PCR for the detection of HIV-1 DNA. The assay described features the use of a simple and inexpensive sample preparation technique and a non-radioactive hybridization procedure for confirmation of results. To test the suitability of the assay for clinical purposes, we tested cell samples from 76...

  20. Genome-wide association study identifies single nucleotide polymorphism in DYRK1A associated with replication of HIV-1 in monocyte-derived macrophages.

    Sebastiaan M Bol

    2011-02-01

    HIV-1 infected macrophages play an important role in rendering resting T cells permissive for infection, in spreading HIV-1 to T cells, and in the pathogenesis of AIDS dementia. During highly active anti-retroviral treatment (HAART), macrophages keep producing virus because tissue penetration of antiretrovirals is suboptimal and the efficacy of some is reduced. Thus, to cure HIV-1 infection with antiretrovirals we will also need to efficiently inhibit viral replication in macrophages. The majority of the current drugs block the action of viral enzymes, whereas there is an abundance of yet unidentified host factors that could be targeted. We here present results from a genome-wide association study identifying novel genetic polymorphisms that affect in vitro HIV-1 replication in macrophages. Monocyte-derived macrophages from 393 blood donors were infected with HIV-1 and viral replication was determined using Gag p24 antigen levels. Genomic DNA from individuals with macrophages that had relatively low (n = 96) or high (n = 96) p24 production was used for SNP genotyping with the Illumina 610 Quad beadchip. A total of 494,656 SNPs that passed quality control were tested for association with HIV-1 replication in macrophages, using linear regression. We found a strong association between in vitro HIV-1 replication in monocyte-derived macrophages and SNP rs12483205 in DYRK1A (p = 2.16 × 10^-5). While the association was not genome-wide significant (p < 1 × 10^-7), we could replicate this association using monocyte-derived macrophages from an independent group of 31 individuals (p = 0.0034). Combined analysis of the initial and replication cohort increased the strength of the association (p = 4.84 × 10^-6). In addition, we found this SNP to be associated with HIV-1 disease progression in vivo in two independent cohort studies (p = 0.035 and p = 0.0048). These findings suggest that the kinase DYRK1A is involved in the replication of HIV-1, in vitro in macrophages

  1. Genome-Wide Association Study Identifies Single Nucleotide Polymorphism in DYRK1A Associated with Replication of HIV-1 in Monocyte-Derived Macrophages

    Bol, Sebastiaan M.; Moerland, Perry D.; Limou, Sophie; van Remmerden, Yvonne; Coulonges, Cédric; van Manen, Daniëlle; Herbeck, Joshua T.; Fellay, Jacques; Sieberer, Margit; Sietzema, Jantine G.; van 't Slot, Ruben; Martinson, Jeremy; Zagury, Jean-François; Schuitemaker, Hanneke; van 't Wout, Angélique B.

    2011-01-01

    Background: HIV-1 infected macrophages play an important role in rendering resting T cells permissive for infection, in spreading HIV-1 to T cells, and in the pathogenesis of AIDS dementia. During highly active anti-retroviral treatment (HAART), macrophages keep producing virus because tissue penetration of antiretrovirals is suboptimal and the efficacy of some is reduced. Thus, to cure HIV-1 infection with antiretrovirals we will also need to efficiently inhibit viral replication in macrophages. The majority of the current drugs block the action of viral enzymes, whereas there is an abundance of yet unidentified host factors that could be targeted. We here present results from a genome-wide association study identifying novel genetic polymorphisms that affect in vitro HIV-1 replication in macrophages. Methodology/Principal Findings: Monocyte-derived macrophages from 393 blood donors were infected with HIV-1 and viral replication was determined using Gag p24 antigen levels. Genomic DNA from individuals with macrophages that had relatively low (n = 96) or high (n = 96) p24 production was used for SNP genotyping with the Illumina 610 Quad beadchip. A total of 494,656 SNPs that passed quality control were tested for association with HIV-1 replication in macrophages, using linear regression. We found a strong association between in vitro HIV-1 replication in monocyte-derived macrophages and SNP rs12483205 in DYRK1A (p = 2.16 × 10^-5). While the association was not genome-wide significant (p < 1 × 10^-7), we could replicate this association using monocyte-derived macrophages from an independent group of 31 individuals (p = 0.0034). Combined analysis of the initial and replication cohort increased the strength of the association (p = 4.84 × 10^-6). In addition, we found this SNP to be associated with HIV-1 disease progression in vivo in two independent cohort studies (p = 0.035 and p = 0.0048). Conclusions/Significance: These findings suggest that

  2. Distinct binding interactions of HIV-1 Gag to Psi and non-Psi RNAs: implications for viral genomic RNA packaging.

    Webb, Joseph A; Jones, Christopher P; Parent, Leslie J; Rouzina, Ioulia; Musier-Forsyth, Karin

    2013-08-01

    Despite the vast excess of cellular RNAs, precisely two copies of viral genomic RNA (gRNA) are selectively packaged into new human immunodeficiency virus type 1 (HIV-1) particles via specific interactions between the HIV-1 Gag and the gRNA psi (ψ) packaging signal. Gag consists of the matrix (MA), capsid, nucleocapsid (NC), and p6 domains. Binding of the Gag NC domain to ψ is necessary for gRNA packaging, but the mechanism by which Gag selectively interacts with ψ is unclear. Here, we investigate the binding of NC and Gag variants to an RNA derived from ψ (Psi RNA), as well as to a non-ψ region (TARPolyA). Binding was measured as a function of salt to obtain the effective charge (Zeff) and nonelectrostatic (i.e., specific) component of binding, Kd(1M). Gag binds to Psi RNA with a dramatically reduced Kd(1M) and lower Zeff relative to TARPolyA. NC, GagΔMA, and a dimerization mutant of Gag bind TARPolyA with reduced Zeff relative to WT Gag. Mutations involving the NC zinc finger motifs of Gag or changes to the G-rich NC-binding regions of Psi RNA significantly reduce the nonelectrostatic component of binding, leading to an increase in Zeff. These results show that Gag interacts with gRNA using different binding modes; both the NC and MA domains are bound to RNA in the case of TARPolyA, whereas binding to Psi RNA involves only the NC domain. Taken together, these results suggest a novel mechanism for selective gRNA encapsidation.
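
    The abstract says binding was measured as a function of salt to obtain Zeff and Kd(1M) but does not give the fitting equation. A minimal sketch, assuming a power-law salt dependence Kd = Kd(1M) · [salt]^Zeff (so that log Kd is linear in log [salt], with slope Zeff and intercept log Kd(1M)), is shown below. The functional form, the function name, and the synthetic data are all assumptions for illustration; none are taken from the record.

```python
import math


def fit_salt_dependence(salt_molar, kd_molar):
    """Least-squares line through (log10 salt, log10 Kd).

    Returns (z_eff, kd_1m): the slope and the Kd extrapolated to 1 M salt.
    """
    xs = [math.log10(s) for s in salt_molar]
    ys = [math.log10(k) for k in kd_molar]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # log10 Kd at log10(salt) = 0, i.e. 1 M
    return slope, 10 ** intercept


# Synthetic, noise-free data generated from made-up parameters (not measured values).
TRUE_ZEFF, TRUE_KD_1M = 5.0, 1e-4
salts = [0.05, 0.1, 0.15, 0.2, 0.3]
kds = [TRUE_KD_1M * s ** TRUE_ZEFF for s in salts]

z_eff, kd_1m = fit_salt_dependence(salts, kds)
print(round(z_eff, 3), kd_1m)
```

    In this picture, a lower Kd(1M) corresponds to a stronger nonelectrostatic (specific) contribution, and a lower Zeff to a weaker dependence of binding on salt.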

  3. Prediction of HIV-1 coreceptor usage (tropism) by sequence analysis using a genotypic approach.

    Sierra, Saleta; Kaiser, Rolf; Lübke, Nadine; Thielen, Alexander; Schuelter, Eugen; Heger, Eva; Däumer, Martin; Reuter, Stefan; Esser, Stefan; Fätkenheuer, Gerd; Pfister, Herbert; Oette, Mark; Lengauer, Thomas

    2011-12-01

    Maraviroc (MVC) is the first licensed antiretroviral drug from the class of coreceptor antagonists. It binds to the host coreceptor CCR5, which is used by the majority of HIV strains in order to infect the human immune cells (Fig. 1). Other HIV isolates use a different coreceptor, CXCR4. Which receptor is used is determined in the virus by the Env protein (Fig. 2). Depending on the coreceptor used, the viruses are classified as R5 or X4, respectively. MVC binds to the CCR5 receptor, inhibiting the entry of R5 viruses into the target cell. During the course of disease, X4 viruses may emerge and outgrow the R5 viruses. Determination of coreceptor usage (also called tropism) is therefore mandatory prior to administration of MVC, as demanded by EMA and FDA. The studies for MVC efficiency MOTIVATE, MERIT and 1029 have been performed with the Trofile assay from Monogram, San Francisco, U.S.A. This is a high quality assay based on sophisticated recombinant tests. The acceptance of this test for daily routine is rather low outside of the U.S.A., since European physicians tend to work with decentralized expert laboratories, which also provide concomitant resistance testing. These laboratories have undergone several quality assurance evaluations, the last one being presented in 2011. For several years now, we have performed tropism determinations based on sequence analysis from the HIV env-V3 gene region (V3). This region carries enough information to perform a reliable prediction. The genotypic determination of coreceptor usage presents advantages such as: shorter turnover time (equivalent to resistance testing), lower costs, possibility to adapt the results to the patients' needs and possibility of analysing clinical samples with very low or even undetectable viral load (VL), particularly since the number of samples analysed with VL < 1000 copies/μl roughly increased in the last years (Fig. 3). The main steps for tropism testing (Fig. 4) demonstrated in

  4. Structure-Related Roles for the Conservation of the HIV-1 Fusion Peptide Sequence Revealed by Nuclear Magnetic Resonance.

    Serrano, Soraya; Huarte, Nerea; Rujas, Edurne; Andreu, David; Nieva, José L; Jiménez, María Angeles

    2017-10-17

    Despite extensive characterization of the human immunodeficiency virus type 1 (HIV-1) hydrophobic fusion peptide (FP), the structure-function relationships underlying its extraordinary degree of conservation remain poorly understood. Specifically, the fact that the tandem repeat of the FLGFLG tripeptide is absolutely conserved suggests that high hydrophobicity may not suffice to unleash FP function. Here, we have compared the nuclear magnetic resonance (NMR) structures adopted in nonpolar media by two FP surrogates, wtFP-tag and scrFP-tag, which had equal hydrophobicity but contained wild-type and scrambled core sequences LFLGFLG and FGLLGFL, respectively. In addition, these peptides were tagged at their C-termini with an epitope sequence that folded independently, thereby allowing Western blot detection without interfering with FP structure. We observed similar α-helical FP conformations for both specimens dissolved in the low-polarity medium 25% (v/v) 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP), but important differences in contact with micelles of the membrane mimetic dodecylphosphocholine (DPC). Thus, whereas wtFP-tag preserved a helix displaying a Gly-rich ridge, the scrambled sequence lost in great part the helical structure upon being solubilized in DPC. Western blot analyses further revealed the capacity of wtFP-tag to assemble trimers in membranes, whereas membrane oligomers were not observed in the case of the scrFP-tag sequence. We conclude that, beyond hydrophobicity, preserving sequence order is an important feature for defining the secondary structures and oligomeric states adopted by the HIV FP in membranes.
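    The design of the two surrogates in the abstract above relies on the scrambled core containing exactly the same residues as the wild-type core, only in a different order. A minimal sketch of that check follows; the function and variable names are illustrative assumptions, and only the two core sequences come from the abstract.

```python
from collections import Counter

# Core sequences quoted in the abstract.
WT_CORE = "LFLGFLG"   # wild-type core
SCR_CORE = "FGLLGFL"  # scrambled core


def composition(peptide: str) -> Counter:
    """Count how many times each residue (one-letter code) occurs."""
    return Counter(peptide)


def is_scramble(a: str, b: str) -> bool:
    """True when b has the same residues as a but in a different order."""
    return a != b and composition(a) == composition(b)


if __name__ == "__main__":
    print(dict(composition(WT_CORE)))
    print(is_scramble(WT_CORE, SCR_CORE))
```

    Identical composition is only a proxy for the "equal hydrophobicity" stated in the abstract; it holds for any hydrophobicity scale that sums per-residue values.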

  5. New findings on the d(TGGGAG) sequence: Surprising anti-HIV-1 activity.

    Romanucci, Valeria; Zarrelli, Armando; Liekens, Sandra; Noppen, Sam; Pannecouque, Christophe; Di Fabio, Giovanni

    2018-02-10

    The biological relevance of tetramolecular G-quadruplexes, especially as anti-HIV agents, has been extensively reported in the literature in recent years. In light of our recent results regarding the slow G-quadruplex folding kinetics of ODNs based on the d(TGGGAG) sequence, we report here a systematic anti-HIV screening to investigate the impact of G-quadruplex folding on their anti-HIV activity. In particular, by varying the single-stranded concentrations of the ODNs, we tested a pool of ODN sample solutions with different G-quadruplex concentrations. The anti-HIV assays were designed to favour the limited kinetics involved in the tetramolecular G4 association based on the d(TGGGAG) sequence. To determine the stoichiometry of the G-quadruplex structures under the same experimental conditions as the anti-HIV assays, native gel electrophoresis was performed. The gel confirmed G-quadruplex formation for almost all sample solutions and showed the formation of higher-order G4 structures for the more concentrated ODN solutions. The most significant result is the discovery of potent anti-HIV activity for the G-quadruplex formed by the natural d(TGGGAG) sequence (IC50 = 14 nM), which until now had been reported to be completely inactive against HIV infection.

  6. Measuring replication competent HIV-1: advances and challenges in defining the latent reservoir.

    Wang, Zheng; Simonetti, Francesco R; Siliciano, Robert F; Laird, Gregory M

    2018-02-13

    Antiretroviral therapy cannot cure HIV-1 infection due to the persistence of a small number of latently infected cells harboring replication-competent proviruses. Measuring persistent HIV-1 is challenging, as it consists of a mosaic population of defective and intact proviruses that can shift from a state of latency to active HIV-1 transcription. Due to this complexity, most of the current assays detect multiple categories of persistent HIV-1, leading to an overestimate of the true size of the latent reservoir. Here, we review the development of the viral outgrowth assay, the gold-standard quantification of replication-competent proviruses, and discuss the insights provided by full-length HIV-1 genome sequencing methods, which allowed us to unravel the composition of the proviral landscape. In this review, we provide a dissection of what defines HIV-1 persistence and we examine the unmet needs to measure the efficacy of interventions aimed at eliminating the HIV-1 reservoir.

  7. Sequencing intractable DNA to close microbial genomes.

    Richard A Hurt

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled "intractable", resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC-rich secondary structures, making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  8. Sequencing Intractable DNA to Close Microbial Genomes

    Hurt, Jr., Richard Ashley [ORNL]; Brown, Steven D [ORNL]; Podar, Mircea [ORNL]; Palumbo, Anthony Vito [ORNL]; Elias, Dwayne A [ORNL]

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled intractable, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC-rich secondary structures, making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.
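    The abstract attributes the intractable gaps to GC-rich flanking sequence. As a rough illustration of how such regions might be flagged, the sketch below computes GC fraction in sliding windows. This is not the authors' procedure; the function names, window size and example sequence are assumptions made for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_windows(seq: str, window: int, step: int = 1) -> list[tuple[int, float]]:
    """Return (start, GC fraction) for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical flanking sequence, not taken from either genome.
    example = "ATATATGCGCGCGGCCGCATAT"
    for start, frac in gc_windows(example, window=8, step=4):
        print(start, round(frac, 2))
```

    Windows with a high GC fraction would be candidates for the kind of secondary structure the abstract describes; any threshold is a choice the user would have to make.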

  9. Evaluation of the Roche prototype 454 HIV-1 ultradeep sequencing drug resistance assay in a routine diagnostic laboratory.

    Garcia-Diaz, A; Guerrero-Ramos, A; McCormick, A L; Macartney, M; Conibear, T; Johnson, M A; Haque, T; Webster, D P

    2013-10-01

    Studies have shown that low-frequency resistance mutations can influence treatment outcome. However, the lack of a standardized high-throughput assay has precluded their detection in clinical settings. The objective was to evaluate the performance of the Roche prototype 454 UDS HIV-1 drug resistance assay (UDS assay) in a routine diagnostic laboratory. 50 plasma samples, previously characterized by population sequencing and that had shown ≥1 resistance-associated mutation (RAM), were retrospectively tested by the UDS assay, including 18 B and 32 non-B subtypes; viral loads between 114 and 1,806,407 cp/ml; drug-naive (n=27) and drug-experienced (n=23) individuals. The UDS assay was successful for 37/50 (74%) samples. It detected all RAMs found by population sequencing at frequencies above 20%. In addition, 39 low-frequency RAMs were exclusively detected by the UDS assay at frequencies below 20% in both drug-naive (19/26, 73%) and drug-experienced (9/18, 50%) individuals. UDS results would lead to changes from susceptible to resistant to efavirenz (EFV) in one drug-naive individual with suboptimal response to an EFV-containing regimen, and from susceptible to resistant to lamivudine (3TC) in one drug-naive subject who subsequently failed a 3TC-containing regimen and in a treatment-experienced subject who had failed a 3TC-containing regimen. The UDS assay performed well across a wide range of subtypes and viral loads; it showed perfect agreement with population sequencing for all RAMs analyzed. In addition, the UDS assay detected additional mutations at frequencies below 20% which correlate with patients' treatment history and had in some cases important prognostic implications.

  10. Draft Genome Sequence of Lactobacillus rhamnosus 2166.

    Karlyshev, Andrey V.; Melnikov, Vyacheslav G.; Kosarev, Igor V.; Abramov, Vyacheslav M.

    2014-01-01

    In this report, we present a draft sequence of the genome of Lactobacillus rhamnosus strain 2166, a potential novel probiotic. Genome annotation and read mapping onto a reference genome of L. rhamnosus strain GG allowed for the identification of the differences and similarities in the genomic contents and gene arrangements of these strains.

  11. Stress- and sequence-dependent release into the culture medium of HIV-1 Nef produced in Saccharomyces cerevisiae.

    Macreadie, I G; Castelli, L A; Lucantoni, A; Azad, A A

    1995-09-11

    We have produced human immunodeficiency virus type 1 (HIV-1) Nef (a myristylated 206-amino-acid protein) in Saccharomyces cerevisiae and shown that, while Nef is normally found as a predominantly intracellular protein, amounts up to 40 micrograms/ml of Nef are also released into the extracellular medium during stress. By electrophoretic (SDS-PAGE) analysis the extracellular Nef is indistinguishable from intracellular Nef. Conditions of stress that lead to the release of Nef include elevated levels of copper or magnesium ions or growth at elevated temperatures. This release appears to be dependent upon the N-terminal sequences of Nef, including the presence of a myristylation site. Our observations concerning Nef release in yeast suggest new ways in which the behaviour of Nef should be examined in order to gain further insights into the development of AIDS. If the release of Nef is important in the development of AIDS, our work reveals that Nef-associated symptoms may be reduced or delayed by reducing stresses, such as fevers.

  12. Exploration of the effect of sequence variations located inside the binding pocket of HIV-1 and HIV-2 proteases.

    Triki, Dhoha; Billot, Telli; Visseaux, Benoit; Descamps, Diane; Flatters, Delphine; Camproux, Anne-Claude; Regad, Leslie

    2018-04-10

    HIV-2 protease (PR2) is naturally resistant to most FDA (Food and Drug Administration)-approved HIV-1 protease inhibitors (PIs), a major antiretroviral class. In this study, we compared the PR1 and PR2 binding pockets extracted from structures complexed with 12 ligands. The comparison of PR1 and PR2 pocket properties showed that bound PR2 pockets were more hydrophobic with more oxygen atoms and fewer nitrogen atoms than PR1 pockets. The structural comparison of PR1 and PR2 pockets highlighted structural changes induced by their sequence variations and that were consistent with these property changes. Specifically, substitutions at residues 31, 46, and 82 induced structural changes in their main-chain atoms that could affect PI binding in PR2. In addition, the modelling of PR1 mutant structures containing V32I and L76M substitutions revealed a cooperative mechanism leading to structural deformation of flap-residue 45 that could modify PR2 flexibility. Our results suggest that substitutions in the PR1 and PR2 pockets can modify PI binding and flap flexibility, which could underlie PR2 resistance against PIs. These results provide new insights concerning the structural changes induced by PR1 and PR2 pocket variation changes, improving the understanding of the atomic mechanism of PR2 resistance to PIs.

  13. Snake Genome Sequencing: Results and Future Prospects.

    Kerkkamp, Harald M I; Kini, R Manjunatha; Pospelov, Alexey S; Vonk, Freek J; Henkel, Christiaan V; Richardson, Michael K

    2016-12-01

    Snake genome sequencing is in its infancy, very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  14. Snake Genome Sequencing: Results and Future Prospects

    Harald M. I. Kerkkamp

    2016-12-01

    Snake genome sequencing is in its infancy, very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  15. Nucleic acid amplification of HIV-1 integrase sequence subtypes CRF01_AE and B for development of HIV anti-integrase drug resistance genotyping assay

    Adlar, F. R.; Bela, B.

    2017-08-01

    To anticipate the potential use of anti-integrase drugs in Indonesia for treatment of HIV-1 infection, the development of a drug resistance genotyping assay for anti-integrase is crucial in identifying the genetic drug resistance profile of Indonesian HIV-1 strains. This experiment aimed to amplify a target region in the integrase gene of Indonesian HIV-1 subtypes CRF01_AE and B that contain genetic mutations known to confer resistance to anti-integrase drug. Eleven archived plasma samples from individuals living with HIV-1 were obtained from the Virology and Cancer Pathobiology Research Center for Health Service (VCPRC FKUI-RSCM) laboratory. One of the plasma samples contained HIV-1 subtype B, and the remaining plasma samples contained subtype CRF01_AE. The target regions for all samples were amplified through RT-PCR, with an annealing temperature of 55 °C, using the primer pair AE_POL 4086F and AE_POL 5232R that were designed by VCPRC FKUI-RSCM. The results of this experiment show that 18.2% (2/11) of the samples were successfully amplified using the one-step RT-PCR. While the primer pair was effective in amplifying the target region in the integrase gene sequence for subtype B (100%; 1/1), it had a low efficacy (10%, 1/10) for subtype CRF01_AE. In conclusion, the primer pair can be used to amplify the target region in Indonesian HIV-1 strain subtypes CRF01_AE and B. However, optimization of the PCR condition and an increased number of samples would help to determine an accurate representation of the efficacy of the primer pair.

  16. A high HIV-1 strain variability in London, UK, revealed by full-genome analysis: Results from the ICONIC project

    Frampton, Dan; Gallo Cassarino, Tiziano; Raffle, Jade; Hubb, Jonathan; Ferns, R. Bridget; Waters, Laura; Tong, C. Y. William; Kozlakidis, Zisis; Hayward, Andrew; Kellam, Paul; Pillay, Deenan; Clark, Duncan; Nastouli, Eleni; Leigh Brown, Andrew J.

    2018-01-01

    Background & methods The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. Results The pipeline generated sequences of at least 1Kb of length (median = 7.46Kb, IQR = 4.01Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the down-stream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%) and the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset. 

  17. Modification of a loop sequence between α-helices 6 and 7 of virus capsid (CA) protein in a human immunodeficiency virus type 1 (HIV-1) derivative that has simian immunodeficiency virus (SIVmac239) vif and CA α-helices 4 and 5 loop improves replication in cynomolgus monkey cells

    Adachi Akio

    2009-08-01

    Background: Human immunodeficiency virus type 1 (HIV-1) productively infects only humans and chimpanzees but not cynomolgus or rhesus monkeys, while simian immunodeficiency virus isolated from macaque (SIVmac) readily establishes infection in those monkeys. Several HIV-1 and SIVmac chimeric viruses have been constructed in order to develop an animal model for HIV-1 infection. Construction of an HIV-1 derivative which contains sequences of a SIVmac239 loop between α-helices 4 and 5 (L4/5) of capsid protein (CA) and the entire SIVmac239 vif gene was previously reported. Although this chimeric virus could grow in cynomolgus monkey cells, it did so much more slowly than did SIVmac. It was also reported that intrinsic TRIM5α restricts the post-entry step of HIV-1 replication in rhesus and cynomolgus monkey cells, and we previously demonstrated that a single amino acid in a loop between α-helices 6 and 7 (L6/7) of HIV type 2 (HIV-2) CA determines the susceptibility of HIV-2 to cynomolgus monkey TRIM5α. Results: In the study presented here, we replaced L6/7 of HIV-1 CA in addition to L4/5 and vif with the corresponding segments of SIVmac. The resultant HIV-1 derivatives showed enhanced replication capability in established T cell lines as well as in CD8+ cell-depleted primary peripheral blood mononuclear cells from cynomolgus monkey. Compared with the wild type HIV-1 particles, the viral particles produced from a chimeric HIV-1 genome with those two SIVmac loops were less able to saturate the intrinsic restriction in rhesus monkey cells. Conclusion: We have succeeded in making the replication of simian-tropic HIV-1 in cynomolgus monkey cells more efficient by introducing into HIV-1 the L6/7 CA loop from SIVmac. It would be of interest to determine whether HIV-1 derivatives with SIVmac CA L4/5 and L6/7 can establish infection of cynomolgus monkeys in vivo.

  18. Genome sequence of Lactobacillus rhamnosus ATCC 8530.

    Pittet, Vanessa; Ewen, Emily; Bushell, Barry R; Ziola, Barry

    2012-02-01

    Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences.

  19. Genome Sequence of Lactobacillus rhamnosus ATCC 8530

    Pittet, Vanessa; Ewen, Emily; Bushell, Barry R.; Ziola, Barry

    2012-01-01

    Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences.

  20. Value of a newly sequenced bacterial genome

    Barbosa, Eudes; Aburjaile, Flavia F; Ramos, Rommel Tj

    2014-01-01

    and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses...... heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting...

  1. Human Genome Sequencing in Health and Disease

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  2. HIV-1 transmission patterns in antiretroviral therapy-naive, HIV-infected North Americans based on phylogenetic analysis by population level and ultra-deep DNA sequencing.

    Lisa L Ross

    Factors that contribute to the transmission of human immunodeficiency virus type 1 (HIV-1), especially drug-resistant HIV-1 variants, remain a significant public health concern. In-depth phylogenetic analyses of viral sequences obtained in the screening phase from antiretroviral-naïve HIV-infected patients seeking enrollment in EPZ108859, a large open-label study in the USA, Canada and Puerto Rico (ClinicalTrials.gov NCT00440947), were examined for insights into the roles of drug resistance and epidemiological factors that could impact disease dissemination. Viral transmission clusters (VTCs) were initially predicted from a phylogenetic analysis of population level HIV-1 pol sequences obtained from 690 antiretroviral-naïve subjects in 2007. Subsequently, the predicted VTCs were tested for robustness by ultra deep sequencing (UDS) using pyrosequencing technology and further phylogenetic analyses. The demographic characteristics of clustered and non-clustered subjects were then compared. From 690 subjects, 69 were assigned to 1 of 30 VTCs, each containing 2 to 5 subjects. Subjects in VTCs were significantly more likely to be white (72% vs. 60%; p = 0.04). VTCs had fewer reverse transcriptase and major PI resistance mutations (9% vs. 24%; p = 0.002) than non-clustered sequences. Both men-who-have-sex-with-men (MSM) (68% vs. 48%; p = 0.001) and Canadians (29% vs. 14%; p = 0.03) were significantly more frequent in VTCs than non-clustered sequences. Of the 515 subjects who initiated antiretroviral therapy, 33 experienced confirmed virologic failure through 144 weeks, of whom only 3 (3/33) were from VTCs. Fewer VTC subjects (as compared to those with non-clustering virus) had HIV-1 with resistance-associated mutations or experienced virologic failure during the course of the study. Our analysis shows specific geographical and drug resistance trends that correlate well with transmission clusters defined by HIV sequences of similarity

  3. Genome Sequencing and Analysis Conference IV

    1993-12-31

    J. Craig Venter and C. Thomas Caskey co-chaired Genome Sequencing and Analysis Conference IV, held at Hilton Head, South Carolina, from September 26-30, 1992. Venter opened the conference by noting that approximately 400 researchers from 16 nations were present, four times as many participants as at Genome Sequencing Conference I in 1989. Venter also introduced the Data Fair, a new component of the conference allowing exchange and on-site computer analysis of unpublished sequence data.

  4. Comparison of 454 Ultra-Deep Sequencing and Allele-Specific Real-Time PCR with Regard to the Detection of Emerging Drug-Resistant Minor HIV-1 Variants after Antiretroviral Prophylaxis for Vertical Transmission.

    Andrea Hauser

    Pregnant HIV-infected women were screened for the development of HIV-1 drug resistance after implementation of a triple-antiretroviral transmission prophylaxis as recommended by the WHO in 2006. The study offered the opportunity to compare amplicon-based 454 ultra-deep sequencing (UDS) and allele-specific real-time PCR (ASPCR) for the detection of drug-resistant minor variants in the HIV-1 reverse transcriptase (RT). Plasma samples from 34 Tanzanian women were previously analysed by ASPCR for key resistance mutations in the viral RT selected by AZT, 3TC, and NVP (K70R, K103N, Y181C, M184V, T215Y/F). In this study, the RT region of the same samples was investigated by amplicon-based UDS for resistance mutations using the 454 GS FLX System. Drug-resistant HIV variants were identified in 69% (20/29) of women by UDS and in 45% (13/29) by ASPCR. The absolute number of resistance mutations identified by UDS was twice that identified by ASPCR (45 vs 24). By UDS, 14 of 24 ASPCR-detected resistance mutations were identified at the same position. The overall concordance between UDS and ASPCR was 61.0% (25/41). The proportions of variants quantified by UDS were approximately 2-3 times lower than by ASPCR. Amplicon generation from samples with viral loads below 20,000 copies/ml failed more frequently by UDS compared to ASPCR (limit of detection = 650 copies/ml), resulting in missing or insufficient sequence coverage. Both methods can provide useful information about drug-resistant minor HIV-1 variants. ASPCR has a higher sensitivity than UDS, but is restricted to single resistance mutations. In contrast, UDS is limited by its requirement for high viral loads to achieve sufficient sequence coverage, but the sequence information reveals the complete resistance patterns within the genomic region analysed. Improvements to the UDS limit of detection are in progress, and UDS could then facilitate monitoring of drug-resistant minor variants in the HIV-1 quasispecies.
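    The detection rates and concordance quoted in this abstract follow directly from the counts it reports. A short sketch re-deriving them is given below; the helper name and dictionary layout are illustrative assumptions, and only the counts come from the abstract.

```python
def percent(numerator: int, denominator: int, digits: int = 0) -> float:
    """Percentage of numerator over denominator, rounded to `digits` decimals."""
    return round(100 * numerator / denominator, digits)


# (count, total, decimals) as quoted in the abstract.
reported_counts = {
    "women with resistant variants by UDS": (20, 29, 0),
    "women with resistant variants by ASPCR": (13, 29, 0),
    "overall UDS/ASPCR concordance": (25, 41, 1),
}

if __name__ == "__main__":
    for label, (count, total, digits) in reported_counts.items():
        print(f"{label}: {percent(count, total, digits)}%")
    # Ratio of mutations found by UDS to those found by ASPCR (45 vs 24).
    print(round(45 / 24, 2))
```

    The 45-to-24 ratio is about 1.9, which the abstract rounds to "twice".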

  5. Genomic sequencing of Pleistocene cave bears

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  6. HIV-1 vaccines

    Excler, Jean-Louis; Robb, Merlin L; Kim, Jerome H

    2014-01-01

    The development of a safe and effective preventive HIV-1 vaccine remains a public health priority. Despite scientific difficulties and disappointing results, HIV-1 vaccine clinical development has, for the first time, established proof-of-concept efficacy against HIV-1 acquisition and identified vaccine-associated immune correlates of risk. The correlate of risk analysis showed that IgG antibodies against the gp120 V2 loop correlated with decreased risk of HIV infection, while Env-specific IgA directly correlated with increased risk. The development of vaccine strategies such as improved envelope proteins formulated with potent adjuvants and DNA and vectors expressing mosaics, or conserved sequences, capable of eliciting greater breadth and depth of potentially relevant immune responses including neutralizing and non-neutralizing antibodies, CD4+ and CD8+ cell-mediated immune responses, mucosal immune responses, and immunological memory, is now proceeding quickly. Additional human efficacy trials combined with other prevention modalities along with sustained funding and international collaboration remain key to bring an HIV-1 vaccine to licensure. PMID:24637946

  7. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Roger Barthelson

    BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly

  8. An integrated genetic data environment (GDE)-based LINUX interface for analysis of HIV-1 and other microbial sequences.

    De Oliveira, T; Miller, R; Tarin, M; Cassol, S

    2003-01-01

    Sequence databases encode a wealth of information needed to develop improved vaccination and treatment strategies for the control of HIV and other important pathogens. To facilitate effective utilization of these datasets, we developed a user-friendly GDE-based LINUX interface that reduces input/output file formatting. GDE was adapted to the Linux operating system, bioinformatics tools were integrated with microbe-specific databases, and up-to-date GDE menus were developed for several clinically important viral, bacterial and parasitic genomes. Each microbial interface was designed for local access and contains Genbank, BLAST-formatted and phylogenetic databases. GDE-Linux is available for research purposes by direct application to the corresponding author. Application-specific menus and support files can be downloaded from (http://www.bioafrica.net).

  9. The characterization of twenty sequenced human genomes.

    Kimberly Pelak

    2010-09-01

    Full Text Available We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways.

  10. The allosteric HIV-1 integrase inhibitor BI-D affects virion maturation but does not influence packaging of a functional RNA genome.

    Nikki van Bel

    Full Text Available The viral integrase (IN) is an essential protein for HIV-1 replication. IN inserts the viral dsDNA into the host chromosome, thereby aided by the cellular co-factor LEDGF/p75. Recently a new class of integrase inhibitors was described: allosteric IN inhibitors (ALLINIs). Although designed to interfere with the IN-LEDGF/p75 interaction to block HIV DNA integration during the early phase of HIV-1 replication, the major impact was surprisingly found on the process of virus maturation during the late phase, causing a reverse transcription defect upon infection of target cells. Virus particles produced in the presence of an ALLINI are misformed with the ribonucleoprotein located outside the virus core. Virus assembly and maturation are highly orchestrated and regulated processes in which several viral proteins and RNA molecules closely interact. It is therefore of interest to study whether ALLINIs have unpredicted pleiotropic effects on these RNA-related processes. We confirm that the ALLINI BI-D inhibits virus replication and that the produced virus is non-infectious. Furthermore, we show that the wild-type level of HIV-1 genomic RNA is packaged in virions and these genomes are in a dimeric state. The tRNAlys3 primer for reverse transcription was properly placed on this genomic RNA and could be extended ex vivo. In addition, the packaged reverse transcriptase enzyme was fully active when extracted from virions. As the RNA and enzyme components for reverse transcription are properly present in virions produced in the presence of BI-D, the inhibition of reverse transcription is likely to reflect the mislocalization of the components in the aberrant virus particle.

  11. The allosteric HIV-1 integrase inhibitor BI-D affects virion maturation but does not influence packaging of a functional RNA genome.

    van Bel, Nikki; van der Velden, Yme; Bonnard, Damien; Le Rouzic, Erwann; Das, Atze T; Benarous, Richard; Berkhout, Ben

    2014-01-01

    The viral integrase (IN) is an essential protein for HIV-1 replication. IN inserts the viral dsDNA into the host chromosome, thereby aided by the cellular co-factor LEDGF/p75. Recently a new class of integrase inhibitors was described: allosteric IN inhibitors (ALLINIs). Although designed to interfere with the IN-LEDGF/p75 interaction to block HIV DNA integration during the early phase of HIV-1 replication, the major impact was surprisingly found on the process of virus maturation during the late phase, causing a reverse transcription defect upon infection of target cells. Virus particles produced in the presence of an ALLINI are misformed with the ribonucleoprotein located outside the virus core. Virus assembly and maturation are highly orchestrated and regulated processes in which several viral proteins and RNA molecules closely interact. It is therefore of interest to study whether ALLINIs have unpredicted pleiotropic effects on these RNA-related processes. We confirm that the ALLINI BI-D inhibits virus replication and that the produced virus is non-infectious. Furthermore, we show that the wild-type level of HIV-1 genomic RNA is packaged in virions and these genomes are in a dimeric state. The tRNAlys3 primer for reverse transcription was properly placed on this genomic RNA and could be extended ex vivo. In addition, the packaged reverse transcriptase enzyme was fully active when extracted from virions. As the RNA and enzyme components for reverse transcription are properly present in virions produced in the presence of BI-D, the inhibition of reverse transcription is likely to reflect the mislocalization of the components in the aberrant virus particle.

  12. Sequence-specific activation of the DNA sensor cGAS by Y-form DNA structures as found in primary HIV-1 cDNA.

    Herzner, Anna-Maria; Hagmann, Cristina Amparo; Goldeck, Marion; Wolter, Steven; Kübler, Kirsten; Wittmann, Sabine; Gramberg, Thomas; Andreeva, Liudmila; Hopfner, Karl-Peter; Mertens, Christina; Zillinger, Thomas; Jin, Tengchuan; Xiao, Tsan Sam; Bartok, Eva; Coch, Christoph; Ackermann, Damian; Hornung, Veit; Ludwig, Janos; Barchet, Winfried; Hartmann, Gunther; Schlee, Martin

    2015-10-01

    Cytosolic DNA that emerges during infection with a retrovirus or DNA virus triggers antiviral type I interferon responses. So far, only double-stranded DNA (dsDNA) over 40 base pairs (bp) in length has been considered immunostimulatory. Here we found that unpaired DNA nucleotides flanking short base-paired DNA stretches, as in stem-loop structures of single-stranded DNA (ssDNA) derived from human immunodeficiency virus type 1 (HIV-1), activated the type I interferon-inducing DNA sensor cGAS in a sequence-dependent manner. DNA structures containing unpaired guanosines flanking short (12- to 20-bp) dsDNA (Y-form DNA) were highly stimulatory and specifically enhanced the enzymatic activity of cGAS. Furthermore, we found that primary HIV-1 reverse transcripts represented the predominant viral cytosolic DNA species during early infection of macrophages and that these ssDNAs were highly immunostimulatory. Collectively, our study identifies unpaired guanosines in Y-form DNA as a highly active, minimal cGAS recognition motif that enables detection of HIV-1 ssDNA.

  13. Multiple Genome Sequences of Lactobacillus plantarum Strains

    Kafka, Thomas A.; Geissler, Andreas J.; Vogel, Rudi F.

    2017-01-01

    We report here the genome sequences of four Lactobacillus plantarum strains which vary in surface hydrophobicity. Bioinformatic analysis, using additional genomes of Lactobacillus plantarum strains, revealed a possible correlation between the cell wall teichoic acid type and cell surface hydrophobicity and provides the basis for subsequent analyses.

  14. Complete Genome Sequence of Staphylococcus epidermidis 1457.

    Galac, Madeline R; Stam, Jason; Maybank, Rosslyn; Hinkle, Mary; Mack, Dietrich; Rohde, Holger; Roth, Amanda L; Fey, Paul D

    2017-06-01

    Staphylococcus epidermidis 1457 is a frequently utilized strain that is amenable to genetic manipulation and has been widely used for biofilm-related research. We report here the whole-genome sequence of this strain, which encodes 2,277 protein-coding genes and 81 RNAs within its 2.4-Mb genome and plasmid. Copyright © 2017 Galac et al.

  15. Comparison of 61 Sequenced Escherichia coli Genomes

    Lukjancenko, Oksana; Wassenaar, T. M.; Ussery, David

    2010-01-01

    Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publicly available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics...

  16. Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria

    Larsen, Mette Voldby; Cosentino, Salvatore; Rasmussen, Simon

    2012-01-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS...

  17. Sequencing and comparing whole mitochondrial genomes of animals

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  18. [Complete genome sequencing and sequence analysis of BCG Tice].

    Wang, Zhiming; Pan, Yuanlong; Wu, Jun; Zhu, Baoli

    2012-10-04

    The objective of this study is to obtain the complete genome sequence of Bacillus Calmette-Guerin Tice (BCG Tice), in order to provide more information about the molecular biology of BCG Tice and to design more reasonable vaccines to prevent tuberculosis. We assembled the high-throughput sequencing data with SOAPdenovo software, obtaining many contigs and scaffolds. Many sequence gaps and physical gaps remained as a result of regional low coverage and low quality. We designed primers at the ends of contigs and performed PCR amplification in order to link these contigs and scaffolds. By using various enzymes for PCR amplification, adjusting the PCR reaction conditions, and combining these with clone construction for sequencing, all the gaps were closed. We obtained the complete genome sequence of BCG Tice and submitted it to GenBank of the National Center for Biotechnology Information (NCBI). The genome of BCG Tice is 4334064 base pairs in length, with a GC content of 65.65%. The problems and strategies during the finishing step of BCG Tice sequencing are described here, with the hope of offering some experience to those who are involved in the finishing step of genome sequencing. The microarray data were verified by our results.

  19. Dynamics of HIV-1 RNA Near the Plasma Membrane during Virus Assembly.

    Sardo, Luca; Hatch, Steven C; Chen, Jianbo; Nikolaitchik, Olga; Burdick, Ryan C; Chen, De; Westlake, Christopher J; Lockett, Stephen; Pathak, Vinay K; Hu, Wei-Shau

    2015-11-01

    To increase our understanding of the events that lead to HIV-1 genome packaging, we examined the dynamics of viral RNA and Gag-RNA interactions near the plasma membrane by using total internal reflection fluorescence microscopy. We labeled HIV-1 RNA with a photoconvertible Eos protein via an RNA-binding protein that recognizes stem-loop sequences engineered into the viral genome. Near-UV light exposure causes an irreversible structural change in Eos and alters its emitted fluorescence from green to red. We studied the dynamics of HIV-1 RNA by photoconverting Eos near the plasma membrane, and we monitored the population of photoconverted red-Eos-labeled RNA signals over time. We found that in the absence of Gag, most of the HIV-1 RNAs stayed near the plasma membrane transiently, for a few minutes. The presence of Gag significantly increased the time that RNAs stayed near the plasma membrane: most of the RNAs were still detected after 30 min. We then quantified the proportion of HIV-1 RNAs near the plasma membrane that were packaged into assembling viral complexes. By tagging Gag with blue fluorescent protein, we observed that only a portion, ∼13 to 34%, of the HIV-1 RNAs that reached the membrane were recruited into assembling particles in an hour, and the frequency of HIV-1 RNA packaging varied with the Gag expression level. Our studies reveal the HIV-1 RNA dynamics on the plasma membrane and the efficiency of RNA recruitment and provide insights into the events leading to the generation of infectious HIV-1 virions. Nascent HIV-1 particles assemble on plasma membranes. During the assembly process, HIV-1 RNA genomes must be encapsidated into viral complexes to generate infectious particles. To gain insights into the RNA packaging and virus assembly mechanisms, we labeled and monitored the HIV-1 RNA signals near the plasma membrane. Our results showed that most of the HIV-1 RNAs stayed near the plasma membrane for only a few minutes in the absence of Gag, whereas

  20. Cyprinus carpio Genome sequencing and assembly

    Kolder, I.C.R.M.; Plas-Duivesteijn, van der Suzanne J.; Tan, G.; Wiegertjes, G.; Forlenza, M.; Guler, A.T.; Travin, D.Y.; Nakao, M.; Moritomo, T.; Irnazarow, I.; Jansen, H.J.

    2013-01-01

    Sequencing of the common carp (Cyprinus carpio carpio Linnaeus, 1758) genome, with the objective of establishing carp as a model organism to supplement the closely related zebrafish (Danio rerio). The sequenced individual is a homozygous female (by gynogenesis) of R3 x R8 carp, the heterozygous

  1. Harnessing Whole Genome Sequencing in Medical Mycology.

    Cuomo, Christina A

    2017-01-01

    Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.

  2. 10KP: A phylodiverse genome sequencing plan

    Cheng, Shifeng; Melkonian, Michael; Brockington, Samuel; Archibald, John M; Delaux, Pierre-Marc; Melkonian, Barbara; Mavrodiev, Evgeny V; Sun, Wenjing; Fu, Yuan; Yang, Huanming; Soltis, Douglas E; Graham, Sean W; Soltis, Pamela S; Liu, Xin; Xu, Xun

    2018-01-01

    Understanding plant evolution and diversity in a phylogenomic context is an enormous challenge due, in part, to limited availability of genome-scale data across phylodiverse species. The 10KP (10,000 Plants) Genome Sequencing Project will sequence and characterize representative genomes from every major clade of embryophytes, green algae, and protists (excluding fungi) within the next 5 years. By implementing and continuously improving leading-edge sequencing technologies and bioinformatics tools, 10KP will catalogue the genome content of plant and protist diversity and make these data freely available as an enduring foundation for future scientific discoveries and applications. 10KP is structured as an international consortium, open to the global community, including botanical gardens, plant research institutes, universities, and private industry. Our immediate goal is to establish a policy framework for this endeavor, the principles of which are outlined here. PMID:29618049

  3. 10KP: A phylodiverse genome sequencing plan.

    Cheng, Shifeng; Melkonian, Michael; Smith, Stephen A; Brockington, Samuel; Archibald, John M; Delaux, Pierre-Marc; Li, Fay-Wei; Melkonian, Barbara; Mavrodiev, Evgeny V; Sun, Wenjing; Fu, Yuan; Yang, Huanming; Soltis, Douglas E; Graham, Sean W; Soltis, Pamela S; Liu, Xin; Xu, Xun; Wong, Gane Ka-Shu

    2018-03-01

    Understanding plant evolution and diversity in a phylogenomic context is an enormous challenge due, in part, to limited availability of genome-scale data across phylodiverse species. The 10KP (10,000 Plants) Genome Sequencing Project will sequence and characterize representative genomes from every major clade of embryophytes, green algae, and protists (excluding fungi) within the next 5 years. By implementing and continuously improving leading-edge sequencing technologies and bioinformatics tools, 10KP will catalogue the genome content of plant and protist diversity and make these data freely available as an enduring foundation for future scientific discoveries and applications. 10KP is structured as an international consortium, open to the global community, including botanical gardens, plant research institutes, universities, and private industry. Our immediate goal is to establish a policy framework for this endeavor, the principles of which are outlined here.

  4. Deep whole-genome sequencing of 90 Han Chinese genomes.

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000

  5. Frequent intra-subtype recombination among HIV-1 circulating in Tanzania.

    Ireen E Kiwelu

    Full Text Available The study estimated the prevalence of HIV-1 intra-subtype recombinant variants among female bar and hotel workers in Tanzania. While intra-subtype recombination occurs in HIV-1, it is generally underestimated. HIV-1 env gp120 V1-C5 quasispecies from 45 subjects were generated by single-genome amplification and sequencing (median (IQR) of 38 (28-50) sequences per subject). Recombination analysis was performed using seven methods implemented within the recombination detection program version 3, RDP3. HIV-1 sequences were considered recombinant if recombination signals were detected by at least three methods with p-values of ≤0.05 after Bonferroni correction for multiple comparisons. HIV-1 in 38 (84%) subjects showed evidence for intra-subtype recombination including 22 with HIV-1 subtype A1, 13 with HIV-1 subtype C, and 3 with HIV-1 subtype D. The distribution of intra-patient recombination breakpoints suggested ongoing recombination and showed selective enrichment of recombinant variants in 23 (60%) subjects. The number of subjects with evidence of intra-subtype recombination increased from 29 (69%) to 36 (82%) over one year of follow-up, although the increase did not reach statistical significance. Adjustment for intra-subtype recombination is important for the analysis of multiplicity of HIV infection. This is the first report of high prevalence of intra-subtype recombination in the HIV/AIDS epidemic in Tanzania, a region where multiple HIV-1 subtypes co-circulate. HIV-1 intra-subtype recombination increases viral diversity and presents additional challenges for HIV-1 vaccine design.
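
    The calling rule stated in this abstract (a signal from at least three methods with p ≤ 0.05 after Bonferroni correction) can be sketched roughly as follows. This is only an illustration of the rule, not RDP3 itself: the method labels, the function name `is_recombinant`, and the way the number of comparisons is supplied are all assumptions made for the example.

```python
from typing import Dict, Optional


def is_recombinant(
    raw_p_values: Dict[str, Optional[float]],
    n_comparisons: int,
    alpha: float = 0.05,
    min_methods: int = 3,
) -> bool:
    """Apply an 'at least min_methods significant after Bonferroni' rule.

    raw_p_values maps a (hypothetical) method label to its uncorrected
    p-value, or None when that method reported no signal at all.
    """
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")

    supporting = 0
    for p in raw_p_values.values():
        if p is None:
            continue  # no signal from this method
        corrected = min(1.0, p * n_comparisons)  # Bonferroni correction
        if corrected <= alpha:
            supporting += 1
    return supporting >= min_methods


# Hypothetical per-method results for one sequence.
example = {"m1": 0.001, "m2": 0.002, "m3": 0.004, "m4": 0.5, "m5": None}
print(is_recombinant(example, n_comparisons=10))
print(is_recombinant(example, n_comparisons=20))
```

    With ten comparisons, three of the hypothetical methods stay below the threshold and the sequence is called recombinant; with twenty comparisons only two do, so it is not.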

  6. Effects of sequence changes in the HIV-1 gp41 fusion peptide on CCR5 inhibitor resistance

    Anastassopoulou, Cleo G.; Ketas, Thomas J.; Sanders, Rogier W.; Johan Klasse, Per; Moore, John P.

    2012-01-01

    A rare pathway of HIV-1 resistance to small molecule CCR5 inhibitors such as Vicriviroc (VCV) involves changes solely in the gp41 fusion peptide (FP). Here, we show that the G516V change is critical to VCV resistance in PBMC and TZM-bl cells, although it must be accompanied by either M518V or F519I to have a substantial impact. Modeling VCV inhibition data from the two cell types indicated that G516V allows both double mutants to use VCV-CCR5 complexes for entry. The model further identified F519I as an independent determinant of preference for the unoccupied, high-VCV affinity form of CCR5. From inhibitor-free reversion cultures, we also identified a substitution in the inner domain of gp120, T244A, which appears to counter the resistance phenotype created by the FP substitutions. Examining the interplay of these changes will enhance our understanding of Env complex interactions that influence both HIV-1 entry and resistance to CCR5 inhibitors.

  7. Genome Sequence of the Palaeopolyploid soybean

    Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

    2009-08-03

    Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

  8. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  9. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls....... In a dairy data set, predictions using BayesRC and imputed sequence data from 1000 Bull Genomes were 2% more accurate than with 800k data. We could demonstrate the method identified causal mutations in some cases. Further improvements will come from more accurate imputation of sequence variant genotypes...

  10. Protecting genomic sequence anonymity with generalization lattices.

    Malin, B A

    2005-01-01

    Current genomic privacy technologies assume the identity of genomic sequence data is protected if personal information, such as demographics, is obscured, removed, or encrypted. While demographic features can directly compromise an individual's identity, recent research demonstrates such protections are insufficient because sequence data itself is susceptible to re-identification. To counteract this problem, we introduce an algorithm for anonymizing a collection of person-specific DNA sequences. The technique is termed DNA lattice anonymization (DNALA), and is based upon the formal privacy protection schema of k-anonymity. Under this model, it is impossible to observe or learn features that distinguish one genetic sequence from k-1 other entries in a collection. To maximize information retained in protected sequences, we incorporate a concept generalization lattice to learn the distance between two residues in a single nucleotide region. The lattice provides the most similar generalized concept for two residues (e.g. adenine and guanine are both purines). The method is tested and evaluated with several publicly available human population datasets ranging in size from 30 to 400 sequences. Our findings imply the anonymization schema is feasible for the protection of sequence privacy. The DNALA method is the first computational disclosure control technique for general DNA sequences. Given the computational nature of the method, guarantees of anonymity can be formally proven. There is room for improvement and validation, though this research provides the groundwork from which future researchers can construct genomics anonymization schemas tailored to specific data-sharing scenarios.
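
    The generalization-lattice idea can be illustrated with a small sketch. The abstract does not give the actual DNALA lattice or distance measure, so the three-level lattice below (base, then purine R or pyrimidine Y, then any base N), the step-count distance, and the function names are assumptions chosen for the example; it only shows how two aligned sequences could be generalized into one shared, indistinguishable form (the k = 2 case).

```python
# Assumed lattice: each symbol points to its more general parent.
# A, G -> R (purine); C, T -> Y (pyrimidine); R, Y -> N (any base).
PARENT = {"A": "R", "G": "R", "C": "Y", "T": "Y", "R": "N", "Y": "N", "N": None}


def ancestors(symbol):
    """Return the chain from a symbol up to the top of the lattice."""
    chain = [symbol]
    while PARENT[chain[-1]] is not None:
        chain.append(PARENT[chain[-1]])
    return chain


def generalize(a, b):
    """Return the most specific common generalization of two residues
    and the total number of lattice steps needed to reach it."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    for steps_a, symbol in enumerate(chain_a):
        if symbol in chain_b:
            return symbol, steps_a + chain_b.index(symbol)
    raise ValueError("symbols share no common generalization")


def generalize_pair(seq1, seq2):
    """Generalize two aligned, equal-length sequences position by position."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    symbols, distance = [], 0
    for a, b in zip(seq1, seq2):
        symbol, steps = generalize(a, b)
        symbols.append(symbol)
        distance += steps
    return "".join(symbols), distance


print(generalize_pair("ACGT", "GCTT"))
```

    In this toy setting, pairing sequences that need the fewest generalization steps retains the most information, which is the role the abstract assigns to the lattice distance.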

  11. Genetic architecture of HIV-1 genes circulating in north India & their functional implications.

    Neogi, Ujjwal; Sood, Vikas; Ronsard, Larence; Singh, Jyotsna; Lata, Sneh; Ramachandran, V G; Das, S; Wanchu, Ajay; Banerjea, Akhil C

    2011-12-01

    This review presents data on genetic and functional analysis of some of the HIV-1 genes derived from HIV-1 infected individuals from north India (Delhi, Punjab and Chandigarh). We found evidence of novel B/C recombinants in the HIV-1 LTR region showing relatedness to China/Myanmar with 3 copies of NF-κB sites; B/C/D mosaic genomes for HIV-1 Vpr and novel B/C Tat. We reported the appearance of a complex recombinant form, CRF02_AG, of HIV-1 envelope sequences, which is predominantly found in Central/Western Africa. Also one Indian HIV-1 envelope subtype C sequence suggested exclusive CXCR4 co-receptor usage. This extensive recombination, which is observed in the Vpr genes of about 10 per cent of HIV-1 infected individuals, resulted in remarkably altered functions when compared with prototype subtype B Vpr. The Vpu C was found to be more potent in causing apoptosis when compared with Vpu B when analyzed for subG1 DNA content. The functional implications of these changes as well as in other genes of HIV-1 are discussed in detail with possible implications for subtype-specific pathogenesis highlighted.

  12. Complete Genome Sequences of 44 Arthrobacter Phages.

    Klyczek, Karen K; Jacobs-Sera, Deborah; Adair, Tamarah L; Adams, Sandra D; Ball, Sarah L; Benjamin, Robert C; Bonilla, J Alfred; Breitenberger, Caroline A; Daniels, Charles J; Gaffney, Bobby L; Harrison, Melinda; Hughes, Lee E; King, Rodney A; Krukonis, Gregory P; Lopez, A Javier; Monsen-Collar, Kirsten; Pizzorno, Marie C; Rinehart, Claire A; Staples, Amanda K; Stowe, Emily L; Garlena, Rebecca A; Russell, Daniel A; Cresawn, Steven G; Pope, Welkin H; Hatfull, Graham F

    2018-02-01

    We report here the complete genome sequences of 44 phages infecting Arthrobacter sp. strain ATCC 21022. These phages have double-stranded DNA genomes with sizes ranging from 15,680 to 70,707 bp and G+C contents from 45.1% to 68.5%. All three tail types (belonging to the families Siphoviridae, Myoviridae, and Podoviridae) are represented. Copyright © 2018 Klyczek et al.

  13. Microbial species delineation using whole genome sequences.

    Varghese, Neha J; Mukherjee, Supratim; Ivanova, Natalia; Konstantinidis, Konstantinos T; Mavrommatis, Kostas; Kyrpides, Nikos C; Pati, Amrita

    2015-08-18

    Increased sequencing of microbial genomes has revealed that prevailing prokaryotic species assignments can be inconsistent with whole genome information for a significant number of species. The long-standing need for a systematic and scalable species assignment technique can be met by the genome-wide Average Nucleotide Identity (gANI) metric, which is widely acknowledged as a robust measure of genomic relatedness. In this work, we demonstrate that the combination of gANI and the alignment fraction (AF) between two genomes accurately reflects their genomic relatedness. We introduce an efficient implementation of AF,gANI and discuss its successful application to 86.5M genome pairs between 13,151 prokaryotic genomes assigned to 3032 species. Subsequently, by comparing the genome clusters obtained from complete linkage clustering of these pairs to existing taxonomy, we observed that nearly 18% of all prokaryotic species suffer from anomalies in species definition. Our results can be used to explore central questions such as whether microorganisms form a continuum of genetic diversity or distinct species represented by distinct genetic signatures. We propose that this precise and objective AF,gANI-based species definition: the MiSI (Microbial Species Identifier) method, be used to address previous inconsistencies in species classification and as the primary guide for new taxonomic species assignment, supplemented by the traditional polyphasic approach, as required. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
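
    As a rough illustration of how the two quantities named in this abstract might be computed for one genome pair, the sketch below assumes a commonly used formulation: AF is the summed length of genes with a bidirectional best hit (BBH) divided by the summed length of all genes in the query genome, and gANI is the alignment-length-weighted identity of those hits divided by the summed BBH gene length. The abstract itself gives no formulas, so these definitions, the `Hit` record, and the function name are assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Hit:
    """One bidirectional best hit for a gene of the query genome."""
    identity: float      # fractional identity of the alignment, 0..1
    aligned_length: int  # aligned length in nucleotides
    gene_length: int     # full length of the query gene


def af_gani(hits: Iterable[Hit], total_gene_length: int) -> Tuple[float, float]:
    """Return (AF, gANI) for one direction of a genome pair."""
    if total_gene_length <= 0:
        raise ValueError("total_gene_length must be positive")
    hits = list(hits)
    bbh_length = sum(h.gene_length for h in hits)
    af = bbh_length / total_gene_length
    if bbh_length == 0:
        return af, 0.0  # no shared genes, identity is undefined; report 0
    weighted = sum(h.identity * h.aligned_length for h in hits)
    return af, weighted / bbh_length


# Hypothetical pair: two genes with hits, out of 4000 nt of genes in total.
af, gani = af_gani([Hit(0.98, 900, 1000), Hit(0.96, 1000, 1000)], 4000)
print(round(af, 3), round(gani, 3))
```

    Pairs would then be grouped by thresholds on both values, with the clustering step (complete linkage in the abstract) applied to the resulting pairwise table.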

  14. Genomic Sequence Variation Markup Language (GSVML).

    Nakaya, Jun; Kimura, Michio; Hiroi, Kaei; Ido, Keisuke; Yang, Woosung; Tanaka, Hiroshi

    2010-02-01

    With the aim of making good use of internationally accumulated genomic sequence variation data, which is increasing rapidly due to the explosive growth of genomic research, the development of an interoperable data exchange format and its international standardization are necessary. Genomic Sequence Variation Markup Language (GSVML) will focus on genomic sequence variation data and human health applications, such as gene-based medicine or pharmacogenomics. We developed GSVML through eight steps, based on case analysis and domain investigations. By restricting the design scope to human health applications and genomic sequence variation, we attempted to eliminate ambiguity and to ensure practicability. We intended to satisfy the requirements derived from the use case analysis of human-based clinical genomic applications. Based on database investigations, we attempted to minimize the redundancy of the data format while maximizing the data coverage. We also attempted to ensure communication and interface ability with other Markup Languages, for exchange of omics data among various omics researchers or facilities. The interface ability with developing clinical standards, such as the Health Level Seven Genotype Information model, was analyzed. We developed the human health-oriented GSVML comprising variation data, direct annotation, and indirect annotation categories; the variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information, and have internal relationships. For designing, we examined 6 cases for three criteria as human health application and 15 data elements for three criteria as data formats for genomic sequence variation data exchange. The data format of five international SNP databases and six Markup Languages and the interface ability to the Health Level Seven Genotype Model in terms of 317 items were investigated. GSVML was developed as

  15. Complete genome sequence of Ikoma lyssavirus.

    Marston, Denise A; Ellis, Richard J; Horton, Daniel L; Kuzmin, Ivan V; Wise, Emma L; McElhinney, Lorraine M; Banyard, Ashley C; Ngeleja, Chanasa; Keyyu, Julius; Cleaveland, Sarah; Lembo, Tiziana; Rupprecht, Charles E; Fooks, Anthony R

    2012-09-01

    Lyssaviruses (family Rhabdoviridae) constitute one of the most important groups of viral zoonoses globally. All lyssaviruses cause the disease rabies, an acute progressive encephalitis for which, once symptoms occur, there is no effective cure. Currently available vaccines are highly protective against the predominantly circulating lyssavirus species. Using next-generation sequencing technologies, we have obtained the whole-genome sequence for a novel lyssavirus, Ikoma lyssavirus (IKOV), isolated from an African civet in Tanzania displaying clinical signs of rabies. Genetically, this virus is the most divergent within the genus Lyssavirus. Characterization of the genome will help to improve our understanding of lyssavirus diversity and enable investigation into vaccine-induced immunity and protection.

  16. A genome-wide analysis of lentivector integration sites using targeted sequence capture and next generation sequencing technology.

    Ustek, Duran; Sirma, Sema; Gumus, Ergun; Arikan, Muzaffer; Cakiris, Aris; Abaci, Neslihan; Mathew, Jaicy; Emrence, Zeliha; Azakli, Hulya; Cosan, Fulya; Cakar, Atilla; Parlak, Mahmut; Kursun, Olcay

    2012-10-01

    One application of next-generation sequencing (NGS) is the targeted resequencing of genes of interest, which has not been used in viral integration site analysis of gene therapy applications. Here, we combined a targeted sequence capture array and next-generation sequencing to address the whole-genome profiling of viral integration sites. Human 293T and K562 cells were transduced with an HIV-1-derived vector. A custom-made DNA probe set targeting the pLVTHM vector was used to capture lentiviral vector/human genome junctions. The captured DNA was sequenced using the GS FLX platform. Seven thousand four hundred and eighty-four human genome sequences flanking the long terminal repeats (LTR) of pLVTHM fragment sequences matched with an identity of at least 98% and a minimum length of 50 bp in both cells. In total, 203 unique integration sites were identified. The integrations in both cell lines were distant from the CpG islands and from the transcription start sites and were preferentially located in introns. A comparison between the two cell lines showed that the lentiviral-transduced DNA does not have the same preferred regions in the two different cell lines. Copyright © 2012 Elsevier B.V. All rights reserved.
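
    The matching criteria in the abstract above (at least 98% identity over at least 50 bp) amount to a simple filter on alignment hits. The following is a minimal sketch under stated assumptions: the `Hit` record, its field names, and the example values are hypothetical and are not taken from the study's pipeline.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical alignment record for one flanking sequence."""
    read_id: str
    identity: float  # percent identity to the human genome
    length: int      # aligned length in bp


def passes(hit: Hit, min_identity: float = 98.0, min_length: int = 50) -> bool:
    """Apply the two thresholds quoted in the abstract."""
    return hit.identity >= min_identity and hit.length >= min_length


# Made-up example hits, chosen to exercise both thresholds.
hits = [
    Hit("r1", 99.2, 120),  # passes both criteria
    Hit("r2", 97.5, 200),  # fails on identity
    Hit("r3", 98.0, 49),   # fails on length
    Hit("r4", 98.0, 50),   # passes exactly at both limits
]
kept = [h.read_id for h in hits if passes(h)]
print(kept)
```

    Both comparisons are inclusive here because the abstract says "at least" and "minimum"; a pipeline using strict inequalities would drop the boundary case.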

  17. Complete Genome Sequence of Ikoma Lyssavirus

    Marston, Denise A.; Ellis, Richard J.; Horton, Daniel L.; Kuzmin, Ivan V.; Wise, Emma L.; McElhinney, Lorraine M.; Banyard, Ashley C.; Ngeleja, Chanasa; Keyyu, Julius; Cleaveland, Sarah; Lembo, Tiziana; Rupprecht, Charles E.; Fooks, Anthony R.

    2012-01-01

    Lyssaviruses (family Rhabdoviridae) constitute one of the most important groups of viral zoonoses globally. All lyssaviruses cause the disease rabies, an acute progressive encephalitis for which, once symptoms occur, there is no effective cure. Currently available vaccines are highly protective against the predominantly circulating lyssavirus species. Using next-generation sequencing technologies, we have obtained the whole-genome sequence for a novel lyssavirus, Ikoma lyssavirus (IKOV), isol...

  18. Whole genome sequencing and bioinformatics analysis of two Egyptian genomes.

    ElHefnawi, Mahmoud; Jeon, Sungwon; Bhak, Youngjune; ElFiky, Asmaa; Horaiz, Ahmed; Jun, JeHoon; Kim, Hyunho; Bhak, Jong

    2018-05-15

    We report two Egyptian male genomes (EGP1 and EGP2) sequenced at ~30× sequencing depth. EGP1 had 4.7 million variants, of which 198,877 were novel, while EGP2 had 209,109 novel variants out of 4.8 million variants. The mitochondrial haplogroups of the two individuals were identified as H7b1 and L2a1c, respectively. We also identified the Y haplogroup of EGP1 (R1b) and EGP2 (J1a2a1a2 > P58 > FGC11). EGP1 had a mutation in the NADH gene of the mitochondrial genome ND4 (m.11778 G > A) that causes Leber's hereditary optic neuropathy. Some SNPs shared by the two genomes were associated with an increased level of cholesterol and triglycerides, probably related to obesity in Egyptians. Comparison of these genomes with African and Western-Asian genomes can provide insights on Egyptian ancestry and genetic history. This resource can be used to further understand genomic diversity and functional classification of variants as well as human migration and evolution across Africa and Western-Asia. Copyright © 2017. Published by Elsevier B.V.

  19. Near Full-Length Identification of a Novel HIV-1 CRF01_AE/B/C Recombinant in Northern Myanmar.

    Zhou, Yan-Heng; Chen, Xin; Liang, Yue-Bo; Pang, Wei; Qin, Wei-Hong; Zhang, Chiyu; Zheng, Yong-Tang

    2015-08-01

    The Myanmar-China border appears to be the "hot spot" region for the occurrence of HIV-1 recombination. The majority of the previous analyses of HIV-1 recombination were based on partial genomic sequences, which cannot fully reflect the genetic diversity of HIV-1 in this area. Here, we present a near full-length characterization of a novel HIV-1 CRF01_AE/B/C recombinant isolated from a long-distance truck driver in Northern Myanmar. It is the first description of a near full-length genomic sequence in Myanmar since 2003, and might be one of the most complicated HIV-1 chimeras ever detected in Myanmar, containing four CRF01_AE segments, six B segments, and five C segments separated by 14 breakpoints throughout its genome. The discovery and characterization of this new CRF01_AE/B/C recombinant indicate that intersubtype recombination is ongoing in Myanmar, continuously generating new forms of HIV-1. More work based on near full-length sequence analyses is urgently needed to better understand the genetic diversity of HIV-1 in these regions.

  20. Genome sequencing for obstetricians & gynaecologists | Kent ...

    The medical profession has been waiting for a decade to be invigorated by the sequencing of the human genome, arguably the greatest scientific project ever. The technology has been spectacular but the results of the project have yielded more unexpected results than definitive answers – many about the very nature of our ...

  1. Genome shotgun sequencing and development of microsatellite ...

    Analysis of the gerbera genome DNA ('Raon') general library showed that sequences of (AT), (AG), (AAG) and (AAT) repeats appeared most often, whereas (AC), (AAC) and (ACC) were the least frequent. Primer pairs were designed for 80 loci. Only eight primer pairs produced reproducible polymorphic bands in the 28 ...

  2. Whole-genome sequencing of veterinary pathogens

    Ronco, Troels

    -electrophoresis and single-locus sequencing has been widely used to characterize such types of veterinary pathogens. However, DNA sequencing techniques have become fast and cost effective in recent years and whole-genome sequencing data provide a much higher discriminative power and reproducibility than any...... genetic background. This indicates that dairy cows can be natural carriers of S. aureus subtypes that in certain cases lead to CM. A group of isolates that mostly belonged to ST151 carried three pathogenicity islands that were primarily found in this group. The prevalence of resistance genes was generally...

  3. HIV-1 transmission linkage in an HIV-1 prevention clinical trial

    Leitner, Thomas [Los Alamos National Laboratory]; Campbell, Mary S [UNIV OF WASHINGTON]; Mullins, James I [UNIV OF WASHINGTON]; Hughes, James P [UNIV OF WASHINGTON]; Wong, Kim G [UNIV OF WASHINGTON]; Raugi, Dana N [UNIV OF WASHINGTON]; Sorensen, Stefanie [UNIV OF WASHINGTON]

    2009-01-01

    HIV-1 sequencing has been used extensively in epidemiologic and forensic studies to investigate patterns of HIV-1 transmission. However, the criteria for establishing genetic linkage between HIV-1 strains in HIV-1 prevention trials have not been formalized. The Partners in Prevention HSV/HIV Transmission Study (ClinicalTrials.gov NCT00194519) enrolled 3408 HIV-1 serodiscordant heterosexual African couples to determine the efficacy of genital herpes suppression with acyclovir in reducing HIV-1 transmission. The trial analysis required laboratory confirmation of HIV-1 linkage between enrolled partners in couples in which seroconversion occurred. Here we describe the process and results from HIV-1 sequencing studies used to perform transmission linkage determination in this clinical trial. Consensus Sanger sequencing of env (C2-V3-C3) and gag (p17-p24) genes was performed on plasma HIV-1 RNA from both partners within 3 months of seroconversion; env single molecule or pyrosequencing was also performed in some cases. For linkage, we required monophyletic clustering between HIV-1 sequences in the transmitting and seroconverting partners, and developed a Bayesian algorithm using genetic distances to evaluate the posterior probability of linkage of participants' sequences. Adjudicators classified transmissions as linked, unlinked, or indeterminate. Among 151 seroconversion events, we found 108 (71.5%) linked, 40 (26.5%) unlinked, and 3 (2.0%) to have indeterminate transmissions. Nine (8.3%) were linked by consensus gag sequencing only and 8 (7.4%) required deep sequencing of env. In this first use of HIV-1 sequencing to establish endpoints in a large clinical trial, more than one-fourth of transmissions were unlinked to the enrolled partner, illustrating the relevance of these methods in the design of future HIV-1 prevention trials in serodiscordant couples. A hierarchy of sequencing techniques, analysis methods, and expert adjudication contributed to the linkage
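
    The proportions reported in the abstract above can be reproduced from the raw counts. This is a minimal sketch, assuming the 8.3% and 7.4% figures use the 108 linked transmissions as the denominator; the helper name `pct` is illustrative.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of part in whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


total = 151                                 # seroconversion events
linked, unlinked, indeterminate = 108, 40, 3

outcome_pcts = [pct(n, total) for n in (linked, unlinked, indeterminate)]
gag_only_pct = pct(9, linked)               # linked by consensus gag sequencing only
deep_env_pct = pct(8, linked)               # required deep sequencing of env
print(outcome_pcts, gag_only_pct, deep_env_pct)
```

    Under that assumption every figure matches the abstract, and the three outcome counts sum to the 151 events.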

  4. Agaricus bisporus genome sequence: a commentary.

    Kerrigan, Richard W; Challen, Michael P; Burton, Kerry S

    2013-06-01

    The genomes of two isolates of Agaricus bisporus have been sequenced recently. This soil-inhabiting fungus has a wide geographical distribution in nature and it is also cultivated in an industrialized indoor process ($4.7bn annual worldwide value) to produce edible mushrooms. Previously this lignocellulosic fungus has resisted precise econutritional classification, i.e. into white- or brown-rot decomposers. The generation of the genome sequence and transcriptomic analyses has revealed a new classification, 'humicolous', for species adapted to grow in humic-rich, partially decomposed leaf material. The Agaricus bisporus genomes contain a collection of polysaccharide and lignin-degrading genes and, more interestingly, an expanded number of genes (relative to other lignocellulosic fungi) that enhance degradation of lignin derivatives, i.e. heme-thiolate peroxidases and β-etherases. A motif that is hypothesized to be a promoter element in the humicolous adaptation suite is present in a large number of genes specifically up-regulated when the mycelium is grown on humic-rich substrate. The genome sequence of A. bisporus offers a platform to explore fungal biology in carbon-rich soil environments and terrestrial cycling of carbon, nitrogen, phosphorus and potassium. Copyright © 2013 Elsevier Inc. All rights reserved.

  5. Genome sequence of Aspergillus luchuensis NBRC 4314

    Yamada, Osamu; Machida, Masayuki; Hosoyama, Akira; Goto, Masatoshi; Takahashi, Toru; Futagami, Taiki; Yamagata, Youhei; Takeuchi, Michio; Kobayashi, Tetsuo; Koike, Hideaki; Abe, Keietsu; Asai, Kiyoshi; Arita, Masanori; Fujita, Nobuyuki; Fukuda, Kazuro; Higa, Ken-ichi; Horikawa, Hiroshi; Ishikawa, Takeaki; Jinno, Koji; Kato, Yumiko; Kirimura, Kohtaro; Mizutani, Osamu; Nakasone, Kaoru; Sano, Motoaki; Shiraishi, Yohei; Tsukahara, Masatoshi; Gomi, Katsuya

    2016-01-01

    Awamori is a traditional distilled beverage made from steamed Thai-Indica rice in Okinawa, Japan. Two microbes are involved in brewing the liquor: the local kuro (black) koji mold Aspergillus luchuensis and the awamori yeast Saccharomyces cerevisiae. Whereas yeasts are used for ethanol fermentation throughout the world, a characteristic of Japanese fermentation industries is the use of Aspergillus molds as a source of enzymes for the maceration and saccharification of raw materials. Here we report the draft genome of a kuro (black) koji mold, A. luchuensis NBRC 4314 (RIB 2604). The total length of nonredundant sequences was nearly 34.7 Mb, comprising approximately 2,300 contigs with 16 telomere-like sequences. In total, 11,691 genes were predicted to encode proteins. Most of the housekeeping genes, such as transcription factors and the N- and O-glycosylation systems, were conserved with respect to Aspergillus niger and Aspergillus oryzae. An alternative oxidase and an acid-stable α-amylase related to citric acid production and fermentation at a low pH, as well as a unique glutamic peptidase, were also found in the genome. Furthermore, key biosynthetic gene clusters of ochratoxin A and fumonisin B were absent when compared with the A. niger genome, showing the safety of A. luchuensis for food and beverage production. This genome information will facilitate not only comparative genomics with industrial kuro-koji molds, but also molecular breeding of the molds to improve awamori fermentation. PMID:27651094

  6. Synaptotagmin gene content of the sequenced genomes

    Craxton, Molly

    2004-07-01

    Background: Synaptotagmins exist as a large gene family in mammals. There is much interest in the function of certain family members which act crucially in the regulated synaptic vesicle exocytosis required for efficient neurotransmission. Knowledge of the functions of other family members is relatively poor and the presence of Synaptotagmin genes in plants indicates a role for the family as a whole which is wider than neurotransmission. Identification of the Synaptotagmin genes within completely sequenced genomes can provide the entire Synaptotagmin gene complement of each sequenced organism. Defining the detailed structures of all the Synaptotagmin genes and their encoded products can provide a useful resource for functional studies and a deeper understanding of the evolution of the gene family. The current rapid increase in the number of sequenced genomes from different branches of the tree of life, together with the public deposition of evolutionarily diverse transcript sequences, makes such studies worthwhile. Results: I have compiled a detailed list of the Synaptotagmin genes of Caenorhabditis, Anopheles, Drosophila, Ciona, Danio, Fugu, Mus, Homo, Arabidopsis and Oryza by examining genomic and transcript sequences from public sequence databases together with some transcript sequences obtained by cDNA library screening and RT-PCR. I have compared all of the genes and investigated the relationship between plant Synaptotagmins and their non-Synaptotagmin counterparts. Conclusions: I have identified and compared 98 Synaptotagmin genes from 10 sequenced genomes. Detailed comparison of transcript sequences reveals abundant and complex variation in Synaptotagmin gene expression and indicates the presence of Synaptotagmin genes in all animals and land plants. Amino acid sequence comparisons indicate patterns of conservation and diversity in function. Phylogenetic analysis shows the origin of Synaptotagmins in multicellular eukaryotes and their

  7. Zinc finger nuclease: a new approach for excising HIV-1 proviral DNA from infected human T cells.

    Qu, Xiying; Wang, Pengfei; Ding, Donglin; Wang, Xiaohui; Zhang, Gongmin; Zhou, Xin; Liu, Lin; Zhu, Xiaoli; Zeng, Hanxian; Zhu, Huanzhang

    2014-09-01

    A major reason that Acquired Immune Deficiency Syndrome (AIDS) cannot be completely cured is that the human immunodeficiency virus 1 (HIV-1) provirus is integrated into the human genome. Though existing therapies can inhibit replication of HIV-1, they cannot eradicate it. Molecular therapy is gaining popularity because it specifically targets HIV-1-infected cells and effectively removes HIV-1, regardless of whether viral genes are active or dormant. Here, we propose a new method that can efficiently delete the HIV provirus from the infected human T cell genome. First, we designed zinc-finger nucleases (ZFNs) that target a sequence within the long terminal repeat (LTR) U3 region that is highly conserved across the whole clade. Then, we screened out one pair of ZFNs and named it ZFN-U3. We discovered that ZFN-U3 can precisely target and eliminate the full-length HIV-1 proviral DNA after infected human cell lines were treated with it, and the frequency of excision was about 30% without cytotoxicity. These results indicate that ZFN-U3 can efficiently excise integrated HIV-1 from the human genome in infected cells. This method of deleting full-length HIV-1 from the human genome can therefore provide a novel approach to cure HIV-infected individuals in the future.

  8. Whole genome sequence analysis of Mycobacterium suricattae

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; Van Der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah; Siame, Kabengele Keith; Gey Van Pittius, Nicolaas Claudius; Van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-01-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  9. Whole genome sequence analysis of Mycobacterium suricattae

    Dippenaar, Anzaan

    2015-10-21

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  10. IDEPI: rapid prediction of HIV-1 antibody epitopes and other phenotypic features from sequence data using a flexible machine learning platform.

    N Lance Hepler

    2014-09-01

    Since its identification in 1983, HIV-1 has been the focus of a research effort unprecedented in scope and difficulty, whose ultimate goals, a cure and a vaccine, remain elusive. One of the fundamental challenges in accomplishing these goals is the tremendous genetic variability of the virus, with some genes differing at as many as 40% of nucleotide positions among circulating strains. Because of this, the genetic bases of many viral phenotypes, most notably the susceptibility to neutralization by a particular antibody, are difficult to identify computationally. Drawing upon open-source general-purpose machine learning algorithms and libraries, we have developed a software package, IDEPI (IDentify EPItopes), for learning genotype-to-phenotype predictive models from sequences with known phenotypes. IDEPI can apply learned models to classify sequences of unknown phenotypes, and also identify specific sequence features which contribute to a particular phenotype. We demonstrate that IDEPI achieves performance similar to or better than that of previously published approaches on four well-studied problems: finding the epitopes of broadly neutralizing antibodies (bNAbs), determining coreceptor tropism of the virus, identifying compartment-specific genetic signatures of the virus, and deducing drug-resistance associated mutations. The cross-platform Python source code (released under the GPL 3.0 license), documentation, issue tracking, and a pre-configured virtual machine for IDEPI can be found at https://github.com/veg/idepi.

  11. Sequencing of a Cultivated Diploid Cotton Genome-Gossypium arboreum

    Wilkins, Thea A

    2008-01-01

    Sequencing the genomes of crop species and model systems contributes significantly to our understanding of the organization, structure and function of plant genomes. In a 'white paper' published in 2007, the cotton community set forth a strategic plan for sequencing the AD genome of cultivated upland cotton that initially targets less complex diploid genomes. This strategy banks on the high degree

  12. From Genome Sequence to Taxonomy - A Skeptic’s View

    Özen, Asli Ismihan; Vesth, Tammi Camilla; Ussery, David

    2012-01-01

    The relative ease of sequencing bacterial genomes has resulted in thousands of sequenced bacterial genomes available in the public databases. This same technology now allows for using the entire genome sequence as an identifier for an organism. There are many methods available which attempt to us...

  13. Frequency and site mapping of HIV-1/SIVcpz, HIV- 2/SIVsmm and ...

    out to analyze the effects of various restriction enzymes on the HIV genome. A computer simulated model using Web cutter Version 2.0, and cytogenetic analysis. 339 restriction enzymes from Promega database, 10 HIV-1/SIVcpz genes, 10 HIV-2/SIVsmm genes and 10 other SIV genes. Gene sequences were fed into Web ...

  14. Geographic and temporal trends in the molecular epidemiology and genetic mechanisms of transmitted HIV-1 drug resistance: an individual-patient- and sequence-level meta-analysis.

    Rhee, Soo-Yon; Blanco, Jose Luis; Jordan, Michael R; Taylor, Jonathan; Lemey, Philippe; Varghese, Vici; Hamers, Raph L; Bertagnolio, Silvia; Rinke de Wit, Tobias F; Aghokeng, Avelin F; Albert, Jan; Avi, Radko; Avila-Rios, Santiago; Bessong, Pascal O; Brooks, James I; Boucher, Charles A B; Brumme, Zabrina L; Busch, Michael P; Bussmann, Hermann; Chaix, Marie-Laure; Chin, Bum Sik; D'Aquin, Toni T; De Gascun, Cillian F; Derache, Anne; Descamps, Diane; Deshpande, Alaka K; Djoko, Cyrille F; Eshleman, Susan H; Fleury, Herve; Frange, Pierre; Fujisaki, Seiichiro; Harrigan, P Richard; Hattori, Junko; Holguin, Africa; Hunt, Gillian M; Ichimura, Hiroshi; Kaleebu, Pontiano; Katzenstein, David; Kiertiburanakul, Sasisopin; Kim, Jerome H; Kim, Sung Soon; Li, Yanpeng; Lutsar, Irja; Morris, Lynn; Ndembi, Nicaise; Ng, Kee Peng; Paranjape, Ramesh S; Peeters, Martine; Poljak, Mario; Price, Matt A; Ragonnet-Cronin, Manon L; Reyes-Terán, Gustavo; Rolland, Morgane; Sirivichayakul, Sunee; Smith, Davey M; Soares, Marcelo A; Soriano, Vincent V; Ssemwanga, Deogratius; Stanojevic, Maja; Stefani, Mariane A; Sugiura, Wataru; Sungkanuparph, Somnuek; Tanuri, Amilcar; Tee, Kok Keng; Truong, Hong-Ha M; van de Vijver, David A M C; Vidal, Nicole; Yang, Chunfu; Yang, Rongge; Yebra, Gonzalo; Ioannidis, John P A; Vandamme, Anne-Mieke; Shafer, Robert W

    2015-04-01

    Regional and subtype-specific mutational patterns of HIV-1 transmitted drug resistance (TDR) are essential for informing first-line antiretroviral (ARV) therapy guidelines and designing diagnostic assays for use in regions where standard genotypic resistance testing is not affordable. We sought to understand the molecular epidemiology of TDR and to identify the HIV-1 drug-resistance mutations responsible for TDR in different regions and virus subtypes. We reviewed all GenBank submissions of HIV-1 reverse transcriptase sequences with or without protease and identified 287 studies published between March 1, 2000, and December 31, 2013, with more than 25 recently or chronically infected ARV-naïve individuals. These studies comprised 50,870 individuals from 111 countries. Each set of study sequences was analyzed for phylogenetic clustering and the presence of 93 surveillance drug-resistance mutations (SDRMs). The median overall TDR prevalence in sub-Saharan Africa (SSA), south/southeast Asia (SSEA), upper-income Asian countries, Latin America/Caribbean, Europe, and North America was 2.8%, 2.9%, 5.6%, 7.6%, 9.4%, and 11.5%, respectively. In SSA, there was a yearly 1.09-fold (95% CI: 1.05-1.14) increase in odds of TDR since national ARV scale-up attributable to an increase in non-nucleoside reverse transcriptase inhibitor (NNRTI) resistance. The odds of NNRTI-associated TDR also increased in Latin America/Caribbean (odds ratio [OR] = 1.16; 95% CI: 1.06-1.25), North America (OR = 1.19; 95% CI: 1.12-1.26), Europe (OR = 1.07; 95% CI: 1.01-1.13), and upper-income Asian countries (OR = 1.33; 95% CI: 1.12-1.55). In SSEA, there was no significant change in the odds of TDR since national ARV scale-up (OR = 0.97; 95% CI: 0.92-1.02). An analysis limited to sequences with mixtures at less than 0.5% of their nucleotide positions—a proxy for recent infection—yielded trends comparable to those obtained using the complete dataset. Four NNRTI SDRMs—K101E, K103N, Y181C, and G190A
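
    The recent-infection proxy mentioned in the abstract above (mixtures at less than 0.5% of nucleotide positions) can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: a "mixture" is taken to be any IUPAC two- or three-base ambiguity code, and gaps and N are excluded from the denominator. The function names are hypothetical and this is not the authors' code.

```python
# IUPAC codes that represent a mixture of two or three bases (assumption:
# N and gap characters are treated as missing data, not as mixtures).
MIXTURE_CODES = set("RYKMSWBDHV")
UNAMBIGUOUS = set("ACGT")


def mixture_fraction(seq: str) -> float:
    """Fraction of called positions that carry a mixture code."""
    called = [c for c in seq.upper() if c in UNAMBIGUOUS or c in MIXTURE_CODES]
    if not called:
        return 0.0
    return sum(c in MIXTURE_CODES for c in called) / len(called)


def below_mixture_threshold(seq: str, threshold: float = 0.005) -> bool:
    """True when mixtures occur at less than 0.5% of called positions."""
    return mixture_fraction(seq) < threshold


# Made-up sequences: 1 mixture in 400 positions, and 2 mixtures in 100.
low = "R" + "ACGT" * 99 + "ACG"
high = "RY" + "ACGT" * 24 + "AC"
print(mixture_fraction(low), below_mixture_threshold(low))
print(mixture_fraction(high), below_mixture_threshold(high))
```

    The first example sequence falls under the threshold (0.25% mixtures) and the second does not (2%).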

  15. Next Generation DNA Sequencing and the Future of Genomic Medicine

    Anderson, Matthew W.; Schrijver, Iris

    2010-01-01

    In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpreta...

  16. Transforming clinical microbiology with bacterial genome sequencing.

    Didelot, Xavier; Bowden, Rory; Wilson, Daniel J; Peto, Tim E A; Crook, Derrick W

    2012-09-01

    Whole-genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here, we review the current status of clinical microbiology and how it has already begun to be transformed by using next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties, such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. We predict that the application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow.

  17. Evidence of at Least Two Introductions of HIV-1 in the Amerindian Warao Population from Venezuela

    Rangel, Héctor R.; Maes, Mailis; Villalba, Julian; Sulbarán, Yoneira; de Waard, Jacobus H.; Bello, Gonzalo; Pujol, Flor H.

    2012-01-01

    Background: The Venezuelan Amerindians were, until recently, free of human immunodeficiency virus (HIV) infection. However, in 2007, HIV-1 infection was detected for the first time in the Warao Amerindian population living in the eastern part of Venezuela, in the delta of the Orinoco river. The aim of this study was to analyze the genetic diversity of the HIV-1 circulating in this population. Methodology/Principal Findings: The pol genomic region was sequenced for 16 HIV-1 isolates and, for some of them, sequences from the env, vif and nef genomic regions were obtained. All HIV-1 isolates were classified as subtype B, with the exception of one that was classified as subtype C. The 15 subtype B isolates exhibited a high degree of genetic similarity and formed a highly supported monophyletic cluster in each genomic region analyzed. Evolutionary analyses of the pol genomic region indicated that the most recent common ancestor of the Warao subtype B clade dates back to the late 1990s. Conclusions/Significance: At least two independent introductions of HIV-1 have occurred in the Warao Amerindians from Venezuela. HIV-1 subtype B became established and disseminated in the community, while no evidence of local dissemination of HIV-1 subtype C was detected in this study. These results warrant further surveys to evaluate the burden of this disease, which can be particularly devastating in this Amerindian population, with a high prevalence of tuberculosis and hepatitis B, among other infectious diseases, and with limited access to primary health care. PMID:22808212

  18. Molecular beacon probes-base multiplex NASBA Real-time for detection of HIV-1 and HCV.

    Mohammadi-Yeganeh, S; Paryan, M; Mirab Samiee, S; Kia, V; Rezvan, H

    2012-06-01

    Developed in 1991, nucleic acid sequence-based amplification (NASBA) has been introduced as a rapid molecular diagnostic technique, where it has been shown to give quicker results than PCR, and it can also be more sensitive. This paper describes the development of a molecular beacon-based multiplex NASBA assay for simultaneous detection of HIV-1 and HCV in plasma samples. A well-conserved region in the HIV-1 pol gene and 5'-NCR of HCV genome were used for primers and molecular beacon design. The performance features of HCV/HIV-1 multiplex NASBA assay including analytical sensitivity and specificity, clinical sensitivity and clinical specificity were evaluated. The analysis of scalar concentrations of the samples indicated that the limit of quantification of the assay was beacon probes detected all HCV genotypes and all major variants of HIV-1. This method may represent a relatively inexpensive isothermal method for detection of HIV-1/HCV co-infection in monitoring of patients.

  19. gCUP: rapid GPU-based HIV-1 co-receptor usage prediction for next-generation sequencing.

    Olejnik, Michael; Steuwer, Michel; Gorlatch, Sergei; Heider, Dominik

    2014-11-15

    Next-generation sequencing (NGS) has a large potential in HIV diagnostics, and genotypic prediction models have been developed and successfully tested in the recent years. However, albeit being highly accurate, these computational models lack computational efficiency to reach their full potential. In this study, we demonstrate the use of graphics processing units (GPUs) in combination with a computational prediction model for HIV tropism. Our new model named gCUP, parallelized and optimized for GPU, is highly accurate and can classify >175 000 sequences per second on an NVIDIA GeForce GTX 460. The computational efficiency of our new model is the next step to enable NGS technologies to reach clinical significance in HIV diagnostics. Moreover, our approach is not limited to HIV tropism prediction, but can also be easily adapted to other settings, e.g. drug resistance prediction. The source code can be downloaded at http://www.heiderlab.de.

  20. An automated annotation tool for genomic DNA sequences using

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated ...

  1. An evaluation of Comparative Genome Sequencing (CGS) by comparing two previously-sequenced bacterial genomes

    Herring Christopher D

    2007-08-01

    Background With the development of new technology, it has recently become practical to resequence the genome of a bacterium after experimental manipulation. It is critical though to know the accuracy of the technique used, and to establish confidence that all of the mutations were detected. Results In order to evaluate the accuracy of genome resequencing using the microarray-based Comparative Genome Sequencing service provided by Nimblegen Systems Inc., we resequenced the E. coli strain W3110 Kohara using MG1655 as a reference, both of which have been completely sequenced using traditional sequencing methods. CGS detected 7 of 8 small sequence differences, one large deletion, and 9 of 12 IS element insertions present in W3110, but did not detect a large chromosomal inversion. In addition, we confirmed that CGS also detected 2 SNPs, one deletion and 7 IS element insertions that are not present in the genome sequence, which we attribute to changes that occurred after the creation of the W3110 lambda clone library. The false positive rate for SNPs was one per 244 Kb of genome sequence. Conclusion CGS is an effective way to detect multiple mutations present in one bacterium relative to another, and while highly cost-effective, is prone to certain errors. Mutations occurring in repeated sequences or in sequences with a high degree of secondary structure may go undetected. It is also critical to follow up on regions of interest in which SNPs were not called because they often indicate deletions or IS element insertions.

  2. Approaches for in silico finishing of microbial genome sequences

    Frederico Schmitt Kremer

    The introduction of next-generation sequencing (NGS) had a significant effect on the availability of genomic information, leading to an increase in the number of sequenced genomes from a large spectrum of organisms. Unfortunately, due to the limitations implied by the short-read sequencing platforms, most of these newly sequenced genomes remained as “drafts”, incomplete representations of the whole genetic content. The previous genome sequencing studies indicated that finishing a genome sequenced by NGS, even bacteria, may require additional sequencing to fill the gaps, making the entire process very expensive. As such, several in silico approaches have been developed to optimize the genome assemblies and facilitate the finishing process. The present review aims to explore some free (open source, in many cases) tools that are available to facilitate genome finishing.

  3. Approaches for in silico finishing of microbial genome sequences.

    Kremer, Frederico Schmitt; McBride, Alan John Alexander; Pinto, Luciano da Silva

    The introduction of next-generation sequencing (NGS) had a significant effect on the availability of genomic information, leading to an increase in the number of sequenced genomes from a large spectrum of organisms. Unfortunately, due to the limitations implied by the short-read sequencing platforms, most of these newly sequenced genomes remained as "drafts", incomplete representations of the whole genetic content. The previous genome sequencing studies indicated that finishing a genome sequenced by NGS, even bacteria, may require additional sequencing to fill the gaps, making the entire process very expensive. As such, several in silico approaches have been developed to optimize the genome assemblies and facilitate the finishing process. The present review aims to explore some free (open source, in many cases) tools that are available to facilitate genome finishing.

  4. Differential evolution of a CXCR4-using HIV-1 strain in CCR5wt/wt and CCR5∆32/∆32 hosts revealed by longitudinal deep sequencing and phylogenetic reconstruction.

    Le, Anh Q; Taylor, Jeremy; Dong, Winnie; McCloskey, Rosemary; Woods, Conan; Danroth, Ryan; Hayashi, Kanna; Milloy, M-J; Poon, Art F Y; Brumme, Zabrina L

    2015-12-03

    Rare individuals homozygous for a naturally-occurring 32 base pair deletion in the CCR5 gene (CCR5∆32/∆32) are resistant to infection by CCR5-using ("R5") HIV-1 strains but remain susceptible to less common CXCR4-using ("X4") strains. The evolutionary dynamics of X4 infections, however, remain incompletely understood. We identified two individuals, one CCR5wt/wt and one CCR5∆32/∆32, within the Vancouver Injection Drug Users Study who were infected with a genetically similar X4 HIV-1 strain. While early-stage plasma viral loads were comparable in the two individuals (~4.5-5 log10 HIV-1 RNA copies/ml), CD4 counts in the CCR5wt/wt individual reached a nadir of 250 cells/mm³ in the CCR5∆32/∆32 individual. Ancestral phylogenetic reconstructions using longitudinal envelope-V3 deep sequences suggested that both individuals were infected by a single transmitted/founder (T/F) X4 virus that differed at only one V3 site (codon 24). While substantial within-host HIV-1 V3 diversification was observed in plasma and PBMC in both individuals, the CCR5wt/wt individual's HIV-1 population gradually reverted from 100% X4 to ~60% R5 over ~4 years whereas the CCR5∆32/∆32 individual's remained consistently X4. Our observations illuminate early dynamics of X4 HIV-1 infections and underscore the influence of CCR5 genotype on HIV-1 V3 evolution.

  5. Genomic signal processing for DNA sequence clustering.

    Mendizabal-Ruiz, Gerardo; Román-Godínez, Israel; Torres-Ramos, Sulema; Salido-Ruiz, Ricardo A; Vélez-Pérez, Hugo; Morales, J Alejandro

    2018-01-01

    Genomic signal processing (GSP) methods which convert DNA data to numerical values have recently been proposed, which would offer the opportunity of employing existing digital signal processing methods for genomic data. One of the most used methods for exploring data is cluster analysis which refers to the unsupervised classification of patterns in data. In this paper, we propose a novel approach for performing cluster analysis of DNA sequences that is based on the use of GSP methods and the K-means algorithm. We also propose a visualization method that facilitates the easy inspection and analysis of the results and possible hidden behaviors. Our results support the feasibility of employing the proposed method to find and easily visualize interesting features of sets of DNA data.
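
    The idea of mapping DNA to numbers and then clustering with K-means can be illustrated with a minimal sketch. The abstract does not say which numerical mapping the authors use, so this example assumes a very simple one (the mean of each of the four Voss indicator signals, i.e. the fraction of A, C, G and T) and a plain K-means with a deterministic start. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
def voss_features(seq):
    """Map a DNA string to 4 numbers: the mean of each Voss indicator
    signal, which equals the fraction of A, C, G and T in the sequence.
    (Assumed mapping for illustration; the paper may use another.)"""
    seq = seq.upper()
    return [seq.count(base) / len(seq) for base in "ACGT"]


def squared_distance(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain K-means. The first k points seed the centroids so that the
    result is reproducible (an assumption made for this sketch)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data: two AT-rich and two GC-rich sequences.
sequences = ["ATATATTAAT", "GCGCGGCCGC", "AATTATATAA", "GGCCGCGCGG"]
labels = kmeans([voss_features(s) for s in sequences], k=2)
print(labels)
```

    With real data, a richer numerical representation and a random or k-means++ initialization would probably be more appropriate than the choices assumed here.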

  6. Draft Genome Sequence of Mycobacterium chimaera Type ...

    We report the draft genome sequence of the type strain Mycobacterium chimaera Fl-0169T, a member of the Mycobacterium avium complex (MAC). M. chimaera Fl-0169T was isolated from a patient in Italy and is highly similar to strains of M. chimaera isolated in Ireland, though Fl-0169T possesses unique virulence genes. Evidence suggests that M. avium, M. intracellulare, and M. chimaera are differently virulent and a comparative genomic analysis is critically needed to identify diagnostic targets that reliably differentiate species of MAC. With treatment costs for Mycobacterium infections estimated to be >$1.8 B annually in the U.S., correct species identification will result in improved treatment selection, lower costs, and improved patient outcomes.

  7. Cascade detection for the extraction of localized sequence features; specificity results for HIV-1 protease and structure-function results for the Schellman loop.

    Newell, Nicholas E

    2011-12-15

    The extraction of the set of features most relevant to function from classified biological sequence sets is still a challenging problem. A central issue is the determination of expected counts for higher order features so that artifact features may be screened. Cascade detection (CD), a new algorithm for the extraction of localized features from sequence sets, is introduced. CD is a natural extension of the proportional modeling techniques used in contingency table analysis into the domain of feature detection. The algorithm is successfully tested on synthetic data and then applied to feature detection problems from two different domains to demonstrate its broad utility. An analysis of HIV-1 protease specificity reveals patterns of strong first-order features that group hydrophobic residues by side chain geometry and exhibit substantial symmetry about the cleavage site. Higher order results suggest that favorable cooperativity is weak by comparison and broadly distributed, but indicate possible synergies between negative charge and hydrophobicity in the substrate. Structure-function results for the Schellman loop, a helix-capping motif in proteins, contain strong first-order features and also show statistically significant cooperativities that provide new insights into the design of the motif. These include a new 'hydrophobic staple' and multiple amphipathic and electrostatic pair features. CD should prove useful not only for sequence analysis, but also for the detection of multifactor synergies in cross-classified data from clinical studies or other sources. Windows XP/7 application and data files available at: https://sites.google.com/site/cascadedetect/home.

  8. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    Black, PA; Vos, M. de; Louw, GE; Merwe, RG van der; Dippenaar, A.; Streicher, EM; Abdallah, AM; Sampson, SL; Victor, TC; Dolby, T.; Simpson, JA; Helden, PD van; Warren, RM; Pain, Arnab

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug

  9. Insights from 20 years of bacterial genome sequencing

    Land, Miriam; Hauser, Loren; Jun, Se-Ran

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along...... the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative...... genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling...

  10. Mining olive genome through library sequencing and bioinformatics ...

    As one of the initial steps of olive (Olea europaea L.) genome analysis, a small insert genomic DNA library was constructed (digesting olive genomic DNA with SmaI and cloning the digestion products into pUC19 vector) and 83 randomly picked colonies were sequenced. Analysis of the insert sequences revealed 12 clones ...

  11. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis

    Carlton, Jane M.; Hirt, Robert P.; Silva, Joana C.

    2007-01-01

    We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the approximately 160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion...... environment. The genome sequence predicts previously unknown functions for the hydrogenosome, which support a common evolutionary origin of this unusual organelle with mitochondria....

  12. Rapid whole genome sequencing and precision neonatology.

    Petrikin, Joshua E; Willig, Laurel K; Smith, Laurie D; Kingsmore, Stephen F

    2015-12-01

    Traditionally, genetic testing has been too slow or perceived to be impractical for initial management of the critically ill neonate. Technological advances have led to the ability to sequence and interpret the entire genome of a neonate in as little as 26 h. As the cost and turnaround time of testing decrease, the utility of whole genome sequencing (WGS) of neonates for acute and latent genetic illness increases. Analyzing the entire genome allows for concomitant evaluation of the currently identified 5588 single gene diseases. When applied to a select population of ill infants in a level IV neonatal intensive care unit, WGS yielded a diagnosis of a causative genetic disease in 57% of patients. These diagnoses may lead to clinical management changes ranging from transition to palliative care for uniformly lethal conditions to alteration or initiation of medical or surgical therapy to improve outcomes in others. Thus, institution of 2-day WGS at time of acute presentation opens the possibility of early implementation of precision medicine. This implementation may create opportunities for early interventional, frequently novel or off-label therapies that may alter disease trajectory in infants with what would otherwise be fatal disease. Widespread deployment of rapid WGS and precision medicine will raise ethical issues pertaining to interpretation of variants of unknown significance, discovery of incidental findings related to adult onset conditions and carrier status, and implementation of medical therapies for which little is known in terms of risks and benefits. Despite these challenges, precision neonatology has significant potential both to decrease infant mortality related to genetic diseases with onset in newborns and to facilitate parental decision making regarding transition to palliative care.

  13. Viral linkage in HIV-1 seroconverters and their partners in an HIV-1 prevention clinical trial.

    Mary S Campbell

    2011-03-01

    Characterization of viruses in HIV-1 transmission pairs will help identify biological determinants of infectiousness and evaluate candidate interventions to reduce transmission. Although HIV-1 sequencing is frequently used to substantiate linkage between newly HIV-1 infected individuals and their sexual partners in epidemiologic and forensic studies, viral sequencing is seldom applied in HIV-1 prevention trials. The Partners in Prevention HSV/HIV Transmission Study (ClinicalTrials.gov #NCT00194519) was a prospective randomized placebo-controlled trial that enrolled serodiscordant heterosexual couples to determine the efficacy of genital herpes suppression in reducing HIV-1 transmission; as part of the study analysis, HIV-1 sequences were examined for genetic linkage between seroconverters and their enrolled partners. We obtained partial consensus HIV-1 env and gag sequences from blood plasma for 151 transmission pairs and performed deep sequencing of env in some cases. We analyzed sequences with phylogenetic techniques and developed a Bayesian algorithm to evaluate the probability of linkage. For linkage, we required monophyletic clustering between enrolled partners' sequences and a Bayesian posterior probability of ≥ 50%. Adjudicators classified each seroconversion, finding 108 (71.5%) linked, 40 (26.5%) unlinked, and 3 (2.0%) indeterminate transmissions, with linkage determined by consensus env sequencing in 91 (84%). Male seroconverters had a higher frequency of unlinked transmissions than female seroconverters. The likelihood of transmission from the enrolled partner was related to time on study, with increasing numbers of unlinked transmissions occurring after longer observation periods. Finally, baseline viral load was found to be significantly higher among linked transmitters. In this first use of HIV-1 sequencing to establish endpoints in a large clinical trial, more than one-fourth of transmissions were unlinked to the enrolled partner
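
    The linkage rule and the reported proportions in this record can be checked with a short sketch. It encodes only what the abstract states (monophyletic clustering plus a Bayesian posterior probability of at least 50%, and the counts 108, 40, 3 and 91); the function names are hypothetical, and the Bayesian algorithm itself is not reproduced here.

```python
def percent(part, whole, digits=1):
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)


def is_linked(monophyletic, posterior, threshold=0.5):
    """Linkage criterion as stated in the abstract: partners' sequences
    cluster monophyletically AND the posterior probability is >= 50%."""
    return monophyletic and posterior >= threshold


# Adjudicated outcomes reported for the transmission pairs.
counts = {"linked": 108, "unlinked": 40, "indeterminate": 3}
total = sum(counts.values())

# Share of each outcome among all pairs.
shares = {name: percent(n, total) for name, n in counts.items()}

# Share of linked transmissions resolved by consensus env sequencing.
env_share = percent(91, counts["linked"], digits=0)

print(total, shares, env_share)
```

    The counts add up to the 151 pairs mentioned in the abstract, and the 84% figure is consistent with 91 of the 108 linked transmissions.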

  14. HIV-1 envelope glycoprotein

    Caulfield, Michael; Cupo, Albert; Dean, Hansi; Hoffenberg, Simon; King, C. Richter; Klasse, P. J.; Marozsan, Andre; Moore, John P.; Sanders, Rogier W.; Ward, Andrew; Wilson, Ian; Julien, Jean-Philippe

    2017-08-22

    The present application relates to novel HIV-1 envelope glycoproteins, which may be utilized as HIV-1 vaccine immunogens, and antigens for crystallization, electron microscopy and other biophysical, biochemical and immunological studies for the identification of broad neutralizing antibodies. The present invention encompasses the preparation and purification of immunogenic compositions, which are formulated into the vaccines of the present invention.

  15. Building the sequence map of the human pan-genome

    Li, Ruiqiang; Li, Yingrui; Zheng, Hancheng

    2010-01-01

    analysis of predicted genes indicated that the novel sequences contain potentially functional coding regions. We estimate that a complete human pan-genome would contain approximately 19-40 Mb of novel sequence not present in the extant reference genome. The extensive amount of novel sequence contributing...

  16. Get your high-quality low-cost genome sequence

    Faino, L.; Thomma, B.P.H.J.

    2014-01-01

    The study of whole-genome sequences has become essential for almost all branches of biological research. Next-generation sequencing (NGS) has revolutionized the scalability, speed, and resolution of sequencing and brought genomic science within reach of academic laboratories that study non-model

  17. The diploid genome sequence of an Asian individual

    Wang, Jun; Wang, Wei; Li, Ruiqiang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we...... used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP...... identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J...

  18. Complete genome sequence of Arcanobacterium haemolyticum type strain (11018T)

    Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Teshima, Hazuki [Los Alamos National Laboratory (LANL)]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

    2010-01-01

    Vulcanisaeta distributa Itoh et al. 2002 belongs to the family Thermoproteaceae in the phylum Crenarchaeota. The genus Vulcanisaeta is characterized by a global distribution in hot and acidic springs. This is the first genome sequence from a member of the genus Vulcanisaeta and seventh genome sequence in the family Thermoproteaceae. The 2,374,137 bp long genome with its 2,544 protein-coding and 49 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  19. Genome sequencing and annotation of Stenotrophomonas sp. SAM8

    Samy Selim

    2015-12-01

    We report the draft genome sequence of Stenotrophomonas sp. strain SAM8, isolated from environmental water. The draft genome is 3,665,538 bp in size, has a G + C content of 67.2%, and contains 6 rRNA sequences (single copies of 5S, 16S & 23S rRNA). The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LDAV00000000.

  20. Genome sequencing and annotation of Proteus sp. SAS71

    Samy Selim

    2015-12-01

    We report the draft genome sequence of Proteus sp. strain SAS71, isolated from a water spring in the Aljouf region, Saudi Arabia. The draft genome is 3,037,704 bp in size, has a G + C content of 39.3%, and contains 6 rRNA sequences (single copies of 5S, 16S & 23S rRNA). The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LDIU00000000.
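
    The G + C content quoted in the two draft genome records above is the percentage of bases that are G or C. A minimal sketch of that calculation follows; the function name is hypothetical, and ignoring any character other than A, C, G and T is an assumption of this example, not a statement about how the authors computed their values.

```python
def gc_content(sequence):
    """Return the G + C content of a DNA string as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N or gap characters, is ignored (an assumption made here).
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / total if total else 0.0


# Toy example: 4 of the 8 bases are G or C.
print(gc_content("GGCCAATT"))
```

    For a multi-contig draft assembly, the same function could be applied to the concatenation of all contigs.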

  1. Identifying HIV-1 dual infections

    Cornelissen Marion

    2007-09-01

    Transmission of human immunodeficiency virus (HIV) is no exception to the phenomenon that a second, productive infection with another strain of the same virus is feasible. Experiments with RNA viruses have suggested that both coinfections (simultaneous infection with two strains of a virus) and superinfections (second infection after a specific immune response to the first infecting strain has developed) can result in increased fitness of the viral population. Concerns about dual infections with HIV are increasing. First, the frequent detection of superinfections seems to indicate that it will be difficult to develop a prophylactic vaccine. Second, HIV-1 superinfections have been associated with accelerated disease progression, although this is not true for all persons. In fact, superinfections have even been detected in persons controlling their HIV infections without antiretroviral therapy. Third, dual infections can give rise to recombinant viruses, which are increasingly found in the HIV-1 epidemic. Recombinants could have increased fitness over the parental strains, as in vitro models suggest, and could exhibit increased pathogenicity. Multiple drug resistant (MDR) strains could recombine to produce a pan-resistant, transmittable virus. We will describe in this review what is presently known about super- and re-infection among ambient viral infections, as well as the first cases of HIV-1 superinfection, including HIV-1 triple infections. The clinical implications, the impact of the immune system, and the effect of anti-retroviral therapy will be covered, as will the timing of HIV superinfection. The methods used to detect HIV-1 dual infections will be discussed in detail. To increase the likelihood of detecting a dual HIV-1 infection, pre-selection of patients can be done by serotyping, heteroduplex mobility assays (HMA), counting the degenerate base codes in the HIV-1 genotyping sequence, or surveying unexpected increases in the
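
    One of the pre-selection methods named in this record, counting degenerate base codes in a genotyping sequence, is simple enough to sketch. The example below counts IUPAC ambiguity codes in a consensus sequence. The function names are hypothetical, counting N as degenerate is an assumption of this sketch, and no screening threshold is implied because the abstract does not give one.

```python
# IUPAC nucleotide codes that stand for more than one base.
# Including N here is an assumption; some analyses exclude it.
AMBIGUITY_CODES = set("RYSWKMBDHVN")


def count_degenerate_bases(sequence):
    """Number of positions carrying an IUPAC ambiguity code."""
    return sum(1 for base in sequence.upper() if base in AMBIGUITY_CODES)


def degenerate_fraction(sequence):
    """Fraction of positions that are ambiguous (0.0 for empty input)."""
    if not sequence:
        return 0.0
    return count_degenerate_bases(sequence) / len(sequence)


# Toy consensus sequence with three ambiguous positions (R, Y, N).
consensus = "ACGTRYACGN"
print(count_degenerate_bases(consensus), degenerate_fraction(consensus))
```

    A higher count than is usual for a given sequencing protocol might flag a sample for follow-up, but any cutoff would have to come from the laboratory's own validation data.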

  2. Genome Sequence of Lactobacillus plantarum Strain UCMA 3037

    Naz, Saima; Tareb, Raouf; Bernardeau, Marion; Vaisse, Melissa; Lucchetti-Miganeh, Celine; Rechenmann, Mathias; Vernoux, Jean-Paul

    2013-01-01

    Nucleic acid of the strain Lactobacillus plantarum UCMA 3037, isolated from raw milk camembert cheese in our laboratory, was sequenced. We present its draft genome sequence with the aim of studying its functional properties and relationship to the cheese ecosystem.

  3. Genome Sequence of Lactobacillus plantarum Strain UCMA 3037.

    Naz, Saima; Tareb, Raouf; Bernardeau, Marion; Vaisse, Melissa; Lucchetti-Miganeh, Celine; Rechenmann, Mathias; Vernoux, Jean-Paul

    2013-05-23

    Nucleic acid of the strain Lactobacillus plantarum UCMA 3037, isolated from raw milk camembert cheese in our laboratory, was sequenced. We present its draft genome sequence with the aim of studying its functional properties and relationship to the cheese ecosystem.

  4. A Snapshot of the Emerging Tomato Genome Sequence

    Lukas A. Mueller

    2009-03-01

    The genome of tomato ( L.) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy, and the United States) as part of the larger “International Solanaceae Genome Project (SOL): Systems Approach to Diversity and Adaptation” initiative. The tomato genome sequencing project uses an ordered bacterial artificial chromosome (BAC) approach to generate a high-quality tomato euchromatic genome sequence for use as a reference genome for the Solanaceae and euasterids. Sequence is deposited at GenBank and at the SOL Genomics Network (SGN). Currently, there are around 1000 BACs finished or in progress, representing more than a third of the projected euchromatic portion of the genome. An annotation effort is also underway by the International Tomato Annotation Group. The expected number of genes in the euchromatin is ∼40,000, based on an estimate from a preliminary annotation of 11% of finished sequence. Here, we present this first snapshot of the emerging tomato genome and its annotation, a short comparison with potato ( L.) sequence data, and the tools available for researchers to exploit this new resource. In the future, whole-genome shotgun techniques will be combined with the BAC-by-BAC approach to cover the entire tomato genome. The high-quality reference euchromatic tomato sequence is expected to be near completion by 2010.

  5. A plant pathology perspective of fungal genome sequencing.

    Aylward, Janneke; Steenkamp, Emma T; Dreyer, Léanne L; Roets, Francois; Wingfield, Brenda D; Wingfield, Michael J

    2017-06-01

    The majority of plant pathogens are fungi and many of these adversely affect food security. This mini-review aims to provide an analysis of the plant pathogenic fungi for which genome sequences are publicly available, to assess their general genome characteristics, and to consider how genomics has impacted plant pathology. A list of sequenced fungal species was assembled, the taxonomy of all species verified, and the potential reason for sequencing each of the species considered. The genomes of 1090 fungal species are currently (October 2016) in the public domain and this number is rapidly rising. Pathogenic species comprised the largest category (35.5 %) and, amongst these, plant pathogens are predominant. Of the 191 plant pathogenic fungal species with available genomes, 61.3 % cause diseases on food crops, more than half of which are staple crops. The genomes of plant pathogens are slightly larger than those of other fungal species sequenced to date and they contain fewer coding sequences in relation to their genome size. Both of these factors can be attributed to the expansion of repeat elements. Sequenced genomes of plant pathogens provide blueprints from which potential virulence factors were identified and from which genes associated with different pathogenic strategies could be predicted. Genome sequences have also made it possible to evaluate adaptability of pathogen genomes and genomic regions that experience selection pressures. Some genomic patterns, however, remain poorly understood and plant pathogen genomes alone are not sufficient to unravel complex pathogen-host interactions. Genomes, therefore, cannot replace experimental studies that can be complex and tedious. Ultimately, the most promising application lies in using fungal plant pathogen genomics to inform disease management and risk assessment strategies. This will minimize the risks of future disease outbreaks and assist in preparation for emerging pathogen outbreaks.

  6. Gene Discovery through Genomic Sequencing of Brucella abortus

    Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.

    2001-01-01

    Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and humans. We generated DNA random sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10⁻⁵) to sequences deposit...

  7. MIPS: a database for protein sequences and complete genomes.

    Mewes, H W; Hani, J; Pfeiffer, F; Frishman, D

    1998-01-01

    The MIPS group [Munich Information Center for Protein Sequences of the German National Center for Environment and Health (GSF)] at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, is involved in a number of data collection activities, including a comprehensive database of the yeast genome, a database reflecting the progress in sequencing the Arabidopsis thaliana genome, the systematic analysis of other small genomes and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). Through its WWW server (http://www.mips.biochem.mpg.de) MIPS provides access to a variety of generic databases, including a database of protein families as well as automatically generated data by the systematic application of sequence analysis algorithms. The yeast genome sequence and its related information was also compiled on CD-ROM to provide dynamic interactive access to the 16 chromosomes of the first eukaryotic genome unraveled. PMID:9399795

  8. HIV-1 subtype A gag variability and epitope evolution.

    Abidi, Syed Hani; Kalish, Marcia L; Abbas, Farhat; Rowland-Jones, Sarah; Ali, Syed

    2014-01-01

    The aim of this study was to examine the course of time-dependent evolution of HIV-1 subtype A on a global level, especially with respect to the dynamics of immunogenic HIV gag epitopes. We used a total of 1,893 HIV-1 subtype A gag sequences representing a timeline from 1985 through 2010, and 19 different countries in Africa, Europe and Asia. The phylogenetic relationship of subtype A gag and its epidemic dynamics was analysed through a Maximum Likelihood tree and Bayesian Skyline plot, genomic variability was measured in terms of G → A substitutions and Shannon entropy, and the time-dependent evolution of HIV subtype A gag epitopes was examined. Finally, to confirm observations on globally reported HIV subtype A sequences, we analysed the gag epitope data from our Kenyan, Pakistani, and Afghan cohorts, where both cohort-specific gene epitope variability and HLA restriction profiles of gag epitopes were examined. The most recent common ancestor of the HIV subtype A epidemic was estimated to be 1956 ± 1. A period of exponential growth began about 1980 and lasted for approximately 7 years, stabilized for 15 years, declined for 2-3 years, then stabilized again from about 2004. During the course of evolution, a gradual increase in genomic variability was observed that peaked in 2005-2010. We observed that the number of point mutations and novel epitopes in gag also peaked concurrently during 2005-2010. It appears that as the HIV subtype A epidemic spread globally, changing population immunogenetic pressures may have played a role in steering immune-evolution of this subtype in new directions. This trend is apparent in the genomic variability and epitope diversity of HIV-1 subtype A gag sequences.
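
The abstract above measures genomic variability partly as Shannon entropy. As a rough illustration only, the sketch below computes per-site entropy over a toy alignment. The function names, the gap handling (gaps are ignored), and the toy sequences are assumptions made for this example and are not taken from the study.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def alignment_entropy(sequences):
    """Per-site entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [column_entropy(col) for col in zip(*sequences)]

# Hypothetical toy alignment (not real gag data).
toy_alignment = ["ATGA", "ATGG", "ACGA", "ATG-"]
entropies = alignment_entropy(toy_alignment)
for site, h in enumerate(entropies, start=1):
    print(f"site {site}: {h:.3f} bits")
```

A fully conserved site scores 0 bits, and the score rises as residues at that site become more evenly mixed.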

  9. HIV-1 subtype A gag variability and epitope evolution.

    Syed Hani Abidi

    OBJECTIVE: The aim of this study was to examine the course of time-dependent evolution of HIV-1 subtype A on a global level, especially with respect to the dynamics of immunogenic HIV gag epitopes. METHODS: We used a total of 1,893 HIV-1 subtype A gag sequences representing a timeline from 1985 through 2010, and 19 different countries in Africa, Europe and Asia. The phylogenetic relationship of subtype A gag and its epidemic dynamics was analysed through a Maximum Likelihood tree and Bayesian Skyline plot, genomic variability was measured in terms of G → A substitutions and Shannon entropy, and the time-dependent evolution of HIV subtype A gag epitopes was examined. Finally, to confirm observations on globally reported HIV subtype A sequences, we analysed the gag epitope data from our Kenyan, Pakistani, and Afghan cohorts, where both cohort-specific gene epitope variability and HLA restriction profiles of gag epitopes were examined. RESULTS: The most recent common ancestor of the HIV subtype A epidemic was estimated to be 1956 ± 1. A period of exponential growth began about 1980 and lasted for approximately 7 years, stabilized for 15 years, declined for 2-3 years, then stabilized again from about 2004. During the course of evolution, a gradual increase in genomic variability was observed that peaked in 2005-2010. We observed that the number of point mutations and novel epitopes in gag also peaked concurrently during 2005-2010. CONCLUSION: It appears that as the HIV subtype A epidemic spread globally, changing population immunogenetic pressures may have played a role in steering immune-evolution of this subtype in new directions. This trend is apparent in the genomic variability and epitope diversity of HIV-1 subtype A gag sequences.

  10. Validation of rice genome sequence by optical mapping

    Pape Louise

    2007-08-01

    Background: Rice feeds much of the world, and possesses the simplest genome analyzed to date within the grass family, making it an economically relevant model system for other cereal crops. Although the rice genome is sequenced, validation and gap closing efforts require purely independent means for accurate finishing of sequence build data. Results: To facilitate ongoing sequencing finishing and validation efforts, we have constructed a whole-genome SwaI optical restriction map of the rice genome. The physical map consists of 14 contigs, covering 12 chromosomes, with a total genome size of 382.17 Mb; this value is about 11% smaller than original estimates. Nine of the 14 optical map contigs are without gaps, covering chromosomes 1, 2, 3, 4, 5, 7, 8, 10, and 12 in their entirety – including centromeres and telomeres. Alignments between optical and in silico restriction maps constructed from IRGSP (International Rice Genome Sequencing Project) and TIGR (The Institute for Genomic Research) genome sequence sources are comprehensive and informative, evidenced by map coverage across virtually all published gaps, discovery of new ones, and characterization of sequence misassemblies; all totalling ~14 Mb. Furthermore, since optical maps are ordered restriction maps, identified discordances are pinpointed on a reliable physical scaffold providing an independent resource for closure of gaps and rectification of misassemblies. Conclusion: Analysis of sequence and optical mapping data effectively validates genome sequence assemblies constructed from large, repeat-rich genomes. Given this conclusion we envision new applications of such single molecule analysis that will merge advantages offered by high-resolution optical maps with inexpensive, but short sequence reads generated by emerging sequencing platforms. Lastly, map construction techniques presented here point the way to new types of comparative genome analysis that would focus on discernment of
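
The record above compares optical maps with in silico restriction maps built from sequence. A minimal sketch of an in silico digest follows, assuming a linear sequence and a complete digest. The SwaI recognition sequence (ATTTAAAT, blunt cut in the middle) is standard enzyme data, but the function name and the toy sequence are hypothetical and this is not the authors' pipeline.

```python
SWAI_SITE = "ATTTAAAT"  # SwaI recognition sequence; cuts blunt between ATTT and AAAT
CUT_OFFSET = 4          # cut position relative to the start of the site

def in_silico_digest(sequence, site=SWAI_SITE, cut_offset=CUT_OFFSET):
    """Return ordered fragment lengths from a complete digest of a linear sequence."""
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        # Advance by one base so overlapping sites are also found.
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(boundaries, boundaries[1:])]

# Hypothetical toy sequence with two SwaI sites (not rice genome data).
toy = "GGCC" + "ATTTAAAT" + "CCGGCCGG" + "ATTTAAAT" + "GG"
fragments = in_silico_digest(toy)
print(fragments)
```

The ordered list of fragment lengths is the quantity that would be aligned against an optical map; real comparisons also have to tolerate sizing error and missed or spurious cuts, which this sketch does not model.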

  11. Microbial genome sequencing using optical mapping and Illumina sequencing

    Introduction Optical mapping is a technique in which strands of genomic DNA are digested with one or more restriction enzymes, and a physical map of the genome constructed from the resulting image. In outline, genomic DNA is extracted from a pure culture, linearly arrayed on a specialized glass sli...

  12. Why size really matters when sequencing plant genomes

    Kelly, L.J.; Leitch, A.R.; Fay, M. F.; Renny-Byfield, S.; Pellicer, J.; Macas, Jiří; Leitch, I.J.

    2012-01-01

    Vol. 5, No. 4 (2012), pp. 415-425 ISSN 1755-0874 Institutional research plan: CEZ:AV0Z50510513 Institutional support: RVO:60077344 Keywords: C-value * genome assembly * genome size evolution * genome sequencing Subject RIV: EB - Genetics; Molecular Biology Impact factor: 0.924, year: 2012

  13. A computational genomics pipeline for prokaryotic sequencing projects.

    Kislyuk, Andrey O; Katz, Lee S; Agrawal, Sonia; Hagen, Matthew S; Conley, Andrew B; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C; Sammons, Scott A; Govil, Dhwani; Mair, Raydel D; Tatti, Kathleen M; Tondella, Maria L; Harcourt, Brian H; Mayer, Leonard W; Jordan, I King

    2010-08-01

    New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems.

  14. Complete Genome Sequence of the Human Gut Symbiont Roseburia hominis

    Travis, Anthony J.; Kelly, Denise; Flint, Harry J

    2015-01-01

    We report here the complete genome sequence of the human gut symbiont Roseburia hominis A2-183(T) (= DSM 16839(T) = NCIMB 14029(T)), isolated from human feces. The genome is represented by a 3,592,125-bp chromosome with 3,405 coding sequences. A number of potential functions contributing to host...

  15. Draft genome sequence of the Coccolithovirus Emiliania huxleyi virus 203.

    Nissimov, Jozef I; Worthy, Charlotte A; Rooks, Paul; Napier, Johnathan A; Kimmance, Susan A; Henn, Matthew R; Ogata, Hiroyuki; Allen, Michael J

    2011-12-01

    The Coccolithoviridae are a recently discovered group of viruses that infect the marine coccolithophorid Emiliania huxleyi. Emiliania huxleyi virus 203 (EhV-203) has a 160- to 180-nm-diameter icosahedral structure and a genome of approximately 400 kbp, consisting of 464 coding sequences (CDSs). Here we describe the genomic features of EhV-203 together with a draft genome sequence and its annotation, highlighting the homology and heterogeneity of this genome in comparison with the EhV-86 reference genome.

  16. Draft genome sequence of the coccolithovirus Emiliania huxleyi virus 202.

    Nissimov, Jozef I; Worthy, Charlotte A; Rooks, Paul; Napier, Johnathan A; Kimmance, Susan A; Henn, Matthew R; Ogata, Hiroyuki; Allen, Michael J

    2012-02-01

    Emiliania huxleyi virus 202 (EhV-202) is a member of the Coccolithoviridae, a group of viruses that infect the marine coccolithophorid Emiliania huxleyi. EhV-202 has a 160- to 180-nm-diameter icosahedral structure and a genome of approximately 407 kbp, consisting of 485 coding sequences (CDSs). Here we describe the genomic features of EhV-202, together with a draft genome sequence and its annotation, highlighting the homology and heterogeneity of this genome in comparison with the EhV-86 reference genome.

  17. The sequence of the CA-SP1 junction accounts for the differential sensitivity of HIV-1 and SIV to the small molecule maturation inhibitor 3-O-{3',3'-dimethylsuccinyl}-betulinic acid

    Aiken Christopher

    2004-06-01

    Background: Despite the effectiveness of currently available antiretroviral therapies in the treatment of HIV-1 infection, a continuing need exists for novel compounds that can be used in combination with existing drugs to slow the emergence of drug-resistant viruses. We previously reported that the small molecule 3-O-{3',3'-dimethylsuccinyl}-betulinic acid (DSB) specifically inhibits HIV-1 replication by delaying the processing of the CA-SP1 junction in Pr55Gag. By contrast, SIVmac239 replicates efficiently in the presence of high concentrations of DSB. To determine whether sequence differences in the CA-SP1 junction can fully account for the differential sensitivity of HIV-1 and SIV to DSB, we engineered mutations in this region of two viruses and tested their sensitivity to DSB in replication assays using activated human primary CD4+ T cells. Results: Substitution of the P2 and P1 residues of HIV-1 by the corresponding amino acids of SIV resulted in strong resistance to DSB, but the mutant virus replicated with reduced efficiency. Conversely, replication of an SIV mutant containing three amino acid substitutions in the CA-SP1 cleavage site was highly sensitive to DSB, and the mutations resulted in delayed cleavage of the CA-SP1 junction in the presence of the drug. Conclusions: These results demonstrate that the CA-SP1 junction in Pr55Gag represents the primary viral target of DSB. They further suggest that the therapeutic application of DSB will be accompanied by emergence of mutant viruses that are highly resistant to the drug but which exhibit reduced fitness relative to wild type HIV-1.

  18. Human endogenous retrovirus K Gag coassembles with HIV-1 Gag and reduces the release efficiency and infectivity of HIV-1.

    Monde, Kazuaki; Contreras-Galindo, Rafael; Kaplan, Mark H; Markovitz, David M; Ono, Akira

    2012-10-01

    Human endogenous retroviruses (HERVs), which are remnants of ancestral retroviruses integrated into the human genome, are defective in viral replication. Because activation of HERV-K and coexpression of this virus with HIV-1 have been observed during HIV-1 infection, it is conceivable that HERV-K could affect HIV-1 replication, either by competition or by cooperation, in cells expressing both viruses. In this study, we found that the release efficiency of HIV-1 Gag was 3-fold reduced upon overexpression of HERV-K(CON) Gag. In addition, we observed that in cells expressing Gag proteins of both viruses, HERV-K(CON) Gag colocalized with HIV-1 Gag at the plasma membrane. Furthermore, HERV-K(CON) Gag was found to coassemble with HIV-1 Gag, as demonstrated by (i) processing of HERV-K(CON) Gag by HIV-1 protease in virions, (ii) coimmunoprecipitation of virion-associated HERV-K(CON) Gag with HIV-1 Gag, and (iii) rescue of a late-domain-defective HERV-K(CON) Gag by wild-type (WT) HIV-1 Gag. Myristylation-deficient HERV-K(CON) Gag localized to nuclei, suggesting cryptic nuclear trafficking of HERV-K Gag. Notably, unlike WT HERV-K(CON) Gag, HIV-1 Gag failed to rescue myristylation-deficient HERV-K(CON) Gag to the plasma membrane. Efficient colocalization and coassembly of HIV-1 Gag and HERV-K Gag also required nucleocapsid (NC). These results provide evidence that HIV-1 Gag heteromultimerizes with HERV-K Gag at the plasma membrane, presumably through NC-RNA interaction. Intriguingly, HERV-K Gag overexpression reduced not only HIV-1 release efficiency but also HIV-1 infectivity in a myristylation- and NC-dependent manner. Altogether, these results indicate that Gag proteins of endogenous retroviruses can coassemble with HIV-1 Gag and modulate the late phase of HIV-1 replication.

  19. Genome sequencing and annotation of Serratia sp. strain TEL.

    Lephoto, Tiisetso E; Gray, Vincent M

    2015-12-01

    We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000.

  20. Genome sequencing and annotation of Serratia sp. strain TEL

    Tiisetso E. Lephoto

    2015-12-01

    We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000.

  1. Genome sequencing and annotation of Serratia sp. strain TEL

    Lephoto, Tiisetso E.; Gray, Vincent M.

    2015-01-01

    We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000.

  2. Differential trends in the codon usage patterns in HIV-1 genes.

    Aridaman Pandit

    Host-pathogen interactions underlie one of the most complex evolutionary phenomena resulting in continual adaptive genetic changes, where pathogens exploit the host's molecular resources for growth and survival, while hosts try to eliminate the pathogen. Deciphering the molecular basis of host-pathogen interactions is useful in understanding the factors governing pathogen evolution and disease propagation. In the host-pathogen context, a balance between mutation, selection, and genetic drift is known to maintain codon bias in both organisms. Studies revealing determinants of the bias and its dynamics are central to the understanding of host-pathogen evolution. We considered the Human Immunodeficiency Virus (HIV) type 1 and its human host to search for evolutionary signatures in the viral genome. Positive selection is known to dominate intra-host evolution of HIV-1, whereas high genetic variability underlies the belief that neutral processes drive inter-host differences. In this study, we analyze the codon usage patterns of HIV-1 genomes across all subtypes and clades sequenced over a period of 23 years. We show the presence of unique temporal correlations in the codon bias of three HIV-1 genes, illustrating differential adaptation of the HIV-1 genes towards the host-preferred codons. Our results point towards gene-specific translational selection as an important force driving the evolution of HIV-1 at the population level.
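
Codon usage analyses like the one described above start from codon counts per gene. The following is a minimal sketch of that first step only, assuming an in-frame coding sequence. The function names and the toy sequence are hypothetical, and the study's actual bias measure and temporal correlation analysis are not reproduced here.

```python
from collections import Counter

def codon_counts(cds):
    """Count in-frame codons of a coding sequence, skipping ambiguous ones."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon, if any
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if set(c) <= set("ACGT"))

def codon_frequencies(cds):
    """Relative frequency of each codon among all counted codons."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}

# Hypothetical toy coding sequence (not an HIV-1 gene).
toy_cds = "ATGGCTGCTGCCAAATAA"
counts = codon_counts(toy_cds)
freqs = codon_frequencies(toy_cds)
print(counts.most_common(3))
```

Comparing such frequencies with those of highly expressed host genes, per gene and per sampling year, would be one way to look for the kind of trend the abstract reports.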

  3. Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi

    Schutzer S. E.; Dunn J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

    2011-02-01

    Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain B31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

  4. Scrutinizing virus genome termini by high-throughput sequencing.

    Shasha Li

    Analysis of genomic terminal sequences has been a major step in studies on viral DNA replication and packaging mechanisms. However, traditional methods to study genome termini are challenging due to the time-consuming protocols and their inefficiency where critical details are lost easily. Recent advances in next generation sequencing (NGS) have made it a powerful tool to study genome termini. In this study, using NGS we sequenced one iridovirus genome and twenty phage genomes and confirmed for the first time that the high frequency sequences (HFSs) found in the NGS reads are indeed the terminal sequences of viral genomes. Further, we established a criterion to distinguish the type of termini and the viral packaging mode. We also obtained additional terminal details such as terminal repeats, multi-termini, and asymmetric termini. With this approach, we were able to simultaneously detect details of the genome termini as well as obtain the complete sequence of bacteriophage genomes. Theoretically, this application can be further extended to analyze larger and more complicated genomes of plant and animal viruses. This study proposed a novel and efficient method for research on viral replication, packaging, terminase activity, transcription regulation, and metabolism of the host cell.
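
The abstract above identifies genome termini from high frequency sequences (HFSs) in NGS reads but does not give its criterion. The sketch below shows one simple way to surface over-represented read starts by counting the leading k-mer of each read. The function names, the choice of k, and the toy reads are assumptions for illustration and should not be read as the authors' method.

```python
from collections import Counter

def leading_kmer_counts(reads, k=20):
    """Count how often each k-mer occurs at the very start of a read."""
    return Counter(r[:k].upper() for r in reads if len(r) >= k)

def top_candidates(reads, k=20, n=3):
    """Return the n most frequent read-start k-mers with their counts."""
    return leading_kmer_counts(reads, k).most_common(n)

# Hypothetical toy reads; three share the same start.
toy_reads = ["ACGTACGTAA", "ACGTACGTCC", "TTGACCATGA", "ACGTACGTGG", "GGATCCATTA"]
top = top_candidates(toy_reads, k=8, n=1)
print(top)
```

With randomly fragmented libraries, read starts from the interior of a genome are spread over many positions, so a start sequence that appears far more often than the rest is a plausible terminus candidate. A practical version would also need to handle reverse-complement reads and sequencing errors.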

  5. Cis elements and trans-acting factors involved in the RNA dimerization of the human immunodeficiency virus HIV-1.

    Darlix, J L; Gabus, C; Nugeyre, M T; Clavel, F; Barré-Sinoussi, F

    1990-12-05

    The retroviral genome consists of two identical RNA molecules joined at their 5' ends by the Dimer Linkage Structure (DLS). To study the mechanism of dimerization and the DLS of HIV-1 RNA, large amounts of bona fide HIV-1 RNA and of mutants have been synthesized in vitro. We report that HIV-1 RNA forms dimeric molecules and that viral nucleocapsid (NC) protein NCp15 greatly activates dimerization. Deletion mutagenesis in the RNA 5' 1333 nucleotides indicated that a small domain of 100 nucleotides, located between positions 311 to 415 from the 5' end, is necessary and sufficient to promote HIV-1 RNA dimerization. This dimerization domain encompasses an encapsidation element located between the 5' splice donor site and initiator AUG of gag and shows little sequence variations in different strains of HIV-1. Furthermore, cross-linking analysis of the interactions between NC and HIV-1 RNA (311 to 415) locates a major contact site in the encapsidation element of HIV-1 RNA. The genomic RNA dimer is tightly associated with nucleocapsid protein molecules in avian and murine retroviruses, and this ribonucleoprotein structure is believed to be the template for reverse transcription. Genomic RNA-protein interactions have been analyzed in human immunodeficiency virus (HIV) virions and results showed that NC protein molecules are tightly bound to the genomic RNA dimer. Since retroviral RNA dimerization and packaging appear to be under the control of the same cis element, the encapsidation sequences, and trans-acting factor, the NC protein, they are probably related events in the course of virion assembly.

  6. Geographic and temporal trends in the molecular epidemiology and genetic mechanisms of transmitted HIV-1 drug resistance: an individual-patient- and sequence-level meta-analysis

    Rhee, Soo-Yon; Blanco, Jose Luis; Jordan, Michael R.; Taylor, Jonathan; Lemey, Philippe; Varghese, Vici; Hamers, Raph L.; Bertagnolio, Silvia; Rinke de Wit, Tobias F.; Aghokeng, Avelin F.; Albert, Jan; Avi, Radko; Avila-Rios, Santiago; Bessong, Pascal O.; Brooks, James I.; Boucher, Charles A. B.; Brumme, Zabrina L.; Busch, Michael P.; Bussmann, Hermann; Chaix, Marie-Laure; Chin, Bum Sik; D'Aquin, Toni T.; de Gascun, Cillian F.; Derache, Anne; Descamps, Diane; Deshpande, Alaka K.; Djoko, Cyrille F.; Eshleman, Susan H.; Fleury, Herve; Frange, Pierre; Fujisaki, Seiichiro; Harrigan, P. Richard; Hattori, Junko; Holguin, Africa; Hunt, Gillian M.; Ichimura, Hiroshi; Kaleebu, Pontiano; Katzenstein, David; Kiertiburanakul, Sasisopin; Kim, Jerome H.; Kim, Sung Soon; Li, Yanpeng; Lutsar, Irja; Morris, Lynn; Ndembi, Nicaise; Ng, Kee Peng; Paranjape, Ramesh S.; Peeters, Martine; Poljak, Mario; Price, Matt A.; Ragonnet-Cronin, Manon L.; Reyes-Terán, Gustavo; Rolland, Morgane; Sirivichayakul, Sunee; Smith, Davey M.; Soares, Marcelo A.; Soriano, Vincent V.; Ssemwanga, Deogratius; Stanojevic, Maja; Stefani, Mariane A.; Sugiura, Wataru; Sungkanuparph, Somnuek; Tanuri, Amilcar; tee, Kok Keng; Truong, Hong-Ha M.; van de Vijver, David A. M. C.; Vidal, Nicole; Yang, Chunfu; Yang, Rongge; Yebra, Gonzalo; Ioannidis, John P. A.; Vandamme, Anne-Mieke; Shafer, Robert W.

    2015-01-01

    Regional and subtype-specific mutational patterns of HIV-1 transmitted drug resistance (TDR) are essential for informing first-line antiretroviral (ARV) therapy guidelines and designing diagnostic assays for use in regions where standard genotypic resistance testing is not affordable. We sought to

  7. Geographic and Temporal Trends in the Molecular Epidemiology and Genetic Mechanisms of Transmitted HIV-1 Drug Resistance: An Individual-Patient- and Sequence-Level Meta-Analysis

    S.Y. Rhee (Soo Yoon); J.L. Blanco (Jose Luis); M.R. Jordan (Michael); J. Taylor (Jonathan); P. Lemey (Philippe); V. Varghese (Vici); R.L. Hamers (Raph); S. Bertagnolio (Silvia); M. De Wit (Meike); A.F. Aghokeng (Avelin); J. Albert (Jan); R. Avi (Radko); S. Avila-Rios (Santiago); P.O. Bessong (Pascal O.); J.I. Brooks (James I.); C.A.B. Boucher (Charles); Z.L. Brumme (Zabrina L.); M.P. Busch (Michael P.); H. Bussmann (Hermann); M.L. Chaix (Marie Laure); B.S. Chin (Bum Sik); T.T. D’Aquin (Toni T.); C. de Gascun (Cillian); A. Derache (Anne); D. Descamps (Diane); A.K. Deshpande (Alaka K.); C.F. Djoko (Cyrille F.); S.H. Eshleman (Susan H.); H. Fleury (Hervé); P. Frange (Pierre); S. Fujisaki (Seiichiro); P. Harrigan (Pr); J. Hattori (Junko); A. Holguin (Africa); G.M. Hunt (Gillian M.); H. Ichimura (Hiroshi); P. Kaleebu (Pontiano); D. Katzenstein (David); S. Kiertiburanakul (Sasisopin); J.H. Kim (Jerome H.); S.S. Kim (Sung Soon); Y. Li (Yanpeng); I. Lutsar (Irja); L. Morris (L.); N. Ndembi (Nicaise); K.P. NG (Kee Peng); R.S. Paranjape (Ramesh S.); M.C. Peeters (Marian); M. Poljak (Mario); M.A. Price (Matt A.); M.L. Ragonnet-Cronin (Manon L.); G. Reyes-Terán (Gustavo); M. Rolland (Morgane); S. Sirivichayakul (Sunee); D.M. Smith (Davey M.); M.A. Soares (Marcelo A.); V. Soriano (Virtudes); D. Ssemwanga (Deogratius); M. Stanojevic (Maja); M.A. Stefani (Mariane A.); W. Sugiura (Wataru); S. Sungkanuparph (Somnuek); A. Tanuri (Amilcar); K.K. Tee (Kok Keng); H.-H.M. Truong (Hong-Ha M.); D.A.M.C. van de Vijver (David); N. Vidal (Nicole); C. Yang (Chunfu); R. Yang (Rongge); G. Yebra (Gonzalo); J.P.A. Ioannidis (John); A.M. Vandamme (Anne Mieke); R.W. Shafer (Robert)

    2015-01-01

    Regional and subtype-specific mutational patterns of HIV-1 transmitted drug resistance (TDR) are essential for informing first-line antiretroviral (ARV) therapy guidelines and designing diagnostic assays for use in regions where standard genotypic resistance testing is not affordable. We

  8. Sequence similarity between the erythrocyte binding domain 1 of the Plasmodium vivax Duffy binding protein and the V3 loop of HIV-1 strain MN reveals binding residues for the Duffy Antigen Receptor for Chemokines

    Bolton, Michael J; Garry, Robert F

    2011-01-01

    Background: The surface glycoprotein (SU, gp120) of the human immunodeficiency virus (HIV) must bind to a chemokine receptor, CCR5 or CXCR4, to invade CD4+ cells. Plasmodium vivax uses the Duffy Binding Protein (DBP) to bind the Duffy Antigen Receptor for Chemokines (DARC) and invade reticulocytes. Results: Variable loop 3 (V3) of HIV-1 SU and domain 1 of the Plasmodium vivax DBP share a sequence similarity. The site of amino acid sequence similarity was necessary, but not sufficient, ...

  9. Recurrence time statistics: versatile tools for genomic DNA sequence analysis.

    Cao, Yinhe; Tung, Wen-Wen; Gao, J B

    2004-01-01

    With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.
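
The recurrence time idea in the abstract above can be illustrated with a simplified sketch: for every word of length k, record the distance to its next exact occurrence and build a histogram of those distances. This first-return definition, the function name, and the toy sequence are assumptions for the example. The paper's exact statistic and its codon index are not reproduced here.

```python
from collections import Counter

def recurrence_times(sequence, k=3):
    """Histogram of distances between successive occurrences of each k-mer."""
    last_seen = {}     # k-mer -> position of its most recent occurrence
    times = Counter()  # recurrence distance -> number of times observed
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        if word in last_seen:
            times[i - last_seen[word]] += 1
        last_seen[word] = i
    return times

# Hypothetical toy sequence with a period-3 repeat followed by a short tail.
toy = "ATGATGATGCCC"
hist = recurrence_times(toy, k=3)
print(dict(hist))
```

A tandem repeat of period p shows up as a peak at distance p in the histogram, which is the sense in which recurrence times expose periodicity. The single pass with a dictionary keeps the cost linear in the sequence length.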

  10. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

  11. HIV-1 pol diversity among female bar and hotel workers in Northern Tanzania.

    Kiwelu, Ireen E; Novitsky, Vladimir; Kituma, Elimsaada; Margolin, Lauren; Baca, Jeannie; Manongi, Rachel; Sam, Noel; Shao, John; McLane, Mary F; Kapiga, Saidi H; Essex, M

    2014-01-01

    A national ART program was launched in Tanzania in October 2004. Due to the existence of multiple HIV-1 subtypes and recombinant viruses co-circulating in Tanzania, it is important to monitor rates of drug resistance. The present study determined the prevalence of HIV-1 drug resistance mutations among ART-naive female bar and hotel workers, a high-risk population for HIV-1 infection in Moshi, Tanzania. A partial HIV-1 pol gene was analyzed by single-genome amplification and sequencing in 45 subjects (622 pol sequences total; median number of sequences per subject, 13; IQR 5-20) in samples collected in 2005. The prevalence of HIV-1 subtypes A1, C, and D, and inter-subtype recombinant viruses, was 36%, 29%, 9% and 27%, respectively. Thirteen different recombination patterns included D/A1/D, C/A1, A1/C/A1, A1/U/A1, C/U/A1, C/A1, U/D/U, D/A1/D, A1/C, A1/C, A2/C/A2, CRF10_CD/C/CRF10_CD and CRF35_AD/A1/CRF35_AD. CRF35_AD was identified in Tanzania for the first time. All recombinant viruses in this study were unique, suggesting ongoing recombination processes among circulating HIV-1 variants. The prevalence of multiple infections in this population was 16% (n = 7). Primary HIV-1 drug resistance mutations to RT inhibitors were identified in three (7%) subjects (K65R plus Y181C; N60D; and V106M). In some subjects, polymorphisms were observed at the RT positions 41, 69, 75, 98, 101, 179, 190, and 215. Secondary mutations associated with NNRTIs were observed at the RT positions 90 (7%) and 138 (6%). In the protease gene, three subjects (7%) had M46I/L mutations. All subjects in this study had HIV-1 subtype-specific natural polymorphisms at positions 36, 69, 89 and 93 that are associated with drug resistance in HIV-1 subtype B. These results suggested that HIV-1 drug resistance mutations and natural polymorphisms existed in this population before the initiation of the national ART program. With increasing use of ARV, these results highlight the importance of drug

  12. From Sequence to Morphology - Long-Range Correlations in Complete Sequenced Genomes

    T.A. Knoch (Tobias)

    2004-01-01

    The largely unresolved sequential organization, i.e. the relations within DNA sequences, and its connection to the three-dimensional organization of genomes was investigated by correlation analyses of completely sequenced chromosomes from Viroids, Archaea, Bacteria, Arabidopsis

  13. Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.

    Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

    2013-01-01

    Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing have encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads, and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage, which are known to impact genome assembly, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 Mb), S. kudriavzevii (11.18 Mb) and C. elegans (100 Mb) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6-40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing, which will enable optimum utilization of sequencing as well as analysis resources.
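
    To make the depth figures concrete, the usual relation depth = (number of reads × read length) / genome size can be inverted to estimate how many reads a target depth implies. The sketch below uses the genome sizes quoted in the abstract; the 100 bp read length is an assumption for illustration only, since the abstract does not state one, and the function name is hypothetical.

```python
def reads_needed(genome_size_bp, depth, read_length_bp):
    """Smallest number of reads such that
    reads * read_length_bp / genome_size_bp >= depth."""
    total_bases = depth * genome_size_bp
    return -(-total_bases // read_length_bp)  # integer ceiling division


# Genome sizes from the abstract; READ_LENGTH is an assumed value.
READ_LENGTH = 100
for name, size in [("E. coli", 4_600_000),
                   ("S. kudriavzevii", 11_180_000),
                   ("C. elegans", 100_000_000)]:
    print(name, reads_needed(size, 50, READ_LENGTH))
```

    Under these assumptions, 50X depth for the 4.6-million-base E. coli genome corresponds to about 2.3 million reads; longer reads would lower the count proportionally.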

  14. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    Have, Christian Theil; Mørk, Søren

    We introduce a new type of probabilistic sequence model that models the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome-level -- effectively producing a sequential genome annotation...... as output. The model can be used to obtain the most probable genome annotation based on a combination of i: a gene finder score of each gene candidate and ii: the sequence of the reading frames of gene candidates through a genome. The model --- as well as a higher order variant --- is developed and tested...... and are evaluated by the effect on prediction performance. Since bacterial gene finding to a large extent is a solved problem it forms an ideal proving ground for evaluating the explicit modeling of larger scale gene sequence composition of genomes. We conclude that the sequential composition of gene reading frames...

  15. Investigation of genome sequences within the family Pasteurellaceae

    Angen, Øystein; Ussery, David

    Introduction The bacterial genome sequences are now available for an increasing number of strains within the family Pasteurellaceae. At present, 24 Pasteurellaceae genomes are publicly available through internet databases, and another 40 genomes are being sequenced. This investigation will describe...... the core genome for both the family Pasteurellaceae and for the species Haemophilus influenzae. Methods Twenty genome sequences from the following species were included: Haemophilus influenzae (11 strains), Haemophilus ducreyi (1 strain), Histophilus somni (2 strains), Haemophilus parasuis (1 strain......), Actinobacillus pleuropneumoniae (2 strains), Actinobacillus succinogenes (1 strain), Mannheimia succiniciproducens (1 strain), and Pasteurella multocida (1 strain). The predicted proteins for each genome were BLASTed against each other, and a set of conserved core gene families was determined as described...
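
    Once proteins have been grouped into families, a core gene set reduces to a set intersection across genomes. The sketch below assumes that the grouping (for example, from all-against-all BLAST results) has already been done; the family identifiers and the toy strain-to-family mapping are hypothetical and are not taken from the study.

```python
def core_families(genomes):
    """Gene families present in every genome.

    `genomes` maps a genome name to an iterable of family identifiers."""
    family_sets = [set(families) for families in genomes.values()]
    if not family_sets:
        return set()
    return set.intersection(*family_sets)


# Hypothetical toy input: family identifiers per genome.
toy = {
    "H. influenzae strain A": ["fam1", "fam2", "fam3"],
    "H. ducreyi": ["fam1", "fam3", "fam4"],
    "P. multocida": ["fam1", "fam3", "fam5"],
}
print(sorted(core_families(toy)))
```

    Restricting the input dictionary to strains of a single species would give a species-level core in the same way.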

  16. Sequencing and annotation of mitochondrial genomes from individual parasitic helminths.

    Jex, Aaron R; Littlewood, D Timothy; Gasser, Robin B

    2015-01-01

    Mitochondrial (mt) genomics has significant implications in a range of fundamental areas of parasitology, including evolution, systematics, and population genetics as well as explorations of mt biochemistry, physiology, and function. Mt genomes also provide a rich source of markers to aid molecular epidemiological and ecological studies of key parasites. However, there is still a paucity of information on mt genomes for many metazoan organisms, particularly parasitic helminths, which has often related to challenges linked to sequencing from tiny amounts of material. The advent of next-generation sequencing (NGS) technologies has paved the way for low cost, high-throughput mt genomic research, but there have been obstacles, particularly in relation to post-sequencing assembly and analyses of large datasets. In this chapter, we describe protocols for the efficient amplification and sequencing of mt genomes from small portions of individual helminths, and highlight the utility of NGS platforms to expedite mt genomics. In addition, we recommend approaches for manual or semi-automated bioinformatic annotation and analyses to overcome the bioinformatic "bottleneck" to research in this area. Taken together, these approaches have demonstrated applicability to a range of parasites and provide prospects for using complete mt genomic sequence datasets for large-scale molecular systematic and epidemiological studies. In addition, these methods have broader utility and might be readily adapted to a range of other medium-sized molecular regions (i.e., 10-100 kb), including large genomic operons, and other organellar (e.g., plastid) and viral genomes.

  17. Reference genome sequence of the model plant Setaria.

    Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao; Percifield, Ryan; Hawkins, Jennifer; Pontaroli, Ana C; Estep, Matt; Feng, Liang; Vaughn, Justin N; Grimwood, Jane; Jenkins, Jerry; Barry, Kerrie; Lindquist, Erika; Hellsten, Uffe; Deshpande, Shweta; Wang, Xuewen; Wu, Xiaomei; Mitros, Therese; Triplett, Jimmy; Yang, Xiaohan; Ye, Chu-Yu; Mauro-Herrera, Margarita; Wang, Lin; Li, Pinghua; Sharma, Manoj; Sharma, Rita; Ronald, Pamela C; Panaud, Olivier; Kellogg, Elizabeth A; Brutnell, Thomas P; Doust, Andrew N; Tuskan, Gerald A; Rokhsar, Daniel; Devos, Katrien M

    2012-05-13

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ∼400-Mb assembly covers ∼80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  18. Reference genome sequence of the model plant Setaria

    Bennetzen, Jeffrey L [ORNL]; Schmutz, Jeremy [Hudson Alpha Institute of Biotechnology]; Wang, Hao [University of Georgia, Athens, GA]; Percifield, Ryan [University of Georgia, Athens, GA]; Hawkins, Jennifer [University of Georgia, Athens, GA]; Pontaroli, Ana C. [University of Georgia, Athens, GA]; Estep, Matt [University of Georgia, Athens, GA]; Feng, Liang [University of Georgia, Athens, GA]; Vaughn, Justin N [ORNL]; Grimwood, Jane [Hudson Alpha Institute of Biotechnology]; Jenkins, Jerry [Hudson Alpha Institute of Biotechnology]; Barry, Kerrie [U.S. Department of Energy, Joint Genome Institute]; Lindquist, Erika [U.S. Department of Energy, Joint Genome Institute]; Hellsten, Uffe [U.S. Department of Energy, Joint Genome Institute]; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute]; Wang, Xuewen [University of Georgia, Athens, GA]; Wu, Xiaomei [University of Georgia, Athens, GA]; Mitros, Therese [University of California, Berkeley]; Triplett, Jimmy [University of Missouri, St. Louis]; Yang, Xiaohan [ORNL]; Ye, Chuyu [ORNL]; Mauro-Herrera, Margarita [Oklahoma State University]; Wang, Lin [Cornell University]; Li, Pinghua [Cornell University]; Sharma, Manoj [University of California, Davis]; Sharma, Rita [University of California, Davis]; Ronald, Pamela [University of California, Davis]; Panaud, Olivier [Universite de Perpignan, Perpignan, France]; Kellogg, Elizabeth A. [University of Missouri, St. Louis]; Brutnell, Thomas P. [Cornell University]; Doust, Andrew N. [Oklahoma State University]; Tuskan, Gerald A [ORNL]; Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute]; Devos, Katrien M [ORNL]

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  19. Reference genome sequence of the model plant Setaria

    Bennetzen, Jeffrey L [ORNL]; Yang, Xiaohan [ORNL]; Ye, Chuyu [ORNL]; Tuskan, Gerald A [ORNL]

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  20. Acyclovir and Transmission of HIV-1 from Persons Infected with HIV-1 and HSV-2

    Celum, Connie; Wald, Anna; Lingappa, Jairam R.; Magaret, Amalia S.; Wang, Richard S.; Mugo, Nelly; Mujugira, Andrew; Baeten, Jared M.; Mullins, James I.; Hughes, James P.; Bukusi, Elizabeth A.; Cohen, Craig R.; Katabira, Elly; Ronald, Allan; Kiarie, James; Farquhar, Carey; Stewart, Grace John; Makhema, Joseph; Essex, Myron; Were, Edwin; Fife, Kenneth H.; de Bruyn, Guy; Gray, Glenda E.; McIntyre, James A.; Manongi, Rachel; Kapiga, Saidi; Coetzee, David; Allen, Susan; Inambao, Mubiana; Kayitenkore, Kayitesi; Karita, Etienne; Kanweka, William; Delany, Sinead; Rees, Helen; Vwalika, Bellington; Stevens, Wendy; Campbell, Mary S.; Thomas, Katherine K.; Coombs, Robert W.; Morrow, Rhoda; Whittington, William L.H.; McElrath, M. Juliana; Barnes, Linda; Ridzon, Renee; Corey, Lawrence

    2010-01-01

    BACKGROUND Most persons who are infected with human immunodeficiency virus type 1 (HIV-1) are also infected with herpes simplex virus type 2 (HSV-2), which is frequently reactivated and is associated with increased plasma and genital levels of HIV-1. Therapy to suppress HSV-2 reduces the frequency of reactivation of HSV-2 as well as HIV-1 levels, suggesting that suppression of HSV-2 may reduce the risk of transmission of HIV-1. METHODS We conducted a randomized, placebo-controlled trial of suppressive therapy for HSV-2 (acyclovir at a dose of 400 mg orally twice daily) in couples in which only one of the partners was seropositive for HIV-1 (CD4 count, ≥250 cells per cubic millimeter) and that partner was also infected with HSV-2 and was not taking antiretroviral therapy at the time of enrollment. The primary end point was transmission of HIV-1 to the partner who was not initially infected with HIV-1; linkage of transmissions was assessed by means of genetic sequencing of viruses. RESULTS A total of 3408 couples were enrolled at 14 sites in Africa. Of the partners who were infected with HIV-1, 68% were women, and the baseline median CD4 count was 462 cells per cubic millimeter. Of 132 HIV-1 seroconversions that occurred after randomization (an incidence of 2.7 per 100 person-years), 84 were linked within couples by viral sequencing: 41 in the acyclovir group and 43 in the placebo group (hazard ratio with acyclovir, 0.92, 95% confidence interval [CI], 0.60 to 1.41; P = 0.69). Suppression with acyclovir reduced the mean plasma concentration of HIV-1 by 0.25 log10 copies per milliliter (95% CI, 0.22 to 0.29; P<0.001) and the occurrence of HSV-2–positive genital ulcers by 73% (risk ratio, 0.27; 95% CI, 0.20 to 0.36; P<0.001). A total of 92% of the partners infected with HIV-1 and 84% of the partners not infected with HIV-1 remained in the study for 24 months. The level of adherence to the dispensed study drug was 96%. No serious adverse events related to acyclovir

  1. The allosteric HIV-1 integrase inhibitor BI-D affects virion maturation but does not influence packaging of a functional RNA genome

    van Bel, Nikki; van der Velden, Yme; Bonnard, Damien; Le Rouzic, Erwann; Das, Atze T.; Benarous, Richard; Berkhout, Ben

    2014-01-01

    The viral integrase (IN) is an essential protein for HIV-1 replication. IN inserts the viral dsDNA into the host chromosome, aided by the cellular co-factor LEDGF/p75. Recently a new class of integrase inhibitors was described: allosteric IN inhibitors (ALLINIs). Although designed to

  2. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

    2017-07-01

    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.

  3. Complete genome sequence of Gordonia bronchialis type strain (3410T)

    Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Jando, Marlen [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Copeland, A [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Chen, Feng [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL)]; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Detter, J C [U.S. Department of Energy, Joint Genome Institute]; Brettin, Thomas S [ORNL]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]

    2010-01-01

    Gordonia bronchialis Tsukamura 1971 is the type species of the genus. G. bronchialis is a human-pathogenic organism that has been isolated from a large variety of human tissues. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Gordoniaceae. The 5,290,012 bp long genome with its 4,944 protein-coding and 55 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  4. Complete genome sequence of Acidimicrobium ferrooxidans type strain (ICPT)

    Clum, Alicia; Nolan, Matt; Lang, Elke; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Goker, Markus; Spring, Stefan; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

    2009-05-20

    Acidimicrobium ferrooxidans (Clark and Norris 1996) is the sole and type species of the genus, which until recently was the only genus within the actinobacterial family Acidimicrobiaceae and in the order Acidimicrobiales. Rapid oxidation of iron pyrite during autotrophic growth in the absence of an enhanced CO2 concentration is characteristic of A. ferrooxidans. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of the order Acidimicrobiales, and the 2,158,157 bp long single replicon genome with its 2038 protein coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  5. Oxford Nanopore MinION Sequencing and Genome Assembly

    Hengyun Lu

    2016-10-01

    The revolution of genome sequencing is continuing after the successful second-generation sequencing (SGS) technology. The third-generation sequencing (TGS) technology, led by Pacific Biosciences (PacBio), is progressing rapidly, moving from a technology once only capable of providing data for small genome analysis, or for performing targeted screening, to one that promises high quality de novo assembly and structural variation detection for human-sized genomes. In 2014, the MinION, the first commercial sequencer using nanopore technology, was released by Oxford Nanopore Technologies (ONT). MinION identifies DNA bases by measuring the changes in electrical conductivity generated as DNA strands pass through a biological pore. Its portability, affordability, and speed in data production make it suitable for real-time applications; the release of the long-read sequencer MinION has thus generated much excitement and interest in the genomics community. While de novo genome assemblies can be cheaply produced from SGS data, assembly continuity is often relatively poor, due to the limited ability of short reads to handle long repeats. Assembly quality can be greatly improved by using TGS long reads, since longer reads can span repetitive regions, despite having higher error rates at the base level. The potential of nanopore sequencing has been demonstrated by various studies in genome surveillance at locations where rapid and reliable sequencing is needed, but where resources are limited.

  6. Puzzling sequences: studying microbial genomes from 'Ötzi'

    Rattei, T.

    2012-01-01

    Ancient remains, and mummies in particular, are of central value for archaeological research. The Tyrolean iceman “Ötzi” was conserved in a glacier of the Ötztal Alps about 5000 years ago. Aside from morphological and phenotypical classification, the determination of DNA sequences and the subsequent genome analyses have been first applied to mitochondrial DNA and then been extended to genomic DNA. Typically also ancient microbial DNA is sequenced. These sequences allow the identification of pathogens as well as studying the evolution of microorganisms. The talk will explain the metagenomic aspects of the “Ötzi” genome project and discuss the first results. (author)

  7. Early low-titer neutralizing antibodies impede HIV-1 replication and select for virus escape.

    Katharine J Bar

    Single genome sequencing of early HIV-1 genomes provides a sensitive, dynamic assessment of virus evolution and insight into the earliest anti-viral immune responses in vivo. By using this approach, together with deep sequencing, site-directed mutagenesis, antibody adsorptions and virus-entry assays, we found evidence in three subjects of neutralizing antibody (Nab) responses as early as 2 weeks post-seroconversion, with Nab titers as low as 1:20 to 1:50 (IC50) selecting for virus escape. In each of the subjects, Nabs targeted different regions of the HIV-1 envelope (Env) in a strain-specific, conformationally sensitive manner. In subject CH40, virus escape was first mediated by mutations in the V1 region of the Env, followed by V3. HIV-1 specific monoclonal antibodies from this subject mapped to an immunodominant region at the base of V3 and exhibited neutralizing patterns indistinguishable from polyclonal antibody responses, indicating V1-V3 interactions within the Env trimer. In subject CH77, escape mutations mapped to the V2 region of Env, several of which selected for alterations of glycosylation. And in subject CH58, escape mutations mapped to the Env outer domain. In all three subjects, initial Nab recognition was followed by sequential rounds of virus escape and Nab elicitation, with Nab escape variants exhibiting variable costs to replication fitness. Although delayed in comparison with autologous CD8 T-cell responses, our findings show that Nabs appear earlier in HIV-1 infection than previously recognized, target diverse sites on HIV-1 Env, and impede virus replication at surprisingly low titers. The unexpected in vivo sensitivity of early transmitted/founder virus to Nabs raises the possibility that similarly low concentrations of vaccine-induced Nabs could impair virus acquisition in natural HIV-1 transmission, where the risk of infection is low and the number of viruses responsible for transmission and productive clinical

  8. Similar Ratios of Introns to Intergenic Sequence across Animal Genomes.

    Francis, Warren R; Wörheide, Gert

    2017-06-01

    One central goal of genome biology is to understand how the usage of the genome differs between organisms. Our knowledge of genome composition, needed for downstream inferences, is critically dependent on gene annotations, yet problems associated with gene annotation and assembly errors are usually ignored in comparative genomics. Here, we analyze the genomes of 68 species across 12 animal phyla and some single-cell eukaryotes for general trends in genome composition and transcription, taking into account problems of gene annotation. We show that, regardless of genome size, the ratio of introns to intergenic sequence is comparable across essentially all animals, with nearly all deviations dominated by increased intergenic sequence. Genomes of model organisms have ratios much closer to 1:1, suggesting that the majority of published genomes of nonmodel organisms are underannotated and consequently omit substantial numbers of genes, with likely negative impact on evolutionary interpretations. Finally, our results also indicate that most animals transcribe half or more of their genomes arguing against differences in genome usage between animal groups, and also suggesting that the transcribed portion is more dependent on genome size than previously thought. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  9. MIPS: a database for genomes and protein sequences.

    Mewes, H W; Frishman, D; Güldener, U; Mannhaupt, G; Mayer, K; Mokrejs, M; Morgenstern, B; Münsterkötter, M; Rudd, S; Weil, B

    2002-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz-Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91-93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155-158; Barker et al. (2001) Nucleic Acids Res., 29, 29-32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de).

  10. Genome Sequence of Australian Indigenous Wine Yeast Torulaspora delbrueckii COFT1 Using Nanopore Sequencing.

    Tondini, Federico; Jiranek, Vladimir; Grbin, Paul R; Onetto, Cristobal A

    2018-04-26

    Here, we report the first sequenced genome of an indigenous Australian wine isolate of Torulaspora delbrueckii using the Oxford Nanopore MinION and Illumina HiSeq sequencing platforms. The genome size is 9.4 Mb and contains 4,831 genes. Copyright © 2018 Tondini et al.

  11. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Jian Wu

    2012-11-01

    Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the chloroplast genome.
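
    The coverage percentages quoted above are a breadth-of-coverage measure: the share of reference positions touched by at least one aligned read. The following is a minimal sketch of that calculation, not the authors' pipeline. The function name, the half-open 0-based interval convention, and the toy intervals are all assumptions made for illustration.

```python
def breadth_of_coverage(ref_length, intervals):
    """Fraction of reference positions covered by at least one alignment.

    `intervals` holds hypothetical half-open, 0-based (start, end) pairs,
    one per aligned read; overlapping intervals are merged before counting.
    """
    if ref_length <= 0:
        raise ValueError("ref_length must be positive")
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        # Clip each alignment to the reference boundaries.
        start, end = max(start, 0), min(end, ref_length)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            # A gap: close the previous merged block and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / ref_length


# Toy example: two overlapping reads on a 100-bp reference cover 80 positions.
print(breadth_of_coverage(100, [(0, 50), (40, 80)]))
```

    Merging sorted intervals avoids allocating one counter per base, which matters little for a chloroplast genome of roughly 120 kb but keeps the sketch usable on larger references.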

  12. Whole genome shotgun sequencing of Indian strains of Streptococcus agalactiae

    Balaji Veeraraghavan

    2017-12-01

    Group B Streptococcus (GBS) is known as a leading cause of neonatal infections in developing countries. The present study describes the whole genome shotgun sequences of four GBS isolates. Molecular data on clonality is lacking for GBS in India. The present genome report will add important information to the scarce genome data on GBS and will support comparative genome studies of GBS isolates at the global level. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession numbers NHPL00000000 – NHPO00000000.

  13. HIV-1 Nef control of cell signalling molecules: multiple strategies

    HIV-1 has at its disposal numerous proteins encoded by its genome which provide the required arsenal to establish and maintain infection in its host for a considerable number of years. One of the most important and enigmatic of these proteins is Nef. The Nef protein of HIV-1 plays a fundamental role in the virus life cycle.

  14. Simple sequence repeats in mycobacterial genomes

    2006-12-18

    Although prokaryotic genomes derive some plasticity due to microsatellite mutations, they have in-built mechanisms to arrest undue expansions of microsatellites, and one such mechanism is constituted by the post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these ...

  15. The Release 6 reference sequence of the Drosophila melanogaster genome.

    Hoskins, Roger A; Carlson, Joseph W; Wan, Kenneth H; Park, Soo; Mendez, Ivonne; Galle, Samuel E; Booth, Benjamin W; Pfeiffer, Barret D; George, Reed A; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V; Andreyeva, Evgeniya N; Boldyreva, Lidiya V; Marra, Marco; Carvalho, A Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F; Rubin, Gerald M; Karpen, Gary H; Celniker, Susan E

    2015-03-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. © 2015 Hoskins et al.; Published by Cold Spring Harbor Laboratory Press.

  16. Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

    Staats, Martijn; Erkens, Roy H J; van de Vossenberg, Bart; Wieringa, Jan J; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E; Bakker, Freek T

    2013-01-01

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal

  17. Getting complete genomes from complex samples using nanopore sequencing

    Kirkegaard, Rasmus Hansen; Karst, Søren Michael; Albertsen, Mads

    Background: Short read DNA sequencing and metagenomic binning workflows have made it possible to extract bacterial genome bins from environmental microbial samples containing hundreds to thousands of different species. However, these genome bins often do not represent complete genomes, as they are mostly fragmented, incomplete and often contaminated with foreign DNA. These "draft genomes" have limited lasting value to the scientific community, as gene synteny is broken and there is some uncertainty about what is missing [1]. The genetic material most often missed is important multi-copy and/or conserved marker genes such as the 16S rRNA gene, as sequence micro-heterogeneity prevents assembly of these genes in the de novo assembly. However, long read sequencing technologies are emerging, promising an end to fragmented genome assemblies [2]. Experimental design: We extracted DNA from a full...

  18. Using nanopore sequencing to get complete genomes from complex samples

    Kirkegaard, Rasmus Hansen; Karst, Søren Michael; Nielsen, Per Halkjær

    The advantages of “next generation sequencing” have come at the cost of genome finishing. The dominant sequencing technology provides short reads of 150-300 bp, which has made genome assembly very difficult as the reads do not span important repeat regions. Genomes have thus been added to the databases as fragmented assemblies and not as finished contigs that resemble the chromosomes in which the DNA is organised within the cells. This is especially troublesome for genomes derived from complex metagenome sequencing. Databases with incomplete genomes can lead to false conclusions about the absence of genes and functional predictions of the organisms. Furthermore, it is common that repetitive elements and marker genes such as the 16S rRNA gene are missing completely from these genome bins. Using nanopore long reads, we demonstrate that it is possible to span these regions and make complete...

  19. Perspectives of Integrative Cancer Genomics in Next Generation Sequencing Era

    So Mee Kwon

    2012-06-01

    The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of the genetic aberrations could reveal the candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish the huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of the structured integration, sharing, and interpretation of the big omics data still remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on the study design and analysis strategies. This might be helpful to understand the current trends and strategies of the rapidly evolving cancer genomics research.

  20. Draft Genome Sequence of Type Strain Streptococcus gordonii ATCC 10558

    Rasmussen, Louise Hesselbjerg; Dargis, Rimtas; Christensen, Jens Jørgen Elmer

    2016-01-01

    Streptococcus gordonii ATCC 10558T was isolated from a patient with infective endocarditis in 1946 and announced as a type strain in 1989. Here, we report the 2,154,510-bp draft genome sequence of S. gordonii ATCC 10558T. This sequence will contribute to knowledge about the pathogenesis of infect...

  1. RESEARCH NOTE Genome-based exome-sequencing analysis ...

    Navya

    2017-02-22

    Genome-based exome-sequencing analysis identifies GYG1, DIS3L, DDRGK1 genes ... Cardiology Division, Department of Internal Medicine, Severance .... with p values of <0.05 by analyzing differences in allele distribution.

  2. Complete Genome Sequence of Mycobacterium phlei Type Strain RIVM601174

    Abdallah, A. M.; Rashid, M.; Adroub, S. A.; Arnoux, M.; Ali, Shahjahan; van Soolingen, D.; Bitter, W.; Pain, Arnab

    2012-01-01

    Mycobacterium phlei is a rapidly growing nontuberculous Mycobacterium species that is typically nonpathogenic, with few reported cases of human disease. Here we report the whole genome sequence of M. phlei type strain RIVM601174.

  3. Complete genome sequences of six strains of the genus Methylobacterium

    Marx, Christopher J [Harvard University; Bringel, Francoise O. [University of Strasbourg; Christoserdova, Ludmila [University of Washington, Seattle; Moulin, Lionel [UMR, France; Farhan Ul Haque, Muhammad [CNRS, Strasbourg, France; Fleischman, Darrell E. [Wright State University, Dayton, OH; Gruffaz, Christelle [CNRS, Strasbourg, France; Jourand, Philippe [UMR, France; Knief, Claudia [ETH Zurich, Switzerland; Lee, Ming-Chun [Harvard University; Muller, Emilie E. L. [CNRS, Strasbourg, France; Nadalig, Thierry [CNRS, Strasbourg, France; Peyraud, Remi [ETH Zurich, Switzerland; Roselli, Sandro [CNRS, Strasbourg, France; Russ, Lina [ETH Zurich, Switzerland; Aguero, Fernan [Universidad Nacional de General San Martin; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lajus, Aurelie [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Land, Miriam L [ORNL; Medigue, Claudine [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Stolyar, Sergey [University of Washington; Vorholt, Julia A. [ETH Zurich, Switzerland; Vuilleumier, Stephane [University of Strasbourg

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  4. Complete Genome Sequences of Six Strains of the Genus Methylobacterium

    Marx, Christopher J [Harvard University; Bringel, Francoise O. [University of Strasbourg; Christoserdova, Ludmila [University of Washington, Seattle; Moulin, Lionel [UMR, France; UI Hague, Muhammad Farhan [University of Strasbourg; Fleischman, Darrell E. [Wright State University, Dayton, OH; Gruffaz, Christelle [CNRS, Strasbourg, France; Jourand, Philippe [UMR, France; Knief, Claudia [ETH Zurich, Switzerland; Lee, Ming-Chun [Harvard University; Muller, Emilie E. L. [CNRS, Strasbourg, France; Nadalig, Thierry [CNRS, Strasbourg, France; Peyraud, Remi [ETH Zurich, Switzerland; Roselli, Sandro [CNRS, Strasbourg, France; Russ, Lina [ETH Zurich, Switzerland; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Ivanov, Pavel S. [University of Wyoming, Laramie; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lajus, Aurelie [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Land, Miriam L [ORNL; Medigue, Claudine [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Stolyar, Sergey [University of Washington; Vorholt, Julia A. [ETH Zurich, Switzerland; Vuilleumier, Stephane [University of Strasbourg

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  5. Complete Genome Sequence of Mycobacterium phlei Type Strain RIVM601174

    Abdallah, A. M.

    2012-05-24

    Mycobacterium phlei is a rapidly growing nontuberculous Mycobacterium species that is typically nonpathogenic, with few reported cases of human disease. Here we report the whole genome sequence of M. phlei type strain RIVM601174.

  6. Determining and comparing protein function in Bacterial genome sequences

    Vesth, Tammi Camilla

    of this class have very little homology to other known genomes making functional annotation based on sequence similarity very difficult. Inspired in part by this analysis, an approach for comparative functional annotation was created based public sequenced genomes, CMGfunc. Functionally related groups......In November 2013, there was around 21.000 different prokaryotic genomes sequenced and publicly available, and the number is growing daily with another 20.000 or more genomes expected to be sequenced and deposited by the end of 2014. An important part of the analysis of this data is the functional...... annotation of genes – the descriptions assigned to genes that describe the likely function of the encoded proteins. This process is limited by several factors, including the definition of a function which can be more or less specific as well as how many genes can actually be assigned a function based...

  7. Bos taurus strain:dairy beef (cattle): 1000 Bull Genomes Run 2, Bovine Whole Genome Sequence

    Bouwman, A.C.; Daetwyler, H.D.; Chamberlain, Amanda J.; Ponce, Carla Hurtado; Sargolzaei, Mehdi; Schenkel, Flavio S.; Sahana, Goutam; Govignon-Gion, Armelle; Boitard, Simon; Dolezal, Marlies; Pausch, Hubert; Brøndum, Rasmus F.; Bowman, Phil J.; Thomsen, Bo; Guldbrandtsen, Bernt; Lund, Mogens S.; Servin, Bertrand; Garrick, Dorian J.; Reecy, James M.; Vilkki, Johanna; Bagnato, Alessandro; Wang, Min; Hoff, Jesse L.; Schnabel, Robert D.; Taylor, Jeremy F.; Vinkhuyzen, Anna A.E.; Panitz, Frank; Bendixen, Christian; Holm, Lars-Erik; Gredler, Birgit; Hozé, Chris; Boussaha, Mekki; Sanchez, Marie Pierre; Rocha, Dominique; Capitan, Aurelien; Tribout, Thierry; Barbat, Anne; Croiseau, Pascal; Drögemüller, Cord; Jagannathan, Vidhya; Vander Jagt, Christy; Crowley, John J.; Bieber, Anna; Purfield, Deirdre C.; Berry, Donagh P.; Emmerling, Reiner; Götz, Kay Uwe; Frischknecht, Mirjam; Russ, Ingolf; Sölkner, Johann; Tassell, van Curtis P.; Fries, Ruedi; Stothard, Paul; Veerkamp, R.F.; Boichard, Didier; Goddard, Mike E.; Hayes, Ben J.

    2014-01-01

    Whole genome sequence data (BAM format) of 234 bovine individuals aligned to UMD3.1. The aim of the study was to identify genetic variants (SNPs and indels) for downstream analysis such as imputation, GWAS, and detection of lethal recessives. Additional sequences for later 1000 bull genomes runs can

  8. Comparative genomics beyond sequence-based alignments

    Þórarinsson, Elfar; Yao, Zizhen; Wiklund, Eric D.

    2008-01-01

    Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure--frequent compensating base changes--is increasingly likely to cause sequence-based alignment me...

  9. Intra-species sequence comparisons for annotating genomes

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom, and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study have been submitted to GenBank under accession nos. AY667278-AY667407.

  10. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies etc. etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  11. Complete Genome Sequence of Bifidobacterium bifidum S17▿

    Zhurina, Daria; Zomer, Aldert; Gleinser, Marita; Brancaccio, Vincenco Francesco; Auchter, Marc; Waidmann, Mark S.; Westermann, Christina; van Sinderen, Douwe; Riedel, Christian U.

    2011-01-01

    Here, we report on the first completely annotated genome sequence of a Bifidobacterium bifidum strain. B. bifidum S17, isolated from feces of a breast-fed infant, was shown to strongly adhere to intestinal epithelial cells and has potent anti-inflammatory activity in vitro and in vivo. The genome sequence will provide new insights into the biology of this potential probiotic organism and allow for the characterization of the molecular mechanisms underlying its beneficial properties. PMID:21037011

  12. Genome Sequence of the Biocontrol Strain Pseudomonas fluorescens F113

    Redondo-Nieto, Miguel; Barret, Matthieu; Morrisey, John P.; Germaine, Kieran; Martínez-Granero, Francisco; Barahona, Emma; Navazo, Ana; Sánchez-Contreras, María; Moynihan, Jennifer A.; Giddens, Stephen R.; Coppoolse, Eric R.; Muriel, Candela; Stiekema, Willem J.; Rainey, Paul B.; Dowling, David; O'Gara, Fergal; Martín, Marta

    2012-01-01

    Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms. PMID:22328765

  13. Draft genome sequence of Thermincola potens strain JR

    Byrne-Bailey, K.G.; Wrighton, K.C.; Melnyk, R.A.; Agbo, P.; Hazen, T.C.; Coates, J.D.

    2010-07-01

    'Thermincola potens' strain JR is one of the first Gram-positive dissimilatory metal-reducing bacteria (DMRB) for which there is a complete genome sequence. Consistent with the physiology of this organism, preliminary annotation revealed an abundance of multiheme c-type cytochromes that are putatively associated with the periplasm and cell surface in a Gram-positive bacterium. Here we report the complete genome sequence of strain JR.

  14. Draft genome sequence of Penicillium marneffei strain PM1.

    Woo, Patrick C Y; Lau, Susanna K P; Liu, Bin; Cai, James J; Chong, Ken T K; Tse, Herman; Kao, Richard Y T; Chan, Che-Man; Chow, Wang-Ngai; Yuen, Kwok-Yung

    2011-12-01

    Penicillium marneffei is the most important thermal dimorphic, pathogenic fungus endemic in China and Southeast Asia and is particularly important in HIV-positive patients. We report the 28,887,485-bp draft genome sequence of P. marneffei, which contains its complete mitochondrial genome, sexual cycle genes, a high diversity of Mp1p homologues, and polyketide synthase genes.

  15. Complete Genome Sequence of Pediococcus pentosaceus Strain SL4

    Dantoft, Shruti Harnal; Bielak, Eliza Maria; Seo, Jae-Gu

    2013-01-01

    Pediococcus pentosaceus SL4 was isolated from a Korean fermented vegetable product, kimchi. We report here the whole-genome sequence (WGS) of P. pentosaceus SL4. The genome consists of a 1.79-Mb circular chromosome (G+C content of 37.3%) and seven distinct plasmids ranging in size from 4 kb to 50...

  16. Whole-Genome Sequences of Three Symbiotic Endozoicomonas Bacteria

    Neave, Matthew J.

    2014-08-14

    Members of the genus Endozoicomonas associate with a wide range of marine organisms. Here, we report on the whole-genome sequencing, assembly, and annotation of three Endozoicomonas type strains. These data will assist in exploring interactions between Endozoicomonas organisms and their hosts, and it will aid in the assembly of genomes from uncultivated Endozoicomonas spp.

  17. The complete chloroplast genome sequence of Abies nephrolepis (Pinaceae: Abietoideae)

    Dong-Keun Yi

    2016-06-01

    The plant chloroplast (cp) genome has maintained a relatively conserved structure and gene content throughout evolution. Cp genome sequences have been used widely for resolving evolutionary and phylogenetic issues at various taxonomic levels of plants. Here, we report the complete cp genome of Abies nephrolepis. The A. nephrolepis cp genome is 121,336 base pairs (bp) in length, including a pair of short inverted repeat regions (IRa and IRb) of 139 bp each, separated by a small single copy (SSC) region of 54,323 bp and a large single copy (LSC) region of 66,735 bp. It contains 114 genes, 68 of which are protein coding genes, 35 tRNA and four rRNA genes, six open reading frames, and one pseudogene. Seventeen repeat units and 64 simple sequence repeats (SSRs) have been detected in the A. nephrolepis cp genome. Large IR sequences locate in 42-kb inversion points (1186 bp). The A. nephrolepis cp genome is identical to that of Abies koreana, which is a closely related taxon. Pairwise comparison between two cp genomes revealed 140 polymorphic sites in each. The complete cp genome sequence of A. nephrolepis has significant potential to provide information on the evolutionary pattern of Abietoideae and valuable data for the development of DNA markers for easy identification and classification.
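
    The region sizes reported in this abstract can be checked against the stated total: the large single copy region, the small single copy region and two copies of the inverted repeat should sum to the whole genome. A small arithmetic sketch follows; the function name is hypothetical, and the only inputs are the figures quoted in the abstract.

```python
def quadripartite_length(lsc_bp, ssc_bp, ir_bp):
    """Total length of a chloroplast genome built from one LSC region,
    one SSC region and two inverted-repeat copies (IRa and IRb)."""
    return lsc_bp + ssc_bp + 2 * ir_bp


# Figures as quoted in the abstract for Abies nephrolepis.
total = quadripartite_length(lsc_bp=66_735, ssc_bp=54_323, ir_bp=139)
print(total)  # 121336, matching the reported genome length

# The gene categories listed in the abstract also add up to the stated 114.
print(68 + 35 + 4 + 6 + 1)  # 114
```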

  18. Whole-genome sequence-based analysis of thyroid function

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome seque...

  19. The sequence of the Helicoverpa armigera single nucleocapsid nucleopolyhedrovirus genome

    Chen, X.; IJkel, W.F.J.; Tarchini, R.; Sun, X.; Sandbrink, H.; Wang, H.; Peters, S.; Zuidema, D.; Klein Lankhorst, R.; Vlak, J.M.; Hu, Z.

    2001-01-01

    The nucleotide sequence of the Helicoverpa armigera single-nucleocapsid nucleopolyhedrovirus (HaSNPV) DNA genome was determined and analysed. The circular genome encompasses 131 403 bp, has a G+C content of 39.1 mol% and contains five homologous regions with a unique pattern of repeats.

  20. Draft Genome Sequence of Escherichia coli K-12 (ATCC 10798)

    Dimitrova, Daniela; Engelbrecht, Kathleen C.; Putonti, Catherine; Koenig, David W.; Wolfe, Alan J.

    2017-01-01

    Here, we present the draft genome sequence of Escherichia coli ATCC 10798. E. coli ATCC 10798 is a K-12 strain, one of the most well-studied model microorganisms. The size of the genome was 4,685,496 bp, with a G+C content of 50.70%. This assembly consists of 62 contigs and the F plasmid.

  1. Genome sequences of Listeria monocytogenes strains with resistance to arsenic

    Listeria monocytogenes frequently exhibits resistance to arsenic. We report here the draft genome sequences of eight genetically diverse arsenic-resistant L. monocytogenes strains from human listeriosis and food-associated environments. Availability of these genomes would help to elucidate the role ...

  2. A bibliometric analysis of global research on genome sequencing ...

    The results show that disease and protein related researches were the leading research focuses, and comparative genomics and evolution related research had strong potential in the near future. Key words: Genome sequencing, research trend, scientometrics, science citation index expanded (SCI-Expanded), word cluster ...

  3. Whole-Genome Sequences of Three Symbiotic Endozoicomonas Bacteria

    Neave, Matthew J.; Michell, Craig; Apprill, Amy; Voolstra, Christian R.

    2014-01-01

    Members of the genus Endozoicomonas associate with a wide range of marine organisms. Here, we report on the whole-genome sequencing, assembly, and annotation of three Endozoicomonas type strains. These data will assist in exploring interactions between Endozoicomonas organisms and their hosts, and it will aid in the assembly of genomes from uncultivated Endozoicomonas spp.

  4. Genome sequence of Chinese porcine parvovirus strain PPV2010.

    Cui, Jin; Wang, Xin; Ren, Yudong; Cui, Shangjin; Li, Guangxing; Ren, Xiaofeng

    2012-02-01

    Porcine parvovirus (PPV) isolate PPV2010 has recently emerged in China. Herein, we analyze the complete genome sequence of PPV2010. Our results indicate that the genome of PPV2010 bears mixed characteristics of virulent PPV and vaccine strains. Importantly, PPV2010 has the potential to be a naturally attenuated candidate vaccine strain.

  5. Genome Sequence of Chinese Porcine Parvovirus Strain PPV2010

    Cui, Jin; Wang, Xin; Ren, Yudong; Cui, Shangjin; Li, Guangxing; Ren, Xiaofeng

    2012-01-01

    Porcine parvovirus (PPV) isolate PPV2010 has recently emerged in China. Herein, we analyze the complete genome sequence of PPV2010. Our results indicate that the genome of PPV2010 bears mixed characteristics of virulent PPV and vaccine strains. Importantly, PPV2010 has the potential to be a naturally attenuated candidate vaccine strain.

  6. Draft genome sequence of the silver pomfret fish, Pampus argenteus.

    AlMomin, Sabah; Kumar, Vinod; Al-Amad, Sami; Al-Hussaini, Mohsen; Dashti, Talal; Al-Enezi, Khaznah; Akbar, Abrar

    2016-01-01

    Silver pomfret, Pampus argenteus, is a fish species from coastal waters. Despite its high commercial value, this edible fish has not been sequenced. Hence, its genetic and genomic studies have been limited. We report the first draft genome sequence of the silver pomfret obtained using a Next Generation Sequencing (NGS) technology. We assembled 38.7 Gb of nucleotides into scaffolds of 350 Mb with N50 of about 1.5 kb, using high quality paired end reads. These scaffolds represent 63.7% of the estimated silver pomfret genome length. The newly sequenced and assembled genome has 11.06% repetitive DNA regions, and this percentage is comparable to that of the tilapia genome. The genome analysis predicted 16 322 genes. About 91% of these genes showed homology with known proteins. Many gene clusters were annotated to protein and fatty-acid metabolism pathways that may be important in the context of the meat texture and immune system developmental processes. The reference genome can pave the way for the identification of many other genomic features that could improve breeding and population-management strategies, and it can also help characterize the genetic diversity of P. argenteus.
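    The abstract above summarizes assembly contiguity with an N50 value. As a reading aid, the following minimal sketch shows how N50 is conventionally computed from a list of scaffold lengths. The function name `n50` and the toy lengths are hypothetical illustrations; they are not taken from the silver pomfret assembly or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy scaffold lengths (not data from the study).
toy_scaffolds = [100, 200, 300, 400, 500]
print(n50(toy_scaffolds))  # 400: 500 + 400 = 900 covers half of 1500
```

    A low N50 relative to the total assembly size, as reported above, indicates that the assembly is split into many short scaffolds.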

  7. Finished Genome Sequence of Collimonas arenae Cal35

    Wu, Je-Jia; de Jager, Victor; Deng, Wen-ling; Leveau, Johan

    2015-01-01

    We announce the finished genome sequence of soil forest isolate Collimonas arenae Cal35, which comprises a 5.6-Mbp chromosome and 41-kb plasmid. The Cal35 genome is the second one published for the bacterial genus Collimonas and represents the first opportunity for high-resolution comparison of

  8. Complete genome sequence of pronghorn virus, a pestivirus

    The complete genome sequence of Pronghorn virus, a member of the Pestivirus genus of the Flaviviridae, was determined. The virus, originally isolated from a pronghorn antelope, had a genome of 12,287 nucleotides with a single open reading frame of 11,694 bases encoding 3898 amino acids....
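    The figures in this abstract can be checked with simple arithmetic. The sketch below is an illustrative consistency check only; the variable and function names are assumptions, and the abstract does not say whether the stop codon is counted in the open reading frame length.

```python
# Figures quoted in the abstract above.
genome_nt = 12_287  # total genome length in nucleotides
orf_nt = 11_694     # open reading frame length in bases


def codon_count(orf_length_nt):
    """Number of codons in an ORF whose length is a multiple of three."""
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_length_nt // 3


codons = codon_count(orf_nt)
non_orf_nt = genome_nt - orf_nt       # bases outside the ORF
coding_fraction = orf_nt / genome_nt  # share of the genome inside the ORF

print(codons)      # 3898, matching the amino acid count in the abstract
print(non_orf_nt)  # 593
print(f"{coding_fraction:.1%}")
```

    The 593 bases outside the open reading frame would correspond to the untranslated regions flanking it, although the abstract does not give their individual lengths.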

  9. Genome sequence of the olive tree, Olea europaea.

    Cruz, Fernando; Julca, Irene; Gómez-Garrido, Jèssica; Loska, Damian; Marcet-Houben, Marina; Cano, Emilio; Galán, Beatriz; Frias, Leonor; Ribeca, Paolo; Derdak, Sophia; Gut, Marta; Sánchez-Fernández, Manuel; García, Jose Luis; Gut, Ivo G; Vargas, Pablo; Alioto, Tyler S; Gabaldón, Toni

    2016-06-27

    The Mediterranean olive tree (Olea europaea subsp. europaea) was one of the first trees to be domesticated and is currently of major agricultural importance in the Mediterranean region as the source of olive oil. The molecular bases underlying the phenotypic differences among domesticated cultivars, or between domesticated olive trees and their wild relatives, remain poorly understood. Both wild and cultivated olive trees have 46 chromosomes (2n). A total of 543 Gb of raw DNA sequence from whole genome shotgun sequencing, and a fosmid library containing 155,000 clones from a 1,000+ year-old olive tree (cv. Farga) were generated by Illumina sequencing using different combinations of mate-pair and pair-end libraries. Assembly gave a final genome with a scaffold N50 of 443 kb, and a total length of 1.31 Gb, which represents 95 % of the estimated genome length (1.38 Gb). In addition, the associated fungus Aureobasidium pullulans was partially sequenced. Genome annotation, assisted by RNA sequencing from leaf, root, and fruit tissues at various stages, resulted in 56,349 unique protein coding genes, suggesting recent genomic expansion. Genome completeness, as estimated using the CEGMA pipeline, reached 98.79 %. The assembled draft genome of O. europaea will provide a valuable resource for the study of the evolution and domestication processes of this important tree, and allow determination of the genetic bases of key phenotypic traits. Moreover, it will enhance breeding programs and the formation of new varieties.
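    The assembly completeness figure quoted above follows directly from the reported sizes. The short sketch below reproduces that percentage and also derives a raw sequencing depth; the depth is an inference from the quoted totals and is not stated in the abstract, and all variable names are illustrative.

```python
# Figures quoted in the abstract above, in gigabases (Gb).
raw_sequence_gb = 543      # raw DNA sequence generated
assembled_gb = 1.31        # total assembly length
estimated_genome_gb = 1.38 # estimated genome length

assembled_fraction = assembled_gb / estimated_genome_gb

# Derived value (not reported in the abstract): raw depth before any
# read filtering, so it is an upper bound on usable coverage.
raw_depth = raw_sequence_gb / estimated_genome_gb

print(f"assembled fraction: {assembled_fraction:.1%}")
print(f"raw depth: about {raw_depth:.0f}x")
```

    The first value rounds to the 95 % stated in the abstract.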

  10. Complete sequence of the mitochondrial genome of ...

    products were purified using the DNA Gel Extraction Kit (Tiangen, Shanghai, China). The purified products obtained ..... Base composition of O. rubicundus mitochondrial genome. .... the help of fish sampled and identified by morphology.

  11. The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

    Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

    2016-10-11

    Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced, a stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.

  12. First fungal genome sequence from Africa: A preliminary analysis

    Rene Sutherland

    2012-01-01

    Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the ‘tsunami’ of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is less than that of the related species, Fusarium oxysporum, but more than that for Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumosin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists

  13. What can we learn about lyssavirus genomes using 454 sequencing?

    Höper, Dirk; Finke, Stefan; Freuling, Conrad M; Hoffmann, Bernd; Beer, Martin

    2012-01-01

    The main task of individual project number four, "Whole genome sequencing, virus-host adaptation, and molecular epidemiological analyses of lyssaviruses", within the network "Lyssaviruses--a potential re-emerging public health threat" is to provide high quality complete genome sequences from lyssaviruses. These sequences are analysed in depth with regard to the diversity of the viral populations, covering both quasi-species and so-called defective interfering RNAs. Moreover, the sequence data will facilitate further epidemiological analyses, will provide insight into the evolution of lyssaviruses and will be the basis for the design of novel nucleic acid based diagnostics. The first results presented here indicate that not only can high quality full-length lyssavirus genome sequences be generated, but efficient analysis of the viral population also becomes feasible.

  14. Identifying driver mutations in sequenced cancer genomes

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, nois...... patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer....

  15. Ancient Human Genome Sequence of an Extinct Palaeo-Eskimo

    Rasmussen, Morten; Li, Yingrui; Lindgreen, Stinus

    2010-01-01

    We report here the genome sequence of an ancient human. Obtained from approximately 4,000-year-old permafrost-preserved hair, the genome represents a male individual from the first known culture to settle in Greenland. Sequenced to an average depth of 20x, we recover 79% of the diploid genome...... possible phenotypic characteristics of the individual that belonged to a culture whose location has yielded only trace human remains. We compare the high-confidence SNPs to those of contemporary populations to find the populations most closely related to the individual. This provides evidence...

  16. Biased distribution of DNA uptake sequences towards genome maintenance genes

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...... in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions....

  17. Complete genome sequence of Serratia plymuthica strain AS12

    Neupane, Saraswoti [Uppsala University, Uppsala, Sweden; Finlay, Roger D. [Uppsala University, Uppsala, Sweden; Alstrom, Sadhna [Uppsala University, Uppsala, Sweden; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Peters, Lin [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Hogberg, Nils [Uppsala University, Uppsala, Sweden

    2012-01-01

    A plant associated member of the family Enterobacteriaceae, Serratia plymuthica strain AS12 was isolated from rapeseed roots. It is of scientific interest due to its plant growth promoting and plant pathogen inhibiting ability. The genome of S. plymuthica AS12 comprises a 5,443,009 bp long circular chromosome, which consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced within the 2010 DOE-JGI Community Sequencing Program (CSP2010) as part of the project entitled 'Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens'.

  18. Comparison of two Next Generation sequencing platforms for full genome sequencing of Classical Swine Fever Virus

    Fahnøe, Ulrik; Pedersen, Anders Gorm; Höper, Dirk

    2013-01-01

    to the consensus sequence. Additionally, we got an average sequence depth for the genome of 4000 for the Iontorrent PGM and 400 for the FLX platform making the mapping suitable for single nucleotide variant (SNV) detection. The analysis revealed a single non-silent SNV A10665G leading to the amino acid change D......Next Generation Sequencing (NGS) is becoming more adopted into viral research and will be the preferred technology in the years to come. We have recently sequenced several strains of Classical Swine Fever Virus (CSFV) by NGS on both Genome Sequencer FLX (GS FLX) and Iontorrent PGM platforms...

  19. Specialized microbial databases for inductive exploration of microbial genome sequences

    Cabau Cédric

    2005-02-01

    Background: The enormous amount of genome sequence data calls for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods: The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subqueries, have been implemented. Results: Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore (http://bioinfo.hku.hk/genochore.html), a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya), has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition, they provide a weekly update of searches against the worldwide protein sequence data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion: This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tengcongensis; LeptoList, with two different genomes of Leptospira interrogans; and SepiList, Staphylococcus epidermidis), associated with related organisms for comparison.

  20. The genome sequence of the model ascomycete fungus Podospora anserina

    Espagne, Eric; Lespinet, Olivier; Malagnac, Fabienne; Da Silva, Corinne; Jaillon, Olivier; Porcel, Betina M; Couloux, Arnaud; Aury, Jean-Marc; Ségurens, Béatrice; Poulain, Julie; Anthouard, Véronique; Grossetete, Sandrine; Khalili, Hamid; Coppin, Evelyne; Déquard-Chablat, Michelle; Picard, Marguerite; Contamine, Véronique; Arnaise, Sylvie; Bourdais, Anne; Berteaux-Lecellier, Véronique; Gautheret, Daniel; de Vries, Ronald P; Battaglia, Evy; Coutinho, Pedro M; Danchin, Etienne Gj; Henrissat, Bernard; Khoury, Riyad El; Sainsard-Chanet, Annie; Boivin, Antoine; Pinan-Lucarré, Bérangère; Sellem, Carole H; Debuchy, Robert; Wincker, Patrick; Weissenbach, Jean; Silar, Philippe

    2008-01-01

    BACKGROUND: The dung-inhabiting ascomycete fungus Podospora anserina is a model used to study various aspects of eukaryotic and fungal biology, such as ageing, prions and sexual development. RESULTS: We present a 10X draft sequence of P. anserina genome, linked to the sequences of a large expressed

  1. Sequencing and analysis of an Irish human genome.

    Tong, Pin

    2010-01-01

    Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.

  2. Complete genome sequences of six measles virus strains

    Phan, M.V.T. (My V.T.); C.M.E. Schapendonk (Claudia); B.B. Oude Munnink (Bas B.); M.P.G. Koopmans D.V.M. (Marion); R.L. de Swart (Rik); Cotten, M. (Matthew)

    2018-01-01

    Genetic characterization of wild-type measles virus (MV) strains is a critical component of measles surveillance and molecular epidemiology. We have obtained complete genome sequences of six MV strains belonging to different genotypes, using random-primed next generation sequencing.

  3. Genome sequence of Stachybotrys chartarum Strain 51-11

    Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina Hiseq 2000 and PacBio long read technology. Since Stachybotrys chartarum has been implicated in health impacts within water-damaged buildings, any information extracted from the geno...

  4. Complete genome sequence of a novel pestivirus from sheep.

    Becher, Paul; Schmeiser, Stefanie; Oguzoglu, Tuba Cigdem; Postel, Alexander

    2012-10-01

    We report here the complete genome sequence of pestivirus strain Aydin/04-TR, which is the prototype of a group of similar viruses currently present in sheep and goats in Turkey. Sequence data from this virus showed that it clusters separately from the established and previously proposed tentative pestivirus species.

  5. Complete Genome Sequence of a Novel Pestivirus from Sheep

    Becher, Paul; Schmeiser, Stefanie; Oguzoglu, Tuba Cigdem; Postel, Alexander

    2012-01-01

    We report here the complete genome sequence of pestivirus strain Aydin/04-TR, which is the prototype of a group of similar viruses currently present in sheep and goats in Turkey. Sequence data from this virus showed that it clusters separately from the established and previously proposed tentative pestivirus species.

  6. Pig genome sequence - analysis and publication strategy

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol

    2010-01-01

    preferentially selected for sequencing. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement the data have been released into public sequence repositories (Genbank/EMBL, NCBI/Ensembl trace repositories) in a timely manner and in advance of publication. CONCLUSIONS...

  7. HIV-1 genetic diversity and its distribution characteristics among newly diagnosed HIV-1 individuals in Hebei province, China.

    Lu, Xinli; Zhao, Cuiying; Wang, Wei; Nie, Chenxi; Zhang, Yuqi; Zhao, Hongru; Chen, Suliang; Cui, Ze

    2016-01-01

    Since the first HIV-1 case in 1989, Hebei province has presented a clearly rising trend of HIV-1 prevalence, and HIV-1 genetic diversity has become the vital barrier to HIV prevention and control in this area. To obtain detailed information of HIV-1 spread in different populations and in different areas of Hebei, a cross-sectional HIV-1 molecular epidemiological investigation was performed across the province. Blood samples of 154 newly diagnosed HIV-1 individuals were collected from ten prefectures in Hebei using stratified sampling. Partial gag and env genes were amplified and sequenced. HIV-1 genotypes were identified by phylogenetic tree analyses. Among the 139 subjects genotyped, six HIV-1 subtypes were identified successfully, including subtype B (41.0 %), CRF01_AE (40.3 %), CRF07_BC (11.5 %), CRF08_BC (4.3 %), unique recombinant forms (URFs) (1.4 %) and subtype C (1.4 %). Subtype B was identified as the most frequent subtype. Two URF recombination patterns were the same as CRF01_AE/B. HIV-1 genotype distribution showed a significant statistical difference in different demographic characteristics, such as source (P  0.05). The differences in HIV-1 genotype distribution were closely associated with transmission routes. Particularly, all six subtype strains were found in heterosexuals, showing that HIV-1 has spread from the high-risk populations to the general populations in Hebei, China. In addition, CRF01_AE instead of subtype B has become the major strain of HIV-1 infection among homosexuals. Our study revealed HIV-1 evolution and genotype distribution by investigating newly diagnosed HIV-1 individuals in Hebei, China. This study provides important information to enhance the strategic plan for HIV prevention and control in China.
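    The subtype percentages reported above can be cross-checked against the number of genotyped subjects. The sketch below searches for whole-number counts that reproduce each percentage when rounded to one decimal place. The counts it finds are inferred for illustration and are not reported in the abstract; the helper name `counts_matching` is hypothetical.

```python
# Figures quoted in the abstract above.
genotyped = 139
reported_pct = {
    "subtype B": 41.0,
    "CRF01_AE": 40.3,
    "CRF07_BC": 11.5,
    "CRF08_BC": 4.3,
    "URFs": 1.4,
    "subtype C": 1.4,
}


def counts_matching(pct, total):
    """All integer counts whose share of `total` rounds to `pct` (1 decimal)."""
    return [k for k in range(total + 1) if round(100 * k / total, 1) == pct]


# Inferred counts (an assumption derived from rounding, not study data).
inferred = {name: counts_matching(pct, genotyped)
            for name, pct in reported_pct.items()}

for name, candidates in inferred.items():
    print(name, candidates)

# Each percentage maps to exactly one count, and the counts sum to 139.
print(sum(candidates[0] for candidates in inferred.values()))
```

    Because one subject corresponds to roughly 0.7 percentage points, each reported value is matched by a single count, and the six counts add up to the 139 genotyped subjects.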

  8. Human genome and genetic sequencing research and informed consent

    Iwakawa, Mayumi

    2003-01-01

    On March 29, 2001, the Ethical Guidelines for Human Genome and Genetic Sequencing Research were established. They are intended to serve as ethical guidelines for all human genome and genetic sequencing research practice, for the purpose of upholding respect for human dignity and rights and enforcing the use of proper methods in the pursuit of human genome and genetic sequencing research, with the understanding and cooperation of the public. The RadGenomics Project has prepared a research protocol and informed consent document that follow these ethical guidelines. We have endeavored to protect the privacy of individual information, and have established a procedure for examination of research practices by an ethics committee. Here we report our procedure for presenting this concept to patients. (authors)

  9. Getting complete genomes from complex samples using nanopore sequencing

    Kirkegaard, Rasmus Hansen; Karst, Søren Michael; Albertsen, Mads

    Short read sequencing and metagenomic binning workflows have made it possible to extract bacterial genome bins from environmental microbial samples containing hundreds to thousands of different species. However, these genome bins often do not represent complete genomes, as they are mostly...... fragmented, incomplete and often contaminated with foreign DNA and with no robust strategies to validate the quality. These `draft genomes` have limited lasting value to the scientific community, as gene synteny is broken and it is uncertain what is missing. The genetic material most often...... missed is important multi-copy and/or conserved marker genes such as the 16S rRNA gene, as sequence micro-heterogeneity prevents assembly of these genes in the de novo assembly. We demonstrate that using nanopore long reads it is now possible to overcome these issues and make complete genomes from...

  10. Complete genome sequence of the myxobacterium Sorangium cellulosum

    Schneiker, S; Perlova, O; Kaiser, O

    2007-01-01

    The genus Sorangium synthesizes approximately half of the secondary metabolites isolated from myxobacteria, including the anti-cancer metabolite epothilone. We report the complete genome sequence of the model Sorangium strain S. cellulosum Soce56, which produces several natural products and has...... morphological and physiological properties typical of the genus. The circular genome, comprising 13,033,779 base pairs, is the largest bacterial genome sequenced to date. No global synteny with the genome of Myxococcus xanthus is apparent, revealing an unanticipated level of divergence between...... these myxobacteria. A large percentage of the genome is devoted to regulation, particularly post-translational phosphorylation, which probably supports the strain's complex, social lifestyle. This regulatory network includes the highest number of eukaryotic protein kinase-like kinases discovered in any organism...

  11. Genomic Sequencing of Single Microbial Cells from Environmental Samples

    Ishoey, Thomas; Woyke, Tanja; Stepanauskas, Ramunas; Novotny, Mark; Lasken, Roger S.

    2008-02-01

    Recently developed techniques allow genomic DNA sequencing from single microbial cells [Lasken RS: Single-cell genomic sequencing using multiple displacement amplification, Curr Opin Microbiol 2007, 10:510-516]. Here, we focus on research strategies for putting these methods into practice in the laboratory setting. An immediate consequence of single-cell sequencing is that it provides an alternative to culturing organisms as a prerequisite for genomic sequencing. The microgram amounts of DNA required as template are amplified from a single bacterium by a method called multiple displacement amplification (MDA) avoiding the need to grow cells. The ability to sequence DNA from individual cells will likely have an immense impact on microbiology considering the vast numbers of novel organisms, which have been inaccessible unless culture-independent methods could be used. However, special approaches have been necessary to work with amplified DNA. MDA may not recover the entire genome from the single copy present in most bacteria. Also, some sequence rearrangements can occur during the DNA amplification reaction. Over the past two years many research groups have begun to use MDA, and some practical approaches to single-cell sequencing have been developed. We review the consensus that is emerging on optimum methods, reliability of amplified template, and the proper interpretation of 'composite' genomes which result from the necessity of combining data from several single-cell MDA reactions in order to complete the assembly. Preferred laboratory methods are considered on the basis of experience at several large sequencing centers where >70% of genomes are now often recovered from single cells. Methods are reviewed for preparation of bacterial fractions from environmental samples, single-cell isolation, DNA amplification by MDA, and DNA sequencing.

  12. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Wang Jintu

    2011-04-01

    Background: Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result: To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on the zebrafish genome, which demonstrates the high similarity between zebrafish and common carp genomes and indicates the feasibility of comparative mapping between zebrafish and common carp once a physical map of common carp is available. Conclusion: BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis mapped around 40,000 BES to the zebrafish genome and established over 3

  13. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    2011-01-01

    Background: Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result: To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on the zebrafish genome, which demonstrates the high similarity between zebrafish and common carp genomes and indicates the feasibility of comparative mapping between zebrafish and common carp once a physical map of common carp is available. Conclusion: BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis mapped around 40,000 BES to the zebrafish genome and established over 3,100 microsyntenies, covering over 50% of
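    The sequencing totals in this abstract are mutually consistent, which the following sketch verifies. The genome size and the read yield per clone end are values derived here from the quoted figures; they are not stated in the abstract, and the variable names are illustrative.

```python
# Figures quoted in the abstract above.
bac_clones = 40_224      # clones sequenced on both ends
clean_bes = 65_720       # clean BAC end sequences retained
total_bp = 42_522_168    # total clean sequence
genome_fraction = 0.025  # "2.5% of common carp genome"

mean_read_bp = total_bp / clean_bes

# Derived values (not reported in the abstract).
bes_yield = clean_bes / (2 * bac_clones)       # clean reads per attempted end
implied_genome_bp = total_bp / genome_fraction # genome size implied by 2.5%

print(f"mean BES length: {mean_read_bp:.1f} bp")
print(f"clean reads per attempted clone end: {bes_yield:.1%}")
print(f"implied genome size: {implied_genome_bp / 1e9:.2f} Gb")
```

    The mean length rounds to the 647 bp quoted above, and the 2.5% figure implies a genome of roughly 1.7 Gb.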

  14. Sequencing of a new target genome: the Pediculus humanus humanus (Phthiraptera: Pediculidae) genome project.

    Pittendrigh, B R; Clark, J M; Johnston, J S; Lee, S H; Romero-Severson, J; Dasch, G A

    2006-11-01

    The human body louse, Pediculus humanus humanus (L.), and the human head louse, Pediculus humanus capitis, belong to the hemimetabolous order Phthiraptera. The body louse is the primary vector that transmits the bacterial agents of louse-borne relapsing fever, trench fever, and epidemic typhus. The genomes of the bacterial causative agents of several of these diseases have been sequenced. Thus, determining the body louse genome will enhance studies of host-vector-pathogen interactions. Although not important as a major disease vector, head lice are of major social concern. Resistance to traditional pesticides used to control head and body lice has developed. It is imperative that new molecular targets be discovered for the development of novel compounds to control these insects. No complete genome sequence exists for a hemimetabolous insect species, primarily because hemimetabolous insects often have large (2000 Mb) to very large (up to 16,300 Mb) genomes. Fortuitously, we determined that the human body louse has one of the smallest genome sizes known in insects, suggesting it may be a suitable choice as a minimal hemimetabolous genome in which many genes have been eliminated during its adaptation to human parasitism. Because many louse species infest birds and mammals, the body louse genome-sequencing project will facilitate studies of their comparative genomics. A 6-8X coverage of the body louse genome, plus sequenced expressed sequence tags, should provide the entomological, evolutionary biology, medical, and public health communities with useful genetic information.

  15. The C-terminal tail of the gp41 transmembrane envelope glycoprotein of HIV-1 clades A, B, C, and D may exist in two conformations: an analysis of sequence, structure, and function

    Hollier, Mark J.; Dimmock, Nigel J.

    2005-01-01

    In addition to the major ectodomain, the gp41 transmembrane glycoprotein of HIV-1 is now known to have a minor ectodomain that is part of the long C-terminal tail. Both ectodomains are highly antigenic, carry neutralizing and non-neutralizing epitopes, and are involved in virus-mediated fusion activity. However, data have so far been biologically based, and derived solely from T cell line-adapted (TCLA), B clade viruses. Here we have carried out sequence and theoretically based structural analyses of 357 gp41 C-terminal sequences of mainly primary isolates of HIV-1 clades A, B, C, and D. Data show that all these viruses have the potential to form a tail loop structure (the minor ectodomain) supported by three, β-sheet, membrane-spanning domains (MSDs). This means that the first (N-terminal) tyrosine-based sorting signal of the gp41 tail is situated outside the cell membrane and is non-functional, and that gp41 that reaches the cell surface may be recycled back into the cytoplasm through the activity of the second tyrosine-sorting signal. However, we suggest that only a minority of cell-associated gp41 molecules - those destined for incorporation into virions - has 3 MSDs and the minor ectodomain. Most intracellular gp41 has the conventional single MSD, no minor ectodomain, a functional first tyrosine-based sorting signal, and in line with current thinking is degraded intracellularly. The gp41 structural diversity suggested here can be viewed as an evolutionary strategy to minimize HIV-1 envelope glycoprotein expression on the cell surface, and hence possible cytotoxicity and immune attack on the infected cell

  16. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  17. Sequence analysis of the genome of carnation (Dianthus caryophyllus L.).

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-06-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. 'Francesco' was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568,887,315 bp, consisting of 45,088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16,644 bp and 60,737 bp, respectively, and the longest scaffold was 1,287,144 bp. The average GC content of the contig sequences was 36%. A total of 1,050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43,266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ~98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp. © The Author 2013. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
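
    The N50 and coverage figures quoted in this abstract follow from simple arithmetic on sequence lengths. The sketch below is only an illustration, not the authors' pipeline: the function name n50 and the toy scaffold lengths are hypothetical, and only the two totals (568,887,315 bp assembled, 622 Mb estimated) are taken from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly, not data from the carnation study.
toy_scaffolds = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_scaffolds)

# Totals quoted in the abstract.
assembled_bp = 568_887_315
estimated_genome_bp = 622_000_000
coverage_percent = 100 * assembled_bp / estimated_genome_bp

print(toy_n50, round(coverage_percent))
```

    With the quoted totals the ratio rounds to the 91% stated in the abstract; the real N50 values (16,644 bp and 60,737 bp) would require the full list of contig and scaffold lengths, which the abstract does not give.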

  18. The genome sequence of four isolates from the family Lichtheimiaceae.

    Chibucos, Marcus C; Etienne, Kizee A; Orvis, Joshua; Lee, Hongkyu; Daugherty, Sean; Lockhart, Shawn R; Ibrahim, Ashraf S; Bruno, Vincent M

    2015-07-01

    This study reports the release of draft genome sequences of two isolates of Lichtheimia corymbifera and two isolates of L. ramosa. Phylogenetic analyses indicate that the two L. corymbifera strains (CDC-B2541 and 008-049) are closely related to the previously sequenced L. corymbifera isolate (FSU 9682) while our two L. ramosa strains CDC-B5399 and CDC-B5792 cluster apart from them. These genome sequences will further the understanding of intraspecies and interspecies genetic variation within the Mucoraceae family of pathogenic fungi. © FEMS 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  19. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Ping, Zheng; Siegal, Gene P.; Almeida, Jonas S.; Schnitt, Stuart J.; Shen, Dejun

    2014-01-01

    Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer. PMID:24672738

  20. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Zheng Ping

    2014-01-01

    Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer.

  1. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    Chechetkin, V.R.; Lobzin, V.V.

    2004-01-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions

  2. Genomic multiple sequence alignments: refinement using a genetic algorithm

    Lefkowitz Elliot J

    2005-08-01

    Background Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics, the practice of comparing genomic sequences from different species, plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To offset the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. Results We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. Overall sequence identity increased only

  3. Rapid and Accurate Sequencing of Enterovirus Genomes Using MinION Nanopore Sequencer.

    Wang, Ji; Ke, Yue Hua; Zhang, Yong; Huang, Ke Qiang; Wang, Lei; Shen, Xin Xin; Dong, Xiao Ping; Xu, Wen Bo; Ma, Xue Jun

    2017-10-01

    Knowledge of an enterovirus genome sequence is very important in epidemiological investigation to identify transmission patterns and ascertain the extent of an outbreak. The MinION sequencer is increasingly used to sequence various viral pathogens in many clinical situations because of its long reads, portability, real-time accessibility of sequenced data, and very low initial costs. However, information is lacking on MinION sequencing of enterovirus genomes. In this proof-of-concept study using Enterovirus 71 (EV71) and Coxsackievirus A16 (CA16) strains as examples, we established an amplicon-based whole genome sequencing method using MinION. We explored the accuracy, minimum sequencing time, discrimination and high-throughput sequencing ability of MinION, and compared its performance with Sanger sequencing. Within the first minute (min) of sequencing, the accuracy of MinION was 98.5% for the single EV71 strain and 94.12%-97.33% for 10 genetically-related CA16 strains. In as little as 14 min, 99% identity was reached for the single EV71 strain, and in 17 min (on average), 99% identity was achieved for 10 CA16 strains in a single run. MinION is suitable for whole genome sequencing of enteroviruses with sufficient accuracy and fine discrimination and has the potential as a fast, reliable and convenient method for routine use. Copyright © 2017 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.

  4. Viral load and genomic integration of HPV 16 in cervical samples from HIV-1-infected and uninfected women in Burkina Faso.

    Rousseau, Marie-Noelle Didelot; Costes, Valérie; Konate, Issouf; Nagot, Nicolas; Foulongne, Vincent; Ouedraogo, Abdoulaye; Van de Perre, Philippe; Mayaud, Philippe; Segondy, Michel

    2007-06-01

    The relationships between human papillomavirus type 16 (HPV 16) viral load, HPV 16 integration status, human immunodeficiency virus type 1 (HIV-1) status, and cervical cytology were studied among women enrolled in a cohort of female sex workers in Burkina Faso. The study focused on 24 HPV 16-infected women. The HPV 16 viral load in cervical samples was determined by real-time PCR. Integration ratio was estimated as the ratio between E2 and E6 genes DNA copy numbers. Integrated HPV16 viral load was defined as the product of HPV 16 viral load by the integration ratio. High HPV 16 viral load and high integration ratio were more frequent among women with squamous intraepithelial lesions compared with women with normal cytology (33% vs. 11%, and 33% vs. 0%, respectively), and among women with high-grade squamous intraepithelial lesions compared with women without high-grade squamous intraepithelial lesions (50% vs. 17%, and 50% vs. 11%, respectively). High HPV 16 DNA load, but not high integration ratio, was also more frequent among HIV-1-positive women (39% vs. 9%; and 23% vs. 18%, respectively). The absence of statistical significance of these differences might be explained by the small study sample size. High-integrated HPV 16 DNA load was significantly associated with the presence of high-grade squamous intraepithelial lesions (50% vs. 5%, P = 0.03) in univariate and multivariate analysis (adjusted odds-ratio: 19.05; 95% confidence interval (CI), 1.11-328.3, P = 0.03), but not with HIV-1 or other high-risk HPV types (HR-HPV). Integrated HPV 16 DNA load may be considered as a useful marker of high-grade cervical lesions in HPV 16-infected women. (c) 2007 Wiley-Liss, Inc.

  5. Enhanced Dynamic Algorithm of Genome Sequence Alignments

    Arabi E. keshk

    2014-01-01

    The merging of biology and computer science has created a new field called computational biology that explores the capacities of computers to gain knowledge from biological data, bioinformatics. Computational biology is rooted in life sciences as well as computers, information sciences, and technologies. The main problem in computational biology is sequence alignment, which is a way of arranging the sequences of DNA, RNA or protein to identify the region of similarity and relationship between se...
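
    For readers unfamiliar with dynamic-programming alignment, the sketch below shows the classical Needleman-Wunsch global alignment score. It is not the enhanced algorithm of this record, whose details the truncated abstract does not give; the function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen only for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of strings a and b.

    Classical Needleman-Wunsch recurrence with a linear gap penalty.
    The default scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[-1][-1]


print(global_alignment_score("GATTACA", "GATTACA"))
print(global_alignment_score("ACGT", "ACT"))
```

    The table has (len(a)+1) x (len(b)+1) cells, so time and memory grow with the product of the two sequence lengths, which is the cost that faster alignment variants typically try to reduce.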

  6. Sequence similarity between the erythrocyte binding domain 1 of the Plasmodium vivax Duffy binding protein and the V3 loop of HIV-1 strain MN reveals binding residues for the Duffy Antigen Receptor for Chemokines

    Garry Robert F

    2011-01-01

    Background The surface glycoprotein (SU, gp120) of the human immunodeficiency virus (HIV) must bind to a chemokine receptor, CCR5 or CXCR4, to invade CD4+ cells. Plasmodium vivax uses the Duffy Binding Protein (DBP) to bind the Duffy Antigen Receptor for Chemokines (DARC) and invade reticulocytes. Results Variable loop 3 (V3) of HIV-1 SU and domain 1 of the Plasmodium vivax DBP share a sequence similarity. The site of amino acid sequence similarity was necessary, but not sufficient, for DARC binding and contained a consensus heparin binding site essential for DARC binding. Both HIV-1 and P. vivax can be blocked from binding to their chemokine receptors by the chemokine RANTES and its analog AOP-RANTES. Site directed mutagenesis of the heparin binding motif in members of the DBP family, the P. knowlesi alpha, beta and gamma proteins, abrogated their binding to erythrocytes. Positively charged residues within domain 1 are required for binding of P. vivax and P. knowlesi erythrocyte binding proteins. Conclusion A heparin binding site motif in members of the DBP family may form part of a conserved erythrocyte receptor binding pocket.

  7. Genome Sequence of the Freshwater Yangtze Finless Porpoise.

    Yuan, Yuan; Zhang, Peijun; Wang, Kun; Liu, Mingzhong; Li, Jing; Zheng, Jingsong; Wang, Ding; Xu, Wenjie; Lin, Mingli; Dong, Lijun; Zhu, Chenglong; Qiu, Qiang; Li, Songhai

    2018-04-16

    The Yangtze finless porpoise (Neophocaena asiaeorientalis ssp. asiaeorientalis) is a subspecies of the narrow-ridged finless porpoise (N. asiaeorientalis). In total, 714.28 gigabases (Gb) of raw reads were generated by whole-genome sequencing of the Yangtze finless porpoise, using an Illumina HiSeq 2000 platform. After filtering the low-quality and duplicated reads, we assembled a draft genome of 2.22 Gb, with contig N50 and scaffold N50 values of 46.69 kilobases (kb) and 1.71 megabases (Mb), respectively. We identified 887.63 Mb of repetitive sequences and predicted 18,479 protein-coding genes in the assembled genome. The phylogenetic tree showed a relationship between the Yangtze finless porpoise and the Yangtze River dolphin, which diverged approximately 20.84 million years ago. In comparisons with the genomes of 10 other mammals, we detected 44 species-specific gene families, 164 expanded gene families, and 313 positively selected genes in the Yangtze finless porpoise genome. The assembled genome sequence and underlying sequence data are available at the National Center for Biotechnology Information under BioProject accession number PRJNA433603.

  8. Rapid selection of escape mutants by the first CD8 T cell responses in acute HIV-1 infection

    Korber, Bette Tina Marie [Los Alamos National Laboratory]

    2008-01-01

    The recent failure of a vaccine that primes T cell responses to control primary HIV-1 infection has raised doubts about the role of CD8+ T cells in early HIV-1 infection. We studied four patients who were identified shortly after HIV-1 infection and before seroconversion. In each patient there was very rapid selection of multiple HIV-1 escape mutants in the transmitted virus by CD8 T cells, including examples of complete fixation of non-synonymous substitutions within 2 weeks. Sequencing by single genome amplification suggested that the high rate of virus replication in acute infection gave a selective advantage to virus molecules that contained simultaneous and gained sequential T cell escape mutations. These observations show that whilst early HIV-1 specific CD8 T cells can act against virus, rapid escape means that these T cell responses are unlikely to benefit the patient and may in part explain why current HIV-1 T cell vaccines may not be protective.

  9. Editing of HIV-1 RNA by the double-stranded RNA deaminase ADAR1 stimulates viral infection

    Doria, Margherita; Neri, Francesca; Gallo, Angela; Farace, Maria Giulia; Michienzi, Alessandro

    2009-01-01

    Adenosine deaminases that act on dsRNA (ADARs) are enzymes that target double-stranded regions of RNA, converting adenosines into inosines (A-to-I editing) and thus contributing to genome complexity and fine regulation of gene expression. It has been described that a member of the ADAR family, ADAR1, can target viruses and affect their replication process. Here we report evidence showing that ADAR1 stimulates human immunodeficiency virus type 1 (HIV-1) replication by using both editing-dependent and editing-independent mechanisms. We show that over-expression of ADAR1 in HIV-1 producer cells increases viral protein accumulation in an editing-independent manner. Moreover, HIV-1 virions generated in the presence of over-expressed ADAR1 but not an editing-inactive ADAR1 mutant are released more efficiently and display enhanced infectivity, as demonstrated by challenge assays performed with T cell lines and primary CD4+ T lymphocytes. Finally, we report that ADAR1 associates with HIV-1 RNAs and edits adenosines in the 5′ untranslated region (UTR) and the Rev and Tat coding sequence. Overall these results suggest that HIV-1 has evolved mechanisms to take advantage of specific RNA editing activity of the host cell and disclose a stimulatory function of ADAR1 in the spread of HIV-1. PMID:19651874

  10. Comparison of methods for genomic localization of gene trap sequences

    Ferrin Thomas E

    2006-09-01

    Background Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. Results In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.

  11. Genome-wide sequence variations among Mycobacterium avium subspecies paratuberculosis.

    Chung-Yi Hsu

    2011-12-01

    Mycobacterium avium subspecies paratuberculosis (M. ap), the causative agent of Johne’s disease (JD), infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium) strains isolated from various hosts and environments. Using Next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98%) to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77) showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels) were not detected, and only unique single nucleotide polymorphisms (SNPs) were observed among the 6 M. ap strains. While most of the SNPs (~100) in M. ap genomes were non-synonymous, a total of ~6000 SNPs were detected among M. avium genomes, most of them synonymous, suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNPs-based phylo-genomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10) strain while the human isolate (M. ap 4B) is closely related to the environmental strains, indicating an environmental source of human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne’s disease in animals.

  12. The diploid genome sequence of an individual human.

    Samuel Levy

    2007-09-01

    Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
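
    The variant tallies in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the five categories for which the abstract gives counts; segmental duplications and copy number variation regions are left out because no numbers are quoted for them, so treating these five categories as "all events" is an assumption.

```python
# Counts quoted in the abstract, one entry per variant category.
variant_counts = {
    "SNPs": 3_213_401,
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}

total_events = sum(variant_counts.values())
non_snp_events = total_events - variant_counts["SNPs"]
non_snp_percent = 100 * non_snp_events / total_events

print(f"total events:   {total_events:,}")
print(f"non-SNP events: {non_snp_events:,} ({non_snp_percent:.0f}% of events)")
```

    Under that assumption the total comes to 4,118,889 events, consistent with "more than 4.1 million DNA variants", and the non-SNP share rounds to the 22% stated in the abstract.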

  13. Draft genome sequence of ramie, Boehmeria nivea (L.) Gaudich.

    Luan, Ming-Bao; Jian, Jian-Bo; Chen, Ping; Chen, Jun-Hui; Chen, Jian-Hua; Gao, Qiang; Gao, Gang; Zhou, Ju-Hong; Chen, Kun-Mei; Guang, Xuan-Min; Chen, Ji-Kang; Zhang, Qian-Qian; Wang, Xiao-Fei; Fang, Long; Sun, Zhi-Min; Bai, Ming-Zhou; Fang, Xiao-Dong; Zhao, Shan-Cen; Xiong, He-Ping; Yu, Chun-Ming; Zhu, Ai-Guo

    2018-05-01

    Ramie, Boehmeria nivea (L.) Gaudich, family Urticaceae, is a plant native to eastern Asia, and one of the world's oldest fibre crops. It is also used as animal feed and for the phytoremediation of heavy metal-contaminated farmlands. Thus, the genome sequence of ramie was determined to explore the molecular basis of its fibre quality, protein content and phytoremediation. For further understanding ramie genome, different paired-end and mate-pair libraries were combined to generate 134.31 Gb of raw DNA sequences using the Illumina whole-genome shotgun sequencing approach. The highly heterozygous B. nivea genome was assembled using the Platanus Genome Assembler, which is an effective tool for the assembly of highly heterozygous genome sequences. The final length of the draft genome of this species was approximately 341.9 Mb (contig N50 = 22.62 kb, scaffold N50 = 1,126.36 kb). Based on ramie genome annotations, 30,237 protein-coding genes were predicted, and the repetitive element content was 46.3%. The completeness of the final assembly was evaluated by benchmarking universal single-copy orthologous genes (BUSCO); 90.5% of the 1,440 expected embryophytic genes were identified as complete, and 4.9% were identified as fragmented. Phylogenetic analysis based on single-copy gene families and one-to-one orthologous genes placed ramie with mulberry and cannabis, within the clade of urticalean rosids. Genome information of ramie will be a valuable resource for the conservation of endangered Boehmeria species and for future studies on the biogeography and characteristic evolution of members of Urticaceae. © 2018 John Wiley & Sons Ltd.

  14. Identification of full-length transmitted/founder viruses and their progeny in primary HIV-1 infection

    Korber, Bette [Los Alamos National Laboratory]; Hraber, Peter [Los Alamos National Laboratory]; Giorgi, Elena [Los Alamos National Laboratory]; Bhattacharya, T [Los Alamos National Laboratory]

    2009-01-01

    Identification of transmitted/founder virus genomes and their progeny is a novel strategy for probing the molecular basis of HIV-1 transmission and for evaluating the genetic imprint of viral and host factors that act to constrain or facilitate virus replication. Here, we show in a cohort of twelve acutely infected subjects (9 clade B; 3 clade C), that complete genomic sequences of transmitted/founder viruses could be inferred using single genome amplification of plasma viral RNA, direct amplicon sequencing, and a model of random virus evolution. This allowed for the precise identification, chemical synthesis, molecular cloning, and biological analysis of those viruses actually responsible for productive clinical infection and for a comprehensive mapping of sequential viral genomes and proteomes for mutations that are necessary or incidental to the establishment of HIV-1 persistence. Transmitted/founder viruses were CD4 and CCR5 tropic, replicated preferentially in activated primary T-lymphocytes but not monocyte-derived macrophages, and were effectively shielded from most heterologous or broadly neutralizing antibodies. By 3 months of infection, the evolving viral quasispecies in three subjects showed mutational fixation at only 2-5 discrete genomic loci. By 6-12 months, mutational fixation was evident at 18-27 genomic loci. Some, but not all, of these mutations were attributable to virus escape from cytotoxic T lymphocytes or neutralizing antibodies, suggesting that other viral or host factors may influence early HIV-1 fitness.

  15. Deciphering the distance to antibiotic resistance for the pneumococcus using genome sequencing data

    Mobegi, Fredrick M; Cremers, Amelieke J H; de Jonge, Marien I; Bentley, Stephen D; van Hijum, Sacha A F T; Zomer, Aldert

    2017-01-01

    Advances in genome sequencing technologies and genome-wide association studies (GWAS) have provided unprecedented insights into the molecular basis of microbial phenotypes and enabled the identification of the underlying genetic variants in real populations. However, utilization of genome sequencing

  16. Draft genome sequences of two virulent serotypes of avian Pasteurella multocida

    Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent Pasteurella multocida strain Pm70....

  17. Draft Genome Sequences of Two Virulent Serotypes of Avian Pasteurella multocida

    Abrahante, Juan E.; Johnson, Timothy J.; Hunter, Samuel S.; Maheswaran, Samuel K.; Hauglund, Melissa J.; Bayles, Darrell O.; Tatum, Fred M.; Briggs, Robert E.

    2013-01-01

    Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent P. multocida strain Pm70.

  18. Draft Genome Sequences of Two Virulent Serotypes of Avian Pasteurella multocida

    Abrahante, Juan E.; Johnson, Timothy J.; Hunter, Samuel S.; Maheswaran, Samuel K.; Hauglund, Melissa J.; Bayles, Darrell O.; Tatum, Fred M.

    2013-01-01

    Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent P. multocida strain Pm70. PMID:23405337

  19. Covariance of charged amino acids at positions 322 and 440 of HIV-1 Env contributes to coreceptor specificity of subtype B viruses, and can be used to improve the performance of V3 sequence-based coreceptor usage prediction algorithms.

    Kieran Cashin

    The ability to determine coreceptor usage of patient-derived human immunodeficiency virus type 1 (HIV-1) strains is clinically important, particularly for the administration of the CCR5 antagonist maraviroc. The envelope glycoprotein (Env) determinants of coreceptor specificity lie primarily within the gp120 V3 loop region, although other Env determinants have been shown to influence gp120-coreceptor interactions. Here, we determined whether conserved amino acid alterations outside the V3 loop that contribute to coreceptor usage exist, and whether these alterations improve the performance of V3 sequence-based coreceptor usage prediction algorithms. We demonstrate a significant covariant association between charged amino acids at position 322 in V3 and position 440 in the C4 Env region that contributes to the specificity of HIV-1 subtype B strains for CCR5 or CXCR4. Specifically, positively charged Lys/Arg at position 322 and negatively charged Asp/Glu at position 440 occurred more frequently in CXCR4-using viruses, whereas negatively charged Asp/Glu at position 322 and positively charged Arg at position 440 occurred more frequently in R5 strains. In the context of CD4-bound gp120, structural models suggest that covariation of amino acids at Env positions 322 and 440 has the potential to alter electrostatic interactions that are formed between gp120 and charged amino acids in the CCR5 N-terminus. We further demonstrate that inclusion of a "440 rule" can improve the sensitivity of several V3 sequence-based genotypic algorithms for predicting coreceptor usage of subtype B HIV-1 strains, without compromising specificity, and significantly improves the AUROC of the geno2pheno algorithm when set to its recommended false positive rate of 5.75%. Together, our results provide further mechanistic insights into the intra-molecular interactions within Env that contribute to coreceptor specificity of subtype B HIV-1 strains, and demonstrate that incorporation

  20. The Complete Sequence of a Human Parainfluenzavirus 4 Genome

    Yea, Carmen; Cheung, Rose; Collins, Carol; Adachi, Dena; Nishikawa, John; Tellier, Raymond

    2009-01-01

    Although the human parainfluenza virus 4 (HPIV4) has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada). The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97%) with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized. PMID:21994536

  1. The Complete Sequence of a Human Parainfluenzavirus 4 Genome

    Carmen Yea

    2009-06-01

    Although the human parainfluenza virus 4 (HPIV4) has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada). The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97%) with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized.

  2. Mitochondrial genome sequencing helps show the evolutionary mechanism of mitochondrial genome formation in Brassica

    2011-01-01

    Background: Angiosperm mitochondrial genomes are more complex than those of other organisms. Analyses of the mitochondrial genome sequences of at least 11 angiosperm species have shown several common properties; these cannot easily explain, however, how the diverse mitotypes evolved within each genus or species. We analyzed the evolutionary relationships of Brassica mitotypes by sequencing. Results: We sequenced the mitotypes of cam (Brassica rapa), ole (B. oleracea), jun (B. juncea), and car (B. carinata) and analyzed them together with two previously sequenced mitotypes of B. napus (pol and nap). The sizes of whole single circular genomes of cam, jun, ole, and car are 219,747 bp, 219,766 bp, 360,271 bp, and 232,241 bp, respectively. The mitochondrial genome of ole is largest as a result of the duplication of a 141.8 kb segment. The jun mitotype is the result of an inherited cam mitotype, and pol is also derived from the cam mitotype with evolutionary modifications. Genes with known functions are conserved in all mitotypes, but clear variation in open reading frames (ORFs) with unknown functions among the six mitotypes was observed. Sequence relationship analysis showed that there has been genome compaction and inheritance in the course of Brassica mitotype evolution. Conclusions: We have sequenced four Brassica mitotypes, compared six Brassica mitotypes and suggested a mechanism for mitochondrial genome formation in Brassica, including evolutionary events such as inheritance, duplication, rearrangement, genome compaction, and mutation. PMID:21988783

  3. High Multiplicity Infection by HIV-1 in Men Who Have Sex with Men.

    Hui Li

    2010-05-01

    Elucidating virus-host interactions responsible for HIV-1 transmission is important for advancing HIV-1 prevention strategies. To this end, single genome amplification (SGA) and sequencing of HIV-1 within the context of a model of random virus evolution has made possible for the first time an unambiguous identification of transmitted/founder viruses and a precise estimation of their numbers. Here, we applied this approach to HIV-1 env analyses in a cohort of acutely infected men who have sex with men (MSM) and found that a high proportion (10 of 28; 36%) had been productively infected by more than one virus. In subjects with multivariant transmission, the minimum number of transmitted viruses ranged from 2 to 10 with viral recombination leading to rapid and extensive genetic shuffling among virus lineages. A combined analysis of these results, together with recently published findings based on identical SGA methods in largely heterosexual (HSX) cohorts, revealed a significantly higher frequency of multivariant transmission in MSM than in HSX [19 of 50 subjects (38%) versus 34 of 175 subjects (19%); Fisher's exact p = 0.008]. To further evaluate the SGA strategy for identifying transmitted/founder viruses, we analyzed 239 overlapping 5' and 3' half genome or env-only sequences from plasma viral RNA (vRNA) and blood mononuclear cell DNA in an MSM subject who had a particularly well-documented virus exposure history 3-6 days before symptom onset and 14-17 days before peak plasma viremia (47,600,000 vRNA molecules/ml). All 239 sequences coalesced to a single transmitted/founder virus genome in a time frame consistent with the clinical history, and a molecular clone of this genome encoded replication competent virus in accord with model predictions. Higher multiplicity of HIV-1 infection in MSM compared with HSX is consistent with the demonstrably higher epidemiological risk of virus acquisition in MSM and could indicate a greater challenge for HIV-1
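
    The MSM versus HSX comparison in this abstract rests on Fisher's exact test applied to a 2x2 table of counts (19 of 50 versus 34 of 175). The sketch below is one common way to compute a two-sided p-value from such a table using only the standard library; the function name and the choice of the "sum of tables no more likely than the observed one" definition are assumptions, since the abstract does not state which variant the authors used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the first row holds x of the col1 "positives".
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Counts quoted in the abstract: multivariant vs. single-variant transmission.
msm_multi, msm_single = 19, 50 - 19
hsx_multi, hsx_single = 34, 175 - 34
p_value = fisher_exact_two_sided(msm_multi, msm_single, hsx_multi, hsx_single)
print(f"two-sided p = {p_value:.4f}")
```

    The printed value can be compared with the p = 0.008 reported in the abstract; small differences would be expected if a different two-sided convention was used.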

  4. Complete genome sequence of Nakamurella multipartita type strain (Y-104).

    Tice, Hope; Mayilraj, Shanmugam; Sims, David; Lapidus, Alla; Nolan, Matt; Lucas, Susan; Glavina Del Rio, Tijana; Copeland, Alex; Cheng, Jan-Fang; Meincke, Linda; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Detter, John C; Brettin, Thomas; Rohde, Manfred; Göker, Markus; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Chen, Feng

    2010-03-30

    Nakamurella multipartita (Yoshimi et al. 1996) Tao et al. 2004 is the type species of the monospecific genus Nakamurella in the actinobacterial suborder Frankineae. The nonmotile, coccus-shaped strain was isolated from activated sludge acclimated with sugar-containing synthetic wastewater, and is capable of accumulating large amounts of polysaccharides in its cells. Here we describe the features of the organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of a member of the family Nakamurellaceae. The 6,060,298 bp long single replicon genome with its 5415 protein-coding and 56 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  5. Whole genome sequencing in clinical and public health microbiology.

    Kwong, J C; McCallum, N; Sintchenko, V; Howden, B P

    2015-04-01

    Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure.

  6. Complete genome sequence of Truepera radiovictrix type strain (RQ-24).

    Ivanova, Natalia; Rohde, Christine; Munk, Christine; Nolan, Matt; Lucas, Susan; Del Rio, Tijana Glavina; Tice, Hope; Deshpande, Shweta; Cheng, Jan-Fang; Tapia, Roxane; Han, Cliff; Goodwin, Lynne; Pitluck, Sam; Liolios, Konstantinos; Mavromatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Brambilla, Evelyne; Rohde, Manfred; Göker, Markus; Tindall, Brian J; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Lapidus, Alla

    2011-02-22

    Truepera radiovictrix Albuquerque et al. 2005 is the type species of the genus Truepera within the phylum "Deinococcus/Thermus". T. radiovictrix is of special interest not only because of its isolated phylogenetic location in the order Deinococcales, but also because of its ability to grow under multiple extreme conditions in alkaline, moderately saline, and high temperature habitats. Of particular interest is the fact that T. radiovictrix is also remarkably resistant to ionizing radiation, a feature it shares with members of the genus Deinococcus. This is the first completed genome sequence of a member of the family Trueperaceae and the fourth type strain genome sequence from a member of the order Deinococcales. The 3,260,398 bp long genome with its 2,994 protein-coding and 52 RNA genes consists of one circular chromosome and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  7. Draft genome sequence of the rubber tree Hevea brasiliensis

    Rahman Ahmad Yamin Abdul

    2013-02-01

    Background: Hevea brasiliensis, a member of the Euphorbiaceae family, is the major commercial source of natural rubber (NR). NR is a latex polymer with high elasticity, flexibility, and resilience that has played a critical role in the world economy since 1876. Results: Here, we report the draft genome sequence of H. brasiliensis. The assembly spans ~1.1 Gb of the estimated 2.15 Gb haploid genome. Overall, ~78% of the genome was identified as repetitive DNA. Gene prediction shows 68,955 gene models, of which 12.7% are unique to Hevea. Most of the key genes associated with rubber biosynthesis, rubberwood formation, disease resistance, and allergenicity have been identified. Conclusions: The knowledge gained from this genome sequence will aid in the future development of high-yielding clones to keep up with the ever increasing need for natural rubber.

  8. Understanding Cancer Genome and Its Evolution by Next Generation Sequencing

    Hou, Yong

    Cancer will cause 13 million deaths by the year 2030, ranking as the second leading cause of death worldwide. Previous studies indicate that most cancers originate from cells that acquired somatic mutations and evolved according to Darwinian theory. Ten biological insights of cancer have been summarized...... recently. Cutting-edge technologies like next generation sequencing (NGS) enable exploring cancer genome and evolution much more efficiently. However, integrated cancer genome sequencing studies showed great inter-/intra-tumoral heterogeneity (ITH) and complex evolution patterns beyond the cancer biological...... knowledge we previously know. There is very limited knowledge of East Asia lung cancer genome except enrichment of EGFR mutations and lack of KRAS mutations. We carried out integrated genomic, transcriptomic and methylomic analysis of 335 primary Chinese lung adenocarcinomas (LUAD) and 35 corresponding...

  9. Genome Sequences of Marine Shrimp Exopalaemon carinicauda Holthuis Provide Insights into Genome Size Evolution of Caridea.

    Yuan, Jianbo; Gao, Yi; Zhang, Xiaojun; Wei, Jiankai; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

    2017-07-05

    Crustacea, particularly Decapoda, contains many economically important species, such as shrimps and crabs. Crustaceans exhibit enormous (nearly 500-fold) variability in genome size. However, limited genome resources are available for investigating these species. Exopalaemon carinicauda Holthuis, an economically important caridean shrimp, is a potentially ideal experimental animal for research on crustaceans. In this study, we performed low-coverage sequencing and de novo assembly of the E. carinicauda genome. The assembly covers more than 95% of coding regions. E. carinicauda possesses a large complex genome (5.73 Gb), twice as large as those of many decapod shrimps. As such, comparative genomic analyses were employed to investigate factors affecting genome size evolution of decapods. However, clues associated with genome duplication were not identified, and few horizontally transferred sequences were detected. Ultimately, the burst of transposable elements, especially retrotransposons, was determined as the major factor influencing genome expansion. A total of 2 Gb repeats were identified, and RTE-BovB, Jockey, Gypsy, and DIRS were the four major retrotransposons that significantly expanded. Both recently originated (Jockey and Gypsy) and ancestral (DIRS) retrotransposons were responsible for the genome evolution. The E. carinicauda genome also exhibited potential for the genomic and experimental research of shrimps.

  10. Draft Genome Sequences of Actinobacillus pleuropneumoniae Serotypes 2 and 6

    Zhan, Bujie; Angen, Øystein; Hedegaard, Jakob

    2010-01-01

    Actinobacillus pleuropneumoniae is a bacterial pathogen that causes highly contagious respiratory infection in pigs and has a serious impact on the production economy and animal welfare. As clear differences in virulence between serotypes have been observed, the genetic basis should be investigat...... at the genomic level. Here, we present the draft genome sequences of the A. pleuropneumoniae serotypes 2 (strain 4226) and 6 (strain Femo)....

  11. Complete Genome Sequence of Pseudomonas aeruginosa Phage AAT-1.

    Andrade-Domínguez, Andrés; Kolter, Roberto

    2016-08-25

    Aspects of the interaction between phages and animals are of interest and importance for medical applications. Here, we report the genome sequence of the lytic Pseudomonas phage AAT-1, isolated from mammalian serum. AAT-1 is a double-stranded DNA phage, with a genome of 57,599 bp, containing 76 predicted open reading frames. Copyright © 2016 Andrade-Domínguez and Kolter.

  12. Draft Genome Sequence of Escherichia coli K-12 (ATCC 10798).

    Dimitrova, Daniela; Engelbrecht, Kathleen C; Putonti, Catherine; Koenig, David W; Wolfe, Alan J

    2017-07-06

    Here, we present the draft genome sequence of Escherichia coli ATCC 10798. E. coli ATCC 10798 is a K-12 strain, one of the most well-studied model microorganisms. The size of the genome was 4,685,496 bp, with a G+C content of 50.70%. This assembly consists of 62 contigs and the F plasmid. Copyright © 2017 Dimitrova et al.

  13. Epigenetics of obesity: beyond the genome sequence.

    Cordero, Paul; Li, Jiawei; Oben, Jude A

    2015-07-01

    After the study of the gene code as a trigger for obesity, the epigenetic code has emerged as a novel tool in the diagnosis, prognosis and treatment of obesity and its related comorbidities. This review summarizes the status of the epigenetic field associated with obesity, and the current epigenetic-based approaches for obesity treatment. Thanks to technical advances, novel and key obesity-associated polymorphisms have been described by genome-wide association studies, but there are limitations with their predictive power. Epigenetics is also studied for disease association, which involves decoding of the genome information, transcriptional status and later phenotypes. Obesity could be induced during adult life by feeding and other environmental factors, and there is a strong association between obesity features and specific epigenetic patterns. These patterns could be established during early life stages, and programme the risk of obesity and its comorbidities during adult life. Furthermore, recent studies have shown that DNA methylation profiles could be applied as biomarkers of diet-induced weight loss treatment. High-throughput technologies, recently implemented for commercial genetic test panels, could soon lead to the creation of epigenetic test panels for obesity. Nonetheless, epigenetics is a modifiable risk factor, and different dietary patterns or environmental insights during distinct stages of life could lead to rewriting of the epigenetic profile.

  14. Predictive genomics: A cancer hallmark network framework for predicting tumor clinical phenotypes using genome sequencing data

    Wang, Edwin; Zaman, Naif; Mcgee, Shauna; Milanese, Jean-Sébastien; Masoudi-Nejad, Ali; O'Connor, Maureen

    2014-01-01

    We discuss a cancer hallmark network framework for modelling genome-sequencing data to predict cancer clonal evolution and associated clinical phenotypes. Strategies of using this framework in conjunction with genome sequencing data in an attempt to predict personalized drug targets, drug resistance, and metastasis for a cancer patient, as well as cancer risks for a healthy individual are discussed. Accurate prediction of cancer clonal evolution and clinical phenotypes will have substantial i...

  15. Order and correlations in genomic DNA sequences. The spectral approach

    Lobzin, Vasilii V; Chechetkin, Vladimir R

    2000-01-01

    The structural analysis of genomic DNA sequences is discussed in the framework of the spectral approach, which is sufficiently universal due to the reciprocal correspondence and mutual complementarity of Fourier transform length scales. The spectral characteristics of random sequences of the same nucleotide composition possess the property of self-averaging for relatively short sequences of length M≥100-300. Comparison with the characteristics of random sequences determines the statistical significance of the structural features observed. Apart from traditional applications to the search for hidden periodicities, spectral methods are also efficient in studying mutual correlations in DNA sequences. By combining spectra for structure factors and correlation functions, not only integral correlations can be estimated but also their origin identified. Using the structural spectral entropy approach, the regularity of a sequence can be quantitatively assessed. A brief introduction to the problem is also presented and other major methods of DNA sequence analysis described. (reviews of topical problems)
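
    The search for hidden periodicities mentioned in this abstract can be illustrated with a discrete Fourier transform of a nucleotide indicator sequence. The sketch below is a generic power-spectrum calculation, not the authors' exact structure-factor normalization; the function name and the toy sequence are assumptions made for illustration.

```python
import cmath


def indicator_power_spectrum(seq, base):
    """Power spectrum of the 0/1 indicator sequence for one nucleotide.

    Returns {n: |DFT_n|^2 / M} for harmonics n = 1 .. M // 2. The n = 0
    term is skipped because it only reflects base composition.
    """
    m = len(seq)
    x = [1.0 if ch == base else 0.0 for ch in seq]
    spectrum = {}
    for n in range(1, m // 2 + 1):
        q = 2 * cmath.pi * n / m  # wave number of the n-th harmonic
        amp = sum(x[k] * cmath.exp(-1j * q * k) for k in range(m))
        spectrum[n] = abs(amp) ** 2 / m
    return spectrum


toy = "ATG" * 30  # hypothetical sequence with an exact period of 3
spec = indicator_power_spectrum(toy, "A")
peak = max(spec, key=spec.get)
print("strongest harmonic:", peak, "-> period", len(toy) / peak)
```

    A harmonic n corresponds to a repeat period of M / n positions, so a pronounced peak at n = M / 3 is the usual signature of three-base periodicity. For real sequences the peak heights would need to be compared against the spectrum expected for a random sequence of the same composition, as the abstract notes.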

  16. Establishing a framework for comparative analysis of genome sequences

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
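
    The notion of a constrained column (residues mutate, but a shared property is preserved) can be sketched as follows. This is an illustration in Python rather than the Prolog toolkit described in the abstract, it ignores the phylogenetic tree and parsimony analysis entirely, and the residue property classes and function name are simplified assumptions.

```python
# Hypothetical, simplified residue property classes (illustrative assumption).
PROPERTY_CLASS = {
    **dict.fromkeys("KRH", "positive"),
    **dict.fromkeys("DE", "negative"),
    **dict.fromkeys("AVLIMFWP", "hydrophobic"),
    **dict.fromkeys("STNQCYG", "polar"),
}


def constrained_columns(alignment):
    """Return indices of columns whose residues vary but share one property class."""
    result = []
    for idx, column in enumerate(zip(*alignment)):
        residues = set(column)
        if "-" in residues or len(residues) < 2:
            continue  # skip gapped columns and fully conserved columns
        classes = {PROPERTY_CLASS.get(r) for r in residues}
        if len(classes) == 1 and None not in classes:
            result.append(idx)
    return result


# Toy protein alignment: rows are sequences, all of equal length.
toy_alignment = [
    "MKDLA",
    "MRELS",
    "MKDIT",
]
print(constrained_columns(toy_alignment))
```

    Treating the alignment as a list of equal-length rows and transposing it with zip mirrors the abstract's idea of operating on columns as the unit of analysis.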

  17. Preliminary Genomic Characterization of Ten Hardwood Tree Species from Multiplexed Low Coverage Whole Genome Sequencing.

    Margaret Staton

    Forest health issues are on the rise in the United States, resulting from introduction of alien pests and diseases, coupled with abiotic stresses related to climate change. Increasingly, forest scientists are finding genetic/genomic resources valuable in addressing forest health issues. For a set of ten ecologically and economically important native hardwood tree species representing a broad phylogenetic spectrum, we used low coverage whole genome sequencing from multiplex Illumina paired ends to economically profile their genomic content. For six species, the genome content was further analyzed by flow cytometry in order to determine the nuclear genome size. Sequencing yielded a depth of 0.8X to 7.5X, from which in silico analysis yielded preliminary estimates of gene and repetitive sequence content in the genome for each species. Thousands of genomic SSRs were identified, with a clear predisposition toward dinucleotide repeats and AT-rich repeat motifs. Flanking primers were designed for SSR loci for all ten species, ranging from 891 loci in sugar maple to 18,167 in redbay. In summary, we have demonstrated that useful preliminary genome information including repeat content, gene content and useful SSR markers can be obtained at low cost and time input from a single lane of Illumina multiplex sequence.
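
    Identifying dinucleotide SSRs in sequence data amounts to finding a two-base motif repeated in tandem. The regular-expression sketch below shows the idea; it is not the tool used in the study, and the function name, the minimum repeat threshold of 5, and the toy sequence are assumptions.

```python
import re


def find_dinucleotide_ssrs(seq, min_repeats=5):
    """Return (start, motif, repeat_count) for tandem dinucleotide repeats."""
    # A 2-base motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if motif[0] == motif[1]:
            continue  # "AA", "TT", ... are mononucleotide runs, not dinucleotide SSRs
        hits.append((match.start(), motif, len(match.group(0)) // 2))
    return hits


# Hypothetical read: (AT)6, a spacer, (GA)5, then a poly-T run.
toy = "GGC" + "AT" * 6 + "CCGTC" + "GA" * 5 + "T" * 12
print(find_dinucleotide_ssrs(toy))
```

    The start coordinates are 0-based offsets into the input string, which is the information a primer-design step would need in order to pick flanking regions.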

  18. Low-pass sequencing for microbial comparative genomics

    Kennedy Sean

    2004-01-01

    Background: We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results: As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion: Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the

  19. The minimum information about a genome sequence (MIGS) specification

    Field, Dawn; Garrity, George; Gray, Tanya; Morrison, Norman; Selengut, Jeremy; Sterk, Peter; Tatusova, Tatiana; Thomson, Nicholas; Allen, Michael J; Angiuoli, Samuel V; Ashburner, Michael; Axelrod, Nelson; Baldauf, Sandra; Ballard, Stuart; Boore, Jeffrey; Cochrane, Guy; Cole, James; Dawyndt, Peter; De Vos, Paul; dePamphilis, Claude; Edwards, Robert; Faruque, Nadeem; Feldman, Robert; Gilbert, Jack; Gilna, Paul; Glöckner, Frank Oliver; Goldstein, Philip; Guralnick, Robert; Haft, Dan; Hancock, David; Hermjakob, Henning; Hertz-Fowler, Christiane; Hugenholtz, Phil; Joint, Ian; Kagan, Leonid; Kane, Matthew; Kennedy, Jessie; Kowalchuk, George; Kottmann, Renzo; Kolker, Eugene; Kravitz, Saul; Kyrpides, Nikos; Leebens-Mack, Jim; Lewis, Suzanna E; Li, Kelvin; Lister, Allyson L; Lord, Phillip; Maltsev, Natalia; Markowitz, Victor; Martiny, Jennifer; Methe, Barbara; Mizrachi, Ilene; Moxon, Richard; Nelson, Karen; Parkhill, Julian; Proctor, Lita; White, Owen; Sansone, Susanna-Assunta; Spiers, Andrew; Stevens, Robert; Swift, Paul; Taylor, Chris; Tateno, Yoshio; Tett, Adrian; Turner, Sarah; Ussery, David; Vaughan, Bob; Ward, Naomi; Whetzel, Trish; Gil, Ingio San; Wilson, Gareth; Wipat, Anil

    2008-01-01

    With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the ‘transparency’ of the information contained in existing genomic databases. PMID:18464787

  20. The minimum information about a genome sequence (MIGS) specification

    Field, D; Garrity, G; Gray, T

    2008-01-01

    With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases.

  1. An integrated semiconductor device enabling non-optical genome sequencing.

    Rothberg, Jonathan M; Hinz, Wolfgang; Rearick, Todd M; Schultz, Jonathan; Mileski, William; Davey, Mel; Leamon, John H; Johnson, Kim; Milgrew, Mark J; Edwards, Matthew; Hoon, Jeremy; Simons, Jan F; Marran, David; Myers, Jason W; Davidson, John F; Branting, Annika; Nobile, John R; Puc, Bernard P; Light, David; Clark, Travis A; Huber, Martin; Branciforte, Jeffrey T; Stoner, Isaac B; Cawley, Simon E; Lyons, Michael; Fu, Yutao; Homer, Nils; Sedova, Marina; Miao, Xin; Reed, Brian; Sabina, Jeffrey; Feierstein, Erika; Schorn, Michelle; Alanjary, Mohammad; Dimalanta, Eileen; Dressman, Devin; Kasinskas, Rachel; Sokolsky, Tanya; Fidanza, Jacqueline A; Namsaraev, Eugeni; McKernan, Kevin J; Williams, Alan; Roth, G Thomas; Bustillo, James

    2011-07-20

    The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.

  2. Unveiling Mycoplasma hyopneumoniae Promoters: Sequence Definition and Genomic Distribution

    Weber, Shana de Souto; Sant'Anna, Fernando Hayashi; Schrank, Irene Silveira

    2012-01-01

    Several Mycoplasma species have had their genomes completely sequenced, including four strains of the swine pathogen Mycoplasma hyopneumoniae. Nevertheless, little is known about the nucleotide sequences that control transcriptional initiation in these microorganisms. Therefore, with the objective of investigating the promoter sequences of M. hyopneumoniae, 23 transcriptional start sites (TSSs) of distinct genes were mapped. A pattern that resembles the σ70 promoter −10 element was found upstream of the TSSs. However, no −35 element was distinguished. Instead, an AT-rich periodic signal was identified. About half of the experimentally defined promoters contained the motif 5′-TRTGn-3′, which was identical to the −16 element usually found in Gram-positive bacteria. The defined promoters were utilized to build position-specific scoring matrices in order to scan putative promoters upstream of all coding sequences (CDSs) in the M. hyopneumoniae genome. Two hundred and one signals were found associated with 169 CDSs. Most of these sequences were located within 100 nucleotides of the start codons. This study has shown that promoter-like sequences occur in the M. hyopneumoniae genome more frequently than expected by chance, indicating that most of the sequences detected are probably biologically functional. PMID:22334569
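    The scanning step described in this abstract can be illustrated with a minimal sketch of a position-specific scoring matrix (PSSM). This is not the authors' pipeline: the aligned sites, the pseudocount, the uniform background frequency and the upstream sequence below are all invented for illustration, and the function names are hypothetical.

```python
import math

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from equal-length aligned sites (A/C/G/T only)."""
    length = len(sites[0])
    pssm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        total = len(column) + 4 * pseudocount
        scores = {}
        for base in "ACGT":
            freq = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(freq / background)
        pssm.append(scores)
    return pssm

def scan(sequence, pssm):
    """Score every window of the sequence; return (start, score) pairs."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pssm[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits

# Hypothetical -10-like sites, invented for this example (not data from the study).
toy_sites = ["TATAAT", "TAAAAT", "TATAAT", "TACAAT"]
toy_pssm = build_pssm(toy_sites)

# Hypothetical upstream region containing one consensus-like hexamer.
upstream = "GGCGCGTATAATGCGC"
best_start, best_score = max(scan(upstream, toy_pssm), key=lambda hit: hit[1])
print(best_start, round(best_score, 2))
```

    In a real analysis the score threshold would have to be calibrated against a null model, which is what lets a study compare the number of hits with the number expected by chance.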

  3. The complete chloroplast genome sequence of Curcuma flaviflora (Curcuma).

    Zhang, Yan; Deng, Jiabin; Li, Yangyi; Gao, Gang; Ding, Chunbang; Zhang, Li; Zhou, Yonghong; Yang, Ruiwu

    2016-09-01

    The complete chloroplast (cp) genome of Curcuma flaviflora, a medicinal plant in Southeast Asia, was sequenced. The genome was 160 478 bp in length, with 36.3% GC content. A pair of inverted repeats (IRs) of 26 946 bp each was separated by a large single copy (LSC) region of 88 008 bp and a small single copy (SSC) region of 18 578 bp. The cp genome contained 132 annotated genes, including 79 protein-coding genes, 30 tRNA genes, and four rRNA genes; 19 of these genes were duplicated in the inverted repeat regions.
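    The figures in this abstract can be cross-checked with simple arithmetic: a quadripartite chloroplast genome is the sum of the LSC, the SSC and two copies of the IR. The short check below uses only the numbers reported above. Reading 79, 30 and 4 as counts of unique genes is an assumption, though it is consistent with the reported total of 132 once the 19 IR duplicates are added.

```python
# Region lengths in bp, as reported in the abstract.
lsc = 88_008
ssc = 18_578
ir = 26_946            # length of ONE inverted repeat copy
total_reported = 160_478

# Quadripartite structure: LSC + SSC + two IR copies.
total_from_parts = lsc + ssc + 2 * ir
assert total_from_parts == total_reported

# Gene counts; treating these as unique genes is an assumption.
protein_coding, trna, rrna = 79, 30, 4
unique_genes = protein_coding + trna + rrna
duplicated_in_ir = 19
annotated_reported = 132
print(total_from_parts, unique_genes + duplicated_in_ir)
```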

  4. Mapping copy number variation by population-scale genome sequencing

    Mills, Ryan E.; Walter, Klaudia; Stewart, Chip

    2011-01-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications...

  5. Complete Genome Sequence of Escherichia coli Strain WG5

    Imamovic, Lejla; Misiakou, Maria-Anna; van der Helm, Eric

    2018-01-01

    Escherichia coli strain WG5 is a widely used host for phage detection, including somatic coliphages employed as standard ISO method 10705-1 (2000). Here, we present the complete genome sequence of a commercial E. coli WG5 strain.

  6. Complete genome sequence of the European sheatfish virus.

    Mavian, Carla; López-Bueno, Alberto; Fernández Somalo, María Pilar; Alcamí, Antonio; Alejo, Alí

    2012-06-01

    Viral diseases are an increasing threat to the thriving aquaculture industry worldwide. An emerging group of fish pathogens is formed by several ranaviruses, which have been isolated at different locations from freshwater and seawater fish species since 1985. We report the complete genome sequence of European sheatfish ranavirus (ESV), the first ranavirus isolated in Europe, which causes high mortality rates in infected sheatfish (Silurus glanis) and in other species. Analysis of the genome sequence shows that ESV belongs to the amphibian-like ranaviruses and is closely related to the epizootic hematopoietic necrosis virus (EHNV), a disease agent geographically confined to the Australian continent and notifiable to the World Organization for Animal Health.

  7. Sequence modelling and an extensible data model for genomic database

    Li, Peter Wei-Der [California Univ., San Francisco, CA (United States); Univ. of California, Berkeley, CA (United States)

    1992-01-01

    The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need for a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.
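    As a rough illustration of what "abstract and biological sequence operators" might look like, the sketch below defines a small immutable sequence type. It is not the Extensible Object Model or its query language: the class name, the method names and the choice of operators are all hypothetical, chosen only to show one generic operator (subsequence, concatenation) alongside one biology-specific operator (reverse complement).

```python
from dataclasses import dataclass

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

@dataclass(frozen=True)
class DnaSequence:
    """Hypothetical immutable DNA sequence value (illustration only)."""
    residues: str

    def subsequence(self, start: int, end: int) -> "DnaSequence":
        """Abstract operator: half-open slice [start, end)."""
        return DnaSequence(self.residues[start:end])

    def concatenate(self, other: "DnaSequence") -> "DnaSequence":
        """Abstract operator: join two sequences end to end."""
        return DnaSequence(self.residues + other.residues)

    def reverse_complement(self) -> "DnaSequence":
        """Biological operator: complement each base, then reverse."""
        return DnaSequence(self.residues.translate(_COMPLEMENT)[::-1])

seq = DnaSequence("ATGC")
print(seq.subsequence(1, 3).residues)
print(seq.reverse_complement().residues)
```

    Keeping the generic operators separate from the domain-specific ones mirrors, in a very loose way, the abstract's distinction between abstract and biological operators.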

  8. Sequence modelling and an extensible data model for genomic database

    Li, Peter Wei-Der (California Univ., San Francisco, CA (United States) Lawrence Berkeley Lab., CA (United States))

    1992-01-01

    The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need for a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.

  9. Minimal Contribution of APOBEC3-Induced G-to-A Hypermutation to HIV-1 Recombination and Genetic Variation.

    Delviks-Frankenberry, Krista A; Nikolaitchik, Olga A; Burdick, Ryan C; Gorelick, Robert J; Keele, Brandon F; Hu, Wei-Shau; Pathak, Vinay K

    2016-05-01

    Although the predominant effect of host restriction APOBEC3 proteins on HIV-1 infection is to block viral replication, they might inadvertently increase retroviral genetic variation by inducing G-to-A hypermutation. Numerous studies have disagreed on the contribution of hypermutation to viral genetic diversity and evolution. Confounding factors contributing to the debate include the extent of lethal (stop codon) and sublethal hypermutation induced by different APOBEC3 proteins, the inability to distinguish between G-to-A mutations induced by APOBEC3 proteins and error-prone viral replication, the potential impact of hypermutation on the frequency of retroviral recombination, and the extent to which viral recombination occurs in vivo, which can reassort mutations in hypermutated genomes. Here, we determined the effects of hypermutation on the HIV-1 recombination rate and its contribution to genetic variation through recombination to generate progeny genomes containing portions of hypermutated genomes without lethal mutations. We found that hypermutation did not significantly affect the rate of recombination, and recombination between hypermutated and wild-type genomes only increased the viral mutation rate by 3.9 × 10^-5 mutations/bp/replication cycle in heterozygous virions, which is similar to the HIV-1 mutation rate. Since copackaging of hypermutated and wild-type genomes occurs very rarely in vivo, recombination between hypermutated and wild-type genomes does not significantly contribute to the genetic variation of replicating HIV-1. We also analyzed previously reported hypermutated sequences from infected patients and determined that the frequency of sublethal mutagenesis for A3G and A3F is negligible (4 × 10^-21 and 1 × 10^-11, respectively) and its contribution to viral mutations is far below mutations generated during error-prone reverse transcription. Taken together, we conclude that the contribution of APOBEC3-induced hypermutation to HIV-1 genetic

  10. Correction for Measurement Error from Genotyping-by-Sequencing in Genomic Variance and Genomic Prediction Models

    Ashraf, Bilal; Janss, Luc; Jensen, Just

    sample). The GBSeq data can be used directly in genomic models in the form of individual SNP allele-frequency estimates (e.g., reference reads/total reads per polymorphic site per individual), but are subject to measurement error due to the low sequencing depth per individual. Due to technical reasons... In the current work we show how the correction for measurement error in GBSeq can also be applied in whole genome genomic variance and genomic prediction models. Bayesian whole-genome random regression models are proposed to allow implementation of large-scale SNP-based models with a per-SNP correction for measurement error. We show correct retrieval of genomic explained variance, and improved genomic prediction when accounting for the measurement error in GBSeq data...
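    To make the measurement-error point concrete, the sketch below computes the allele-frequency estimate named in the abstract (reference reads / total reads) and shows how its sampling variance grows as depth falls. The binomial read-sampling model and the example read counts are assumptions made for illustration; this is not the Bayesian random regression model proposed by the authors.

```python
def allele_frequency(ref_reads, total_reads):
    """Per-individual, per-site estimate: reference reads / total reads."""
    if total_reads == 0:
        return None  # no reads at this site, so no estimate is possible
    return ref_reads / total_reads

def binomial_sampling_variance(true_freq, depth):
    """Variance of the estimate if reads are binomial draws (assumption)."""
    return true_freq * (1 - true_freq) / depth

# Hypothetical heterozygote (true frequency 0.5) observed at two depths.
estimate = allele_frequency(3, 4)
var_low_depth = binomial_sampling_variance(0.5, 4)
var_high_depth = binomial_sampling_variance(0.5, 40)
print(estimate, var_low_depth, var_high_depth)
```

    Under this simple model a tenfold increase in depth cuts the sampling variance tenfold, which is why low-depth genotyping-by-sequencing data call for an explicit per-SNP error correction.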

  11. Chemical rationale for selection of isolates for genome sequencing

    Rank, Christian; Larsen, Thomas Ostenfeld; Frisvad, Jens Christian

    The advances in gene sequencing will in the near future enable researchers to affordably acquire the full genomes of handpicked isolates. We here present a method to evaluate the chemical potential of an entire species and select representatives for genome sequencing. The selection criteria for new strains to be sequenced can be manifold, but for studying the functional phenotype, using a metabolome based approach offers a cheap and rapid assessment of critical strains to cover the chemical diversity. We have applied this methodology on the complex A. flavus/A. oryzae group. Though these two species are in principle identical, they represent two different phenotypes. This is clearly presented through a correspondence analysis of selected extrolites, in which the subtle chemical differences are visually dispersed. The results point to a handful of strains, which, if sequenced, will likely enhance our...

  12. Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences

    Shade Larry L

    2006-06-01

    Background Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. Results Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however, substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10^-9 change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10^-9 change/site/year) was approximately half of the overall rate (1.9–2.0 × 10^-9 change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%. Conclusion This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.
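    The "approximately half" comparison in this abstract can be checked directly from the rates it reports. The snippet below divides the coding-sequence rate by each end of the overall range; it introduces no data beyond the figures in the abstract, and the variable names are illustrative.

```python
# Substitution rates in changes/site/year, as reported in the abstract.
coding_rate = 1.1e-9
overall_rate_low, overall_rate_high = 1.9e-9, 2.0e-9

# Ratio of coding rate to overall rate at both ends of the reported range.
ratio_min = coding_rate / overall_rate_high
ratio_max = coding_rate / overall_rate_low
print(round(ratio_min, 2), round(ratio_max, 2))
```

    Both ratios fall between 0.5 and 0.6, consistent with the abstract's description of the coding rate as roughly half the overall rate.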

  13. [Molecular epidemiological analysis of HIV-1 variants circulating in Russia in 1987-2015].

    Lapovok, I A; Lopatukhin, A E; Kireev, D E; Kazennova, E V; Lebedev, A V; Bobkova, M R; Kolomeets, A N; Turbina, G I; Shipulin, G A; Ladnaya, N N; Pokrovsky, V V

    The aim was to analyze HIV-1 samples from all Russian regions simultaneously in order to characterize the epidemiology of HIV infection in the country as a whole. The most extensive study was conducted to examine nucleotide sequences of the pol gene of HIV-1 samples isolated from HIV-positive persons in different regions of Russia whose diagnosis was recorded during 1987-2015. The nucleotide sequences of the HIV-1 genome were analyzed using computer programs and on-line applications to identify the virus subtype and new recombinant forms. The nucleotide sequences of the pol gene were analyzed in 1697 HIV-1 samples, and the genetic variant subtype A1 (IDU-A) was found to be dominant throughout the entire territory of Russia (in more than 80% of all infection cases). Other virus variants circulating in Russia were analyzed; the previously described higher prevalence of the recombinant form CRF63/02A in Siberia was also confirmed. Four new recombinant forms generated by the virus subtypes A1 (IDU-A) and B and two AG recombinant forms were found. There was a larger genetic distance between the viruses of the IDU-A variant circulating among injecting drug users and those infected through heterosexual contact, as well as a change over time in the viruses of subtype G that caused the outbreak in the south of the country in 1988-1989. The findings demonstrate continuous HIV-1 genetic variability and recombination over time in Russia, as well as increased genetic diversity with higher HIV infection rates in the population.

  14. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    Weitemier Kevin

    2011-05-01

    Background Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species

  15. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

    Straub, Shannon C K; Fishbein, Mark; Livshultz, Tatyana; Foster, Zachary; Parks, Matthew; Weitemier, Kevin; Cronn, Richard C; Liston, Aaron

    2011-05-04

    Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first

  16. Generation and Characterization of a Defective HIV-1 Virus as an Immunogen for a Therapeutic Vaccine

    García-Pérez, Javier; García, Felipe; Blanco, Julia; Escribà-García, Laura; Gatell, Jose Maria; Alcamí, Jose; Plana, Montserrat; Sánchez-Palomino, Sonsoles

    2012-01-01

    Background The generation of new immunogens able to elicit strong specific immune responses remains a major challenge in the attempts to obtain a prophylactic or therapeutic vaccine against HIV/AIDS. We designed and constructed a defective recombinant virus based on the HIV-1 genome generating infective but non-replicative virions able to elicit broad and strong cellular immune responses in HIV-1 seropositive individuals. Results Viral particles were generated through transient transfection in producer cells (293-T) of a full length HIV-1 DNA carrying a deletion of 892 base pairs (bp) in the pol gene encompassing the sequence that codes for the reverse transcriptase (NL4-3/ΔRT clone). The viral particles generated were able to enter target cells, but due to the absence of reverse transcriptase no replication was detected. The immunogenic capacity of these particles was assessed by ELISPOT to determine γ-interferon production in a cohort of 69 chronic asymptomatic HIV-1 seropositive individuals. Surprisingly, defective particles produced from NL4-3/ΔRT triggered stronger cellular responses than wild-type HIV-1 viruses inactivated with Aldrithiol-2 (AT-2) and in a larger proportion of individuals (55% versus 23% seropositive individuals tested). Electron microscopy showed that NL4-3/ΔRT virions display immature morphology. Interestingly, wild-type viruses treated with Amprenavir (APV) to induce defective core maturation also induced stronger responses than the same viral particles generated in the absence of protease inhibitors. Conclusions We propose that immature HIV-1 virions generated from NL4-3/ΔRT viral clones may represent new prototypes of immunogens with a safer profile and stronger capacity to induce cellular immune responses than wild-type inactivated viral particles. PMID:23144996

  17. Interactions between the HIV-1 Unspliced mRNA and Host mRNA Decay Machineries

    Daniela Toro-Ascuy

    2016-11-01

    The human immunodeficiency virus type-1 (HIV-1) unspliced transcript is used both as mRNA for the synthesis of structural proteins and as the packaged genome. Given the presence of retained introns and instability AU-rich sequences, this viral transcript is normally retained and degraded in the nucleus of host cells unless the viral protein REV is present. As such, the stability of the HIV-1 unspliced mRNA must be particularly controlled in the nucleus and the cytoplasm in order to ensure proper levels of this viral mRNA for translation and viral particle formation. During its journey, the HIV-1 unspliced mRNA assembles into highly specific messenger ribonucleoproteins (mRNPs) containing many different host proteins, amongst which are well-known regulators of cytoplasmic mRNA decay pathways such as up-frameshift suppressor 1 homolog (UPF1), Staufen double-stranded RNA binding protein 1/2 (STAU1/2), or components of miRNA-induced silencing complex (miRISC) and processing bodies (PBs). More recently, the HIV-1 unspliced mRNA was shown to contain N6-methyladenosine (m6A), allowing the recruitment of YTH N6-methyladenosine RNA binding protein 2 (YTHDF2), an m6A reader host protein involved in mRNA decay. Interestingly, these host proteins involved in mRNA decay were shown to play positive roles in viral gene expression and viral particle assembly, suggesting that HIV-1 interacts with mRNA decay components to successfully accomplish viral replication. This review summarizes the state of the art in terms of the interactions between HIV-1 unspliced mRNA and components of different host mRNA decay machineries.

  18. Cloaked similarity between HIV-1 and SARS-CoV suggests an anti-SARS strategy

    Kliger Yossef

    2003-09-01

    Background Severe acute respiratory syndrome (SARS) is a febrile respiratory illness. The disease has been etiologically linked to a novel coronavirus that has been named the SARS-associated coronavirus (SARS-CoV), whose genome was recently sequenced. Since it is a member of the Coronaviridae, its spike protein (S2) is believed to play a central role in viral entry by facilitating fusion between the viral and host cell membranes. The protein responsible for viral-induced membrane fusion of HIV-1 (gp41) differs in length, and has no sequence homology with S2. Results Sequence analysis reveals that the two viral proteins share the sequence motifs that construct their active conformation. These include (1) an N-terminal leucine/isoleucine zipper-like sequence, and (2) a C-terminal heptad repeat located upstream of (3) an aromatic residue-rich region juxtaposed to the (4) transmembrane segment. Conclusions This study points to a similar mode of action for the two viral proteins, suggesting that an anti-viral strategy that targets the viral-induced membrane fusion step can be adopted from HIV-1 to SARS-CoV. Recently the FDA approved Enfuvirtide, a synthetic peptide corresponding to the C-terminal heptad repeat of HIV-1 gp41, as an anti-AIDS agent. Enfuvirtide and C34, another anti-HIV-1 peptide, exert their inhibitory activity by binding to a leucine/isoleucine zipper-like sequence in gp41, thus inhibiting a conformational change of gp41 required for its activation. We suggest that peptides corresponding to the C-terminal heptad repeat of the S2 protein may serve as inhibitors for SARS-CoV entry.

  19. Mitochondrial genome sequences and comparative genomics of Phytophthora ramorum and P. sojae

    Martin, Frank N.; Douda, Bensasson; Tyler, Brett M.; Boore, Jeffrey L.

    2007-01-01

    The complete sequences of the mitochondrial genomes of the oomycetes Phytophthora ramorum and P. sojae were determined during the course of their complete nuclear genome sequencing (Tyler, et al. 2006). Both are circular, with sizes of 39,314 bp for P. ramorum and 42,975 bp for P. sojae. Each contains a total of 37 identifiable protein-encoding genes, 25 or 26 tRNAs (P. sojae and P. ramorum, respectively) specifying 19 amino acids, and a variable number of ORFs (7 for P. ramorum and 12 for P. sojae) which are potentially additional functional genes. Non-coding regions comprise approximately 11.5 percent and 18.4 percent of the genomes of P. ramorum and P. sojae, respectively. Relative to P. sojae, there is an inverted repeat of 1,150 bp in P. ramorum that includes an unassigned unique ORF, a tRNA gene, and adjacent non-coding sequences, but otherwise the gene order in both species is identical. Comparisons of these genomes with published sequences of the P. infestans mitochondrial genome reveal a number of similarities, but the gene order in P. infestans differs in two adjacent locations due to inversions. Sequence alignments of the three genomes indicated sequence conservation ranging from 75 to 85 percent and that specific regions were more variable than others.

  20. Genome sequence of the lager brewing yeast, an interspecies hybrid.

    Nakao, Yoshihiro; Kanamori, Takeshi; Itoh, Takehiko; Kodama, Yukiko; Rainieri, Sandra; Nakamura, Norihisa; Shimonaga, Tomoko; Hattori, Masahira; Ashikari, Toshihiko

    2009-04-01

    This work presents the genome sequencing of the lager brewing yeast (Saccharomyces pastorianus) Weihenstephan 34/70, a strain widely used in lager beer brewing. The 25 Mb genome comprises two nuclear sub-genomes originating from Saccharomyces cerevisiae and Saccharomyces bayanus and one circular mitochondrial genome originating from S. bayanus. Thirty-six different types of chromosomes were found including eight chromosomes with translocations between the two sub-genomes, whose breakpoints are within the orthologous open reading frames. Several gene loci responsible for typical lager brewing yeast characteristics such as maltotriose uptake and sulfite production have been increased in number by chromosomal rearrangements. Despite an overall high degree of conservation of the synteny with S. cerevisiae and S. bayanus, the syntenies were not well conserved in the sub-telomeric regions that contain lager brewing yeast characteristic and specific genes. Deletion of larger chromosomal regions, a massive unilateral decrease of the ribosomal DNA cluster and bilateral truncations of over 60 genes reflect a post-hybridization evolution process. Truncations and deletions of less efficient maltose and maltotriose uptake genes may indicate the result of adaptation to brewing. The genome sequence of this interspecies hybrid yeast provides a new tool for better understanding of lager brewing yeast behavior in industrial beer production.

  1. Genome sequence of a urease-positive Campylobacter lari strain

    Campylobacter lari is frequently isolated from shore birds and can cause illness in humans. Here we report the draft whole genome sequence of a urease-positive strain of C. lari that was isolated in estuarial water on the coast of Delaware, USA....

  2. Complete Genome Sequence of Beijerinckia indica subsp. indica

    Tamas, Ivica; Dedysh, Svetlana N.; Liesack, Werner; Stott, Matthew B.; Alam, Maqsudul; Murrell, J. Colin; Dunfield, Peter F.

    2010-01-01

    Beijerinckia indica subsp. indica is an aerobic, acidophilic, exopolysaccharide-producing, N2-fixing soil bacterium. It is a generalist chemoorganotroph that is phylogenetically closely related to facultative and obligate methanotrophs of the genera Methylocella and Methylocapsa. Here we report the full genome sequence of this bacterium. PMID:20601475

  3. Complete genome sequence of Beijerinckia indica subsp. indica.

    Tamas, Ivica; Dedysh, Svetlana N; Liesack, Werner; Stott, Matthew B; Alam, Maqsudul; Murrell, J Colin; Dunfield, Peter F

    2010-09-01

    Beijerinckia indica subsp. indica is an aerobic, acidophilic, exopolysaccharide-producing, N(2)-fixing soil bacterium. It is a generalist chemoorganotroph that is phylogenetically closely related to facultative and obligate methanotrophs of the genera Methylocella and Methylocapsa. Here we report the full genome sequence of this bacterium.

  4. Draft Genome Sequence of Corynebacterium kefirresidentii SB, Isolated from Kefir.

    Blasche, Sonja; Kim, Yongkyu; Patil, Kiran R

    2017-09-14

    The genus Corynebacterium includes Gram-positive species with a high G+C content. We report here a novel species, Corynebacterium kefirresidentii SB, isolated from kefir grains collected in Germany. Its draft genome sequence was remarkably dissimilar (average nucleotide identity, 76.54%) to those of other Corynebacterium spp., confirming that this is a unique novel species. Copyright © 2017 Blasche et al.

  5. Genome Sequence of Gordonia Phage BetterKatz

    Berryman, Emily N.; Forrest, Kaitlyn M.; McHale, Lilliana; Wertz, Anthony T.; Zhuang, Zenas; Kasturiarachi, Naomi S.; Pressimone, Catherine A.; Schiebel, Johnathon G.; Furbee, Emily C.; Grubb, Sarah R.; Warner, Marcie H.; Montgomery, Matthew T.; Garlena, Rebecca A.; Russell, Daniel A.; Jacobs-Sera, Deborah; Hatfull, Graham F.

    2016-01-01

    BetterKatz is a bacteriophage isolated from a soil sample collected in Pittsburgh, Pennsylvania using the host Gordonia terrae 3612. BetterKatz’s genome is 50,636 bp long and contains 75 predicted protein-coding genes, 35 of which have been assigned putative functions. BetterKatz is not closely related to other sequenced Gordonia phages. PMID:27516497

  6. Complete Genome Sequences of Four Isolates of Plutella xylostella Granulovirus

    Spence, Robert J.; Noune, Christopher; Hauxwell, Caroline

    2016-01-01

    Granuloviruses are widespread pathogens of Plutella xylostella L. (diamondback moth) and potential biopesticides for control of this global insect pest. We report the complete genomes of four Plutella xylostella granulovirus isolates from China, Malaysia, and Taiwan exhibiting pairs of noncoding, homologous repeat regions with significant sequence variation but equivalent length.

  7. cDNA, genomic sequence cloning and overexpression of ribosomal ...

    RPS16 of eukaryote is a component of the 40S small ribosomal subunit encoded by RPS16 gene and is also a homolog of prokaryotic RPS9. The cDNA and genomic sequence of RPS16 was cloned successfully for the first time from the Giant Panda (Ailuropoda melanoleuca) using reverse transcription-polymerase chain ...

  8. Complete Genome Sequences of Four Isolates of Plutella xylostella Granulovirus.

    Spence, Robert J; Noune, Christopher; Hauxwell, Caroline

    2016-06-30

    Granuloviruses are widespread pathogens of Plutella xylostella L. (diamondback moth) and potential biopesticides for control of this global insect pest. We report the complete genomes of four Plutella xylostella granulovirus isolates from China, Malaysia, and Taiwan exhibiting pairs of noncoding, homologous repeat regions with significant sequence variation but equivalent length. Copyright © 2016 Spence et al.

  9. Complete Genome Sequence of Mycobacterium vaccae Type Strain ATCC 25954

    Ho, Y. S.; Adroub, S. A.; Abadi, Maram; Al Alwan, B.; Alkhateeb, R.; Gao, G.; Ragab, A.; Ali, Shahjahan; van Soolingen, D.; Bitter, W.; Pain, Arnab; Abdallah, A. M.

    2012-01-01

    Mycobacterium vaccae is a rapidly growing, nontuberculous Mycobacterium species that is generally not considered a human pathogen and is of major pharmaceutical interest as an immunotherapeutic agent. We report here the annotated genome sequence of the M. vaccae type strain, ATCC 25954.

  10. Complete Genome Sequence of Mycobacterium vaccae Type Strain ATCC 25954

    Ho, Y. S.

    2012-10-26

    Mycobacterium vaccae is a rapidly growing, nontuberculous Mycobacterium species that is generally not considered a human pathogen and is of major pharmaceutical interest as an immunotherapeutic agent. We report here the annotated genome sequence of the M. vaccae type strain, ATCC 25954.

  11. Templated sequence insertion polymorphisms in the human genome

    Onozawa, Masahiro; Aplan, Peter

    2016-11-01

    Templated Sequence Insertion Polymorphism (TSIP) is a recently described form of polymorphism recognized in the human genome, in which a sequence that is templated from a distant genomic region is inserted into the genome, seemingly at random. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; Class 1 TSIPs show features of insertions that are mediated via the LINE-1 ORF2 protein, including 1) target-site duplication (TSD), 2) polyadenylation 10-30 nucleotides downstream of a “cryptic” polyadenylation signal, and 3) preference for insertion at a 5’-TTTT/A-3’ sequence. In contrast, class 2 TSIPs show features consistent with repair of a DNA double-strand break via insertion of a DNA “patch” that is derived from a distant genomic region. Survey of a large number of normal human volunteers demonstrates that most individuals have 25-30 TSIPs, and that these TSIPs track with specific geographic regions. Similar to other forms of human polymorphism, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases.

  12. Draft Genome Sequence of Campylobacter jejuni 11168H

    Macdonald, Sarah E.; Gundogdu, Ozan; Dorrell, Nick; Wren, Brendan W.; Blake, Damer

    2017-01-01

    Campylobacter jejuni is the most prevalent cause of food-borne gastroenteritis in the developed world. The reference and original sequenced strain C. jejuni NCTC11168 has low levels of motility compared to clinical isolates. Here, we describe the draft genome of the laboratory-derived hypermotile variant named 11168H. PMID:28153902

  13. Genome Sequence of Novel Human Parechovirus Type 17

    Böttcher, Sindy; Obermeier, Patrick E.; Diedrich, Sabine; Kaboré, Yolande; D'Alfonso, Rossella; Pfister, Herbert; Kaiser, Rolf; Di Cristanziano, Veronica

    2017-01-01

    Human parechoviruses (HPeV) circulate worldwide, causing a broad variety of symptoms, preferentially in early childhood. We report here the nearly complete genome sequence of a novel HPeV type, consisting of 7,062 nucleotides and encoding 2,179 amino acids. M36/CI/2014 was taxonomically classified as HPeV-17 by the picornavirus study group.

  14. The complete mitochondrial genome sequence of Diaphorina citri (Hemiptera: Psyllidae)

    The first complete mitochondrial genome (mitogenome) sequence of Asian citrus psyllid, Diaphorina citri (Hemiptera: Psyllidae), from Guangzhou, China is presented. The circular mitogenome is 14,996 bp in length with an A+T content of 74.5%, and contains 13 protein-coding genes (PCGs), 22 tRNA genes ...
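    The A+T content quoted in this abstract is a simple base-composition statistic. A minimal sketch of how such a value could be computed from a sequence string follows; the function name and the choice to ignore ambiguous bases such as N are assumptions, not details from the record.

```python
def at_content(sequence: str) -> float:
    """Return A+T content as a percentage of unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_bases = sum(1 for base in counted if base in "AT")
    return 100.0 * at_bases / len(counted)


# Toy sequence for illustration only, not real mitogenome data.
print(f"{at_content('AATTGC'):.1f}% A+T")
```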

  15. Genome Sequences of 19 Novel Erwinia amylovora Bacteriophages.

    Esplin, Ian N D; Berg, Jordan A; Sharma, Ruchira; Allen, Robert C; Arens, Daniel K; Ashcroft, Cody R; Bairett, Shannon R; Beatty, Nolan J; Bickmore, Madeline; Bloomfield, Travis J; Brady, T Scott; Bybee, Rachel N; Carter, John L; Choi, Minsey C; Duncan, Steven; Fajardo, Christopher P; Foy, Brayden B; Fuhriman, David A; Gibby, Paul D; Grossarth, Savannah E; Harbaugh, Kala; Harris, Natalie; Hilton, Jared A; Hurst, Emily; Hyde, Jonathan R; Ingersoll, Kayleigh; Jacobson, Caitlin M; James, Brady D; Jarvis, Todd M; Jaen-Anieves, Daniella; Jensen, Garrett L; Knabe, Bradley K; Kruger, Jared L; Merrill, Bryan D; Pape, Jenny A; Payne Anderson, Ashley M; Payne, David E; Peck, Malia D; Pollock, Samuel V; Putnam, Micah J; Ransom, Ethan K; Ririe, Devin B; Robinson, David M; Rogers, Spencer L; Russell, Kerri A; Schoenhals, Jonathan E; Shurtleff, Christopher A; Simister, Austin R; Smith, Hunter G; Stephenson, Michael B; Staley, Lyndsay A; Stettler, Jason M; Stratton, Mallorie L; Tateoka, Olivia B; Tatlow, P J; Taylor, Alexander S; Thompson, Suzanne E; Townsend, Michelle H; Thurgood, Trever L; Usher, Brittian K; Whitley, Kiara V; Ward, Andrew T; Ward, Megan E H; Webb, Charles J; Wienclaw, Trevor M; Williamson, Taryn L; Wells, Michael J; Wright, Cole K; Breakwell, Donald P; Hope, Sandra; Grose, Julianne H

    2017-11-16

    Erwinia amylovora is the causal agent of fire blight, a devastating disease affecting some plants of the Rosaceae family. We isolated bacteriophages from samples collected from infected apple and pear trees along the Wasatch Front in Utah. We announce 19 high-quality complete genome sequences of E. amylovora bacteriophages. Copyright © 2017 Esplin et al.

  16. Draft Genome Sequence of Mycobacterium chimaera Type Strain Fl-0169

    We report the draft genome sequence of the type strain Mycobacterium chimaera Fl-0169T, a member of the Mycobacterium avium complex (MAC). M. chimaera Fl-0169T was isolated from a patient in Italy and is highly similar to strains of M. chimaera isolated in Ireland, though Fl-016...

  17. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  18. Phytophthora Genome Sequences Uncover Evolutionary Origins and Mechanisms of Pathogenesis

    Lamour, Kurt H [ORNL]; McDonald, W Hayes [ORNL]; Savidor, Alon [ORNL]

    2006-01-01

    Genome sequences of the soybean pathogen, Phytophthora sojae, and the sudden oak death pathogen, Phytophthora ramorum, suggest a photosynthetic past and reveal recent massive expansion and diversification of potential pathogenicity gene families. Abstract: Draft genome sequences of the soybean pathogen, Phytophthora sojae, and the sudden oak death pathogen, Phytophthora ramorum, have been determined. Oomycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors and, in particular, a superfamily of 700 proteins with similarity to known oomycete avirulence genes.

  19. Complete genome sequence of Oceanithermus profundus type strain (506T)

    Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Zhang, Xiaojing [Los Alamos National Laboratory (LANL)]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute]; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Hauser, Loren John [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Ruhl, Alina [U.S. Department of Energy, Joint Genome Institute]; Mwirichia, Romano [University of Munster, Germany]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Wirth, Reinhard [Universitat Regensburg, Regensburg, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Land, Miriam L [ORNL]

    2011-01-01

    Oceanithermus profundus Miroshnichenko et al. 2003 is the type species of the genus Oceanithermus, which belongs to the family Thermaceae. The genus currently comprises two species whose members are thermophilic and are able to reduce sulfur compounds and nitrite. The organism is adapted to the salinity of sea water, is able to utilize a broad range of carbohydrates, some proteinaceous substrates, organic acids and alcohols. This is the first completed genome sequence of a member of the genus Oceanithermus and the fourth sequence from the family Thermaceae. The 2,439,291 bp long genome with its 2,391 protein-coding and 54 RNA genes consists of one chromosome and a 135,351 bp long plasmid, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  20. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Kevin Chen

    2005-07-01

    The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  1. Complete genome sequence of Marivirga tractuosa type strain (H-43).

    Pagani, Ioanna; Chertkov, Olga; Lapidus, Alla; Lucas, Susan; Del Rio, Tijana Glavina; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Nolan, Matt; Saunders, Elizabeth; Pitluck, Sam; Held, Brittany; Goodwin, Lynne; Liolios, Konstantinos; Ovchinikova, Galina; Ivanova, Natalia; Mavromatis, Konstantinos; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Jeffries, Cynthia D; Detter, John C; Han, Cliff; Tapia, Roxanne; Ngatchou-Djao, Olivier D; Rohde, Manfred; Göker, Markus; Spring, Stefan; Sikorski, Johannes; Woyke, Tanja; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

    2011-04-29

    Marivirga tractuosa (Lewin 1969) Nedashkovskaya et al. 2010 is the type species of the genus Marivirga, which belongs to the family Flammeovirgaceae. Members of this genus are of interest because of their gliding motility. The species is of interest because representative strains show resistance to several antibiotics, including gentamicin, kanamycin, neomycin, polymyxin and streptomycin. This is the first complete genome sequence of a member of the family Flammeovirgaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 4,511,574 bp long chromosome and the 4,916 bp plasmid with their 3,808 protein-coding and 49 RNA genes are a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  2. Genome sequence of herpes simplex virus 1 strain KOS.

    Macdonald, Stuart J; Mostafa, Heba H; Morrison, Lynda A; Davido, David J

    2012-06-01

    Herpes simplex virus type 1 (HSV-1) strain KOS has been extensively used in many studies to examine HSV-1 replication, gene expression, and pathogenesis. Notably, strain KOS is known to be less pathogenic than the first sequenced genome of HSV-1, strain 17. To understand the genotypic differences between KOS and other phenotypically distinct strains of HSV-1, we sequenced the viral genome of strain KOS. When comparing strain KOS to strain 17, there are at least 1,024 single nucleotide polymorphisms (SNPs) and 172 insertions/deletions (indels). The polymorphisms observed in the KOS genome will likely provide insights into the genes, their protein products, and the cis elements that regulate the biology of this HSV-1 strain.
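    Counts of SNPs and indels such as those reported here are typically tallied from a pairwise alignment of the two genomes. The sketch below shows one way this could be done on two already-aligned strings; it is not the authors' pipeline. The function name is hypothetical, and two choices are assumptions: "-" marks a gap, and a run of consecutive gap columns in the same sequence is counted as a single indel event.

```python
def count_variants(ref_aln: str, qry_aln: str) -> tuple[int, int]:
    """Count (snps, indel_events) between two gapped, equal-length alignments."""
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have the same length")
    snps = 0
    indels = 0
    prev_gap = None  # which sequence carried the gap in the previous column
    for r, q in zip(ref_aln.upper(), qry_aln.upper()):
        if r == "-" and q == "-":
            continue  # column is empty in both sequences; ignore it
        gap = "ref" if r == "-" else "qry" if q == "-" else None
        if gap is not None:
            if gap != prev_gap:
                indels += 1  # a new gap run starts here
        elif r != q:
            snps += 1  # mismatch between two real bases
        prev_gap = gap
    return snps, indels


# Toy alignment for illustration only, not HSV-1 data.
snps, indels = count_variants("ACGT-ACGTA", "ACCTTAC--A")
print(f"SNPs: {snps}, indel events: {indels}")
```

    Counting gap runs rather than gap columns keeps a multi-base deletion from being reported as several separate indels.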

  3. Compartmentalization of HIV-1 within the female genital tract is due to monotypic and low-diversity variants not distinct viral populations.

    Bull, Marta; Learn, Gerald; Genowati, Indira; McKernan, Jennifer; Hitti, Jane; Lockhart, David; Tapia, Kenneth; Holte, Sarah; Dragavon, Joan; Coombs, Robert; Mullins, James; Frenkel, Lisa

    2009-09-22

    Compartmentalization of HIV-1 between the genital tract and blood was noted in half of 57 women included in 12 studies primarily using cell-free virus. To further understand differences between genital tract and blood viruses of women with chronic HIV-1 infection, cell-free and cell-associated virus populations were sequenced from these tissues, reasoning that integrated viral DNA includes variants archived from earlier in infection, and provides a greater array of genotypes for comparisons. Multiple sequences from single-genome-amplification of HIV-1 RNA and DNA from the genital tract and blood of each woman were compared in a cross-sectional study. Maximum likelihood phylogenies were evaluated for evidence of compartmentalization using four statistical tests. Genital tract and blood HIV-1 appears compartmentalized in 7/13 women by ≥2 statistical analyses. These subjects' phylograms were characterized by low diversity genital-specific viral clades interspersed between clades containing both genital and blood sequences. Many of the genital-specific clades contained monotypic HIV-1 sequences. In 2/7 women, HIV-1 populations were significantly compartmentalized across all four statistical tests; both had low diversity genital tract-only clades. Collapsing monotypic variants into a single sequence diminished the prevalence and extent of compartmentalization. Viral sequences did not demonstrate tissue-specific signature amino acid residues, differential immune selection, or co-receptor usage. In women with chronic HIV-1 infection multiple identical sequences suggest proliferation of HIV-1-infected cells, and low diversity tissue-specific phylogenetic clades are consistent with bursts of viral replication. These monotypic and tissue-specific viruses provide statistical support for compartmentalization of HIV-1 between the female genital tract and blood. However, the intermingling of these clades with clades comprised of both genital and blood sequences and the absence

  4. Draft genome sequence of the Algerian bee Apis mellifera intermissa

    Nizar Jamal Haddad

    2015-06-01

    Apis mellifera intermissa is the native honeybee subspecies of Algeria. A. m. intermissa occurs in Tunisia, Algeria and Morocco, between the Atlas and the Mediterranean and Atlantic coasts. This bee is very important due to its high ability to adapt to great variations in climatic conditions and due to its preferable cleaning behavior. Here we report the draft genome sequence of this honey bee; its Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JSUV00000000. The 240-Mb genome is being annotated and analyzed. Comparison with the genome of other Apis mellifera sub-species promises to yield insights into the evolution of adaptations to high temperature and resistance to Varroa parasite infestation.

  5. Complete genome sequence of 'Thermobaculum terrenum' type strain (YNP1).

    Kiss, Hajnalka; Cleland, David; Lapidus, Alla; Lucas, Susan; Del Rio, Tijana Glavina; Nolan, Matt; Tice, Hope; Han, Cliff; Goodwin, Lynne; Pitluck, Sam; Liolios, Konstantinos; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Lu, Megan; Brettin, Thomas; Detter, John C; Göker, Markus; Tindall, Brian J; Beck, Brian; McDermott, Timothy R; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Cheng, Jan-Fang

    2010-10-27

    'Thermobaculum terrenum' Botero et al. 2004 is the sole species within the proposed genus 'Thermobaculum'. Strain YNP1(T) is the only cultivated member of an acid tolerant, extremely thermophilic species belonging to a phylogenetically isolated environmental clone group within the phylum Chloroflexi. At present, the name 'Thermobaculum terrenum' is not yet validly published as it contravenes Rule 30 (3a) of the Bacteriological Code. The bacterium was isolated from a slightly acidic extreme thermal soil in Yellowstone National Park, Wyoming (USA). Depending on its final taxonomic allocation, this is likely to be the third completed genome sequence of a member of the class Thermomicrobia and the seventh type strain genome from the phylum Chloroflexi. The 3,101,581 bp long genome with its 2,872 protein-coding and 58 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  6. Supervised Learning for Detection of Duplicates in Genomic Sequence Databases.

    Qingyu Chen

    First identified as an issue in 1996, duplication in biological databases introduces redundancy and even leads to inconsistency when contradictory information appears. The amount of data makes purely manual de-duplication impractical, and existing automatic systems cannot detect duplicates as precisely as can experts. Supervised learning has the potential to address such problems by building automatic systems that learn from expert curation to detect duplicates precisely and efficiently. While machine learning is a mature approach in other duplicate detection contexts, it has seen only preliminary application in genomic sequence databases. We developed and evaluated a supervised duplicate detection method based on an expert curated dataset of duplicates, containing over one million pairs across five organisms derived from genomic sequence databases. We selected 22 features to represent distinct attributes of the database records, and developed a binary model and a multi-class model. Both models achieve promising performance; under cross-validation, the binary model had over 90% accuracy in each of the five organisms, while the multi-class model maintains high accuracy and is more robust in generalisation. We performed an ablation study to quantify the impact of different sequence record features, finding that features derived from meta-data, sequence identity, and alignment quality impact performance most strongly. The study demonstrates machine learning can be an effective additional tool for de-duplication of genomic sequence databases. All Data are available as described in the supplementary material.

  7. Supervised Learning for Detection of Duplicates in Genomic Sequence Databases.

    Chen, Qingyu; Zobel, Justin; Zhang, Xiuzhen; Verspoor, Karin

    2016-01-01

    First identified as an issue in 1996, duplication in biological databases introduces redundancy and even leads to inconsistency when contradictory information appears. The amount of data makes purely manual de-duplication impractical, and existing automatic systems cannot detect duplicates as precisely as can experts. Supervised learning has the potential to address such problems by building automatic systems that learn from expert curation to detect duplicates precisely and efficiently. While machine learning is a mature approach in other duplicate detection contexts, it has seen only preliminary application in genomic sequence databases. We developed and evaluated a supervised duplicate detection method based on an expert curated dataset of duplicates, containing over one million pairs across five organisms derived from genomic sequence databases. We selected 22 features to represent distinct attributes of the database records, and developed a binary model and a multi-class model. Both models achieve promising performance; under cross-validation, the binary model had over 90% accuracy in each of the five organisms, while the multi-class model maintains high accuracy and is more robust in generalisation. We performed an ablation study to quantify the impact of different sequence record features, finding that features derived from meta-data, sequence identity, and alignment quality impact performance most strongly. The study demonstrates machine learning can be an effective additional tool for de-duplication of genomic sequence databases. All Data are available as described in the supplementary material.

  8. Sequencing the CHO DXB11 genome reveals regional variations in genomic stability and haploidy

    Kaas, Christian Schrøder; Kristensen, Claus; Betenbaugh, Michael J.

    2015-01-01

    Background: The DHFR negative CHO DXB11 cell line (also known as DUX-B11 and DUKX) was historically the first CHO cell line to be used for large scale production of heterologous proteins and is still used for production of a number of complex proteins.  Results: Here we present the genomic sequence...... of the CHO DXB11 genome sequenced to a depth of 33x. Overall a significant genomic drift was seen favoring GC -> AT point mutations in line with the chemical mutagenesis strategy used for generation of the cell line. The sequencing depth for each gene in the genome revealed distinct peaks at sequencing...... in eight additional analyzed CHO genomes (15-20% haploidy) but not in the genome of the Chinese hamster. The dhfr gene is confirmed to be haploid in CHO DXB11; transcriptionally active and the remaining allele contains a G410C point mutation causing a Thr137Arg missense mutation. We find similar to 2...

  9. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    Black, PA

    2015-10-24

    Background: Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods: Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow-up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results: Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30% to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions: Our study demonstrated true levels of genetic diversity
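    The read frequency cut-off described in this abstract can be pictured as a simple filter over variant calls. The sketch below is an illustration only and does not reproduce the authors' in-house pipeline: the `VariantCall` class, its field names, and the decision to treat the 30% threshold as inclusive are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """Hypothetical record of one candidate variant at a genome position."""
    position: int
    alt_reads: int    # reads supporting the variant allele
    total_reads: int  # all reads covering the position

    @property
    def frequency(self) -> float:
        # Fraction of covering reads that support the variant.
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def filter_variants(calls, min_frequency=0.30):
    """Keep calls whose read frequency is at or above the cut-off."""
    return [call for call in calls if call.frequency >= min_frequency]


# Toy data for illustration only.
calls = [
    VariantCall(position=100, alt_reads=10, total_reads=100),
    VariantCall(position=200, alt_reads=30, total_reads=100),
    VariantCall(position=300, alt_reads=95, total_reads=100),
]
kept = filter_variants(calls)
print([call.position for call in kept])
```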

  10. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    Black, PA

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic
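    The read-frequency cut-off described in this abstract can be illustrated with a minimal sketch. It assumes that a variant is retained when its supporting-read fraction is at or above 30%; the abstract does not say whether the boundary is inclusive. The `VariantCall` record, its field names, and the example numbers are hypothetical and are not taken from the authors' in-house pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical record for one candidate variant (not the authors' format)."""
    position: int   # genome coordinate, illustrative only
    alt_reads: int  # reads supporting the variant allele
    depth: int      # total reads covering the position

    @property
    def read_frequency(self) -> float:
        # Fraction of reads supporting the variant; 0.0 when there is no coverage.
        return self.alt_reads / self.depth if self.depth else 0.0


def filter_by_read_frequency(calls, cutoff=0.30):
    """Keep calls whose read frequency is >= cutoff (inclusive bound is an assumption)."""
    return [c for c in calls if c.read_frequency >= cutoff]


# Made-up example calls: 12%, 45% and 30% variant read frequency.
calls = [
    VariantCall(position=1000, alt_reads=12, depth=100),
    VariantCall(position=2000, alt_reads=45, depth=100),
    VariantCall(position=3000, alt_reads=30, depth=100),
]
kept = filter_by_read_frequency(calls)
```

    With these made-up calls, only the second and third entries pass the filter; the first falls below the cut-off and would be treated as unreliable under this reading.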

  11. Complete Genome Sequence of EtG, the First Phage Sequenced from Erwinia tracheiphila.

    Andrade-Domínguez, Andrés; Kolter, Roberto; Shapiro, Lori R

    2018-02-22

    Erwinia tracheiphila is the causal agent of bacterial wilt of cucurbits. Here, we report the genome sequence of the temperate phage EtG, which was isolated from an E. tracheiphila -infected cucumber plant. Phage EtG has a linear 30,413-bp double-stranded DNA genome with cohesive ends and 45 predicted open reading frames. Copyright © 2018 Andrade-Domínguez et al.

  12. ReRep: Computational detection of repetitive sequences in genome survey sequences (GSS)

    Alves-Ferreira Marcelo

    2008-09-01

    Full Text Available Abstract Background Genome survey sequences (GSS) offer a preliminary global view of a genome since, unlike ESTs, they cover coding as well as non-coding DNA and include repetitive regions of the genome. A more precise estimation of the nature, quantity and variability of repetitive sequences very early in a genome sequencing project is of considerable importance, as such data strongly influence the estimation of genome coverage, library quality and progress in scaffold construction. Also, the elimination of repetitive sequences from the initial assembly process is important to avoid errors and unnecessary complexity. Repetitive sequences are also of interest in a variety of other studies, for instance as molecular markers. Results We designed and implemented a straightforward pipeline called ReRep, which combines bioinformatics tools for identifying repetitive structures in a GSS dataset. In a case study, we first applied the pipeline to a set of 970 GSSs, sequenced in our laboratory from the human pathogen Leishmania braziliensis, the causative agent of leishmaniosis, an important public health problem in Brazil. We also verified the applicability of ReRep to new sequencing technologies using a set of 454 reads from Escherichia coli. The behaviour of several parameters in the algorithm is evaluated and suggestions are made for tuning of the analysis. Conclusion The ReRep approach for identification of repetitive elements in GSS datasets proved to be straightforward and efficient. Several potential repetitive sequences were found in a L. braziliensis GSS dataset generated in our laboratory, and further validated by the analysis of a more complete genomic dataset from the EMBL and Sanger Centre databases. ReRep also identified most of the E. coli K12 repeats prior to assembly in an example dataset obtained by automated sequencing using 454 technology. The parameters controlling the algorithm behaved consistently and may be tuned to the properties

  13. Complete genome sequence of Actinosynnema mirum type strain (101T)

    Land, Miriam; Lapidus, Alla; Mayilraj, Shanmugam; Chen, Feng; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Chertkov, Olga; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Rohde, Manfred; Goker, Markus; Pati, Amrita; Ivanova, Natalia; Mavrommatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia; Brettin, Thomas; Detter, John C.; Han, Cliff; Chain, Patrick; Tindall, Brian; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2009-05-20

    Actinosynnema mirum Hasegawa et al. 1978 is the type species of the genus, and is of phylogenetic interest because of its central phylogenetic location in the Actinosynnemataceae, a rapidly growing family within the actinobacterial suborder Pseudonocardineae. A. mirum is characterized by its motile spores borne on synnemata and as a producer of nocardicin antibiotics. It is capable of growing aerobically and under a moderate CO2 atmosphere. The strain is a Gram-positive, aerial and substrate mycelium producing bacterium, originally isolated from a grass blade collected from the Raritan River, New Jersey. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of a member of the family Actinosynnemataceae, and only the second sequence from the actinobacterial suborder Pseudonocardineae. The 8,248,144 bp long single replicon genome with its 7100 protein-coding and 77 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  14. The complete mitochondrial genome sequence of Eimeria magna (Apicomplexa: Coccidia).

    Tian, Si-Qin; Cui, Ping; Fang, Su-Fang; Liu, Guo-Hua; Wang, Chun-Ren; Zhu, Xing-Quan

    2015-01-01

    In the present study, we determined the complete mitochondrial DNA (mtDNA) sequence of Eimeria magna from rabbits for the first time, and compared its gene content and genome organization with those of seven Eimeria spp. from domestic chickens. The complete mt genome sequence of E. magna is 6249 bp in size and consists of 3 protein-coding genes (cytb, cox1 and cox3), 12 gene fragments for the large subunit (LSU) rRNA, and 7 gene fragments for the small subunit (SSU) rRNA, without transfer RNA genes, in accordance with that of Eimeria spp. from chickens. The putative direction of translation for the three genes (cytb, cox1 and cox3) was the same as in Eimeria species from domestic chickens. The A + T content of the E. magna mt genome is 65.16% (29.73% A, 35.43% T, 17.09% G and 17.75% C). The E. magna mt genome sequence provides novel mtDNA markers for studying the molecular epidemiology and population genetics of Eimeria spp. and has implications for the molecular diagnosis and control of rabbit coccidiosis.
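    As a small consistency check on the base composition reported in this abstract, the sketch below recomputes the A + T content from the four percentages (reading the G value as 17.09%). The helper names are illustrative only, and the short test sequence is made up; it is not part of the E. magna genome.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percentage of A, C, G and T among the unambiguous bases of a sequence."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def at_content(composition: dict) -> float:
    """A + T percentage from a composition mapping."""
    return composition["A"] + composition["T"]


# Percentages as reported in the abstract.
reported = {"A": 29.73, "T": 35.43, "G": 17.09, "C": 17.75}
reported_at = at_content(reported)        # close to 65.16
reported_sum = sum(reported.values())     # close to 100

# Made-up toy sequence, used only to exercise base_composition().
toy_at = at_content(base_composition("AATTGC"))
```

    The four reported percentages sum to 100 and the A and T values add up to the stated 65.16%, so the figures in the abstract are internally consistent.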

  15. Insights into hominid evolution from the gorilla genome sequence

    Scally, Aylwyn; Dutheil, Julien Y.; Hillier, LaDeana W.; Jordan, Greg E.; Goodhead, Ian; Herrero, Javier; Hobolth, Asger; Lappalainen, Tuuli; Mailund, Thomas; Marques-Bonet, Tomas; McCarthy, Shane; Montgomery, Stephen H.; Schwalie, Petra C.; Tang, Y. Amy; Ward, Michelle C.; Xue, Yali; Yngvadottir, Bryndis; Alkan, Can; Andersen, Lars N.; Ayub, Qasim; Ball, Edward V.; Beal, Kathryn; Bradley, Brenda J.; Chen, Yuan; Clee, Chris M.; Fitzgerald, Stephen; Graves, Tina A.; Gu, Yong; Heath, Paul; Heger, Andreas; Karakoc, Emre; Kolb-Kokocinski, Anja; Laird, Gavin K.; Lunter, Gerton; Meader, Stephen; Mort, Matthew; Mullikin, James C.; Munch, Kasper; O’Connor, Timothy D.; Phillips, Andrew D.; Prado-Martinez, Javier; Rogers, Anthony S.; Sajjadian, Saba; Schmidt, Dominic; Shaw, Katy; Simpson, Jared T.; Stenson, Peter D.; Turner, Daniel J.; Vigilant, Linda; Vilella, Albert J.; Whitener, Weldon; Zhu, Baoli; Cooper, David N.; de Jong, Pieter; Dermitzakis, Emmanouil T.; Eichler, Evan E.; Flicek, Paul; Goldman, Nick; Mundy, Nicholas I.; Ning, Zemin; Odom, Duncan T.; Ponting, Chris P.; Quail, Michael A.; Ryder, Oliver A.; Searle, Stephen M.; Warren, Wesley C.; Wilson, Richard K.; Schierup, Mikkel H.; Rogers, Jane; Tyler-Smith, Chris; Durbin, Richard

    2012-01-01

    Summary Gorillas are humans’ closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago (Mya). In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution. PMID:22398555

  16. Next-Generation Sequencing and Genome Editing in Plant Virology

    Ahmed Hadidi

    2016-08-01

    Full Text Available Next-generation sequencing (NGS) has been applied to plant virology since 2009. NGS provides highly efficient, rapid, low cost DNA or RNA high-throughput sequencing of the genomes of plant viruses and viroids and of the specific small RNAs generated during the infection process. These small RNAs, which frequently cover the whole genome of the infectious agent, are 21-24 nt long and are known as vsRNAs for viruses and vd-sRNAs for viroids. NGS has been used in a number of studies in plant virology including, but not limited to, discovery of novel viruses and viroids as well as detection and identification of those pathogens already known, analysis of genome diversity and evolution, and study of pathogen epidemiology. The genome engineering editing method, the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system, has been successfully used recently to engineer resistance to DNA geminiviruses (family Geminiviridae) by targeting different viral genome sequences in infected Nicotiana benthamiana or Arabidopsis plants. The DNA viruses targeted include tomato yellow leaf curl virus and merremia mosaic virus (begomovirus); beet curly top virus and beet severe curly top virus (curtovirus); and bean yellow dwarf virus (mastrevirus). The technique has also been used against the RNA viruses zucchini yellow mosaic virus, papaya ringspot virus and turnip mosaic virus (potyvirus) and cucumber vein yellowing virus (ipomovirus, family Potyviridae) by targeting the translation initiation genes eIF4E in cucumber or Arabidopsis plants. From these recent advances of major importance, it is expected that NGS and CRISPR-Cas technologies will play a significant role in the very near future in advancing the field of plant virology and connecting it with other related fields of biology. Keywords: Next-generation sequencing, NGS, plant virology, plant viruses, viroids, resistance to plant viruses by CRISPR-Cas9

  17. Controversy and debate on clinical genomics sequencing-paper 2: clinical genome-wide sequencing: don't throw out the baby with the bathwater!

    Adam, Shelin; Friedman, Jan M

    2017-12-01

    Genome-wide (exome or whole genome) sequencing with appropriate genetic counseling should be considered for any patient with a suspected Mendelian disease that has not been identified by conventional testing. Clinical genome-wide sequencing provides a powerful and effective means of identifying specific genetic causes of serious disease and improving clinical care. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. DBR1 siRNA inhibition of HIV-1 replication

    Naidu Yathi

    2005-10-01

    Full Text Available Abstract Background HIV-1 and all retroviruses are related to retroelements of simpler organisms such as the yeast Ty elements. Recent work has suggested that the yeast retroelement Ty1 replicates via an unexpected RNA lariat intermediate in cDNA synthesis. The putative genomic RNA lariat intermediate is formed by a 2'-5' phosphodiester bond, like that found in pre-mRNA intron lariats, and it facilitates the minus-strand template switch during cDNA synthesis. We hypothesized that HIV-1 might also form a genomic RNA lariat and therefore that siRNA-mediated inhibition of expression of the human RNA lariat de-branching enzyme (DBR1) would specifically inhibit HIV-1 replication. Results We designed three short interfering RNA (siRNA) molecules targeting DBR1, which were capable of reducing DBR1 mRNA expression by 80% and did not significantly affect cell viability. We assessed HIV-1 replication in the presence of DBR1 siRNA and found that DBR1 knockdown led to decreases in viral cDNA and protein production. These effects could be reversed by cotransfection of a DBR1 cDNA, indicating that the inhibition of HIV-1 replication was a specific effect of DBR1 underexpression. Conclusion These data suggest that DBR1 function may be needed to debranch a putative HIV-1 genomic RNA lariat prior to completion of reverse transcription.

  19. Draft genome sequence of Acidithiobacillus ferrooxidans YQH-1

    Lei Yan

    2015-12-01

    Full Text Available Acidithiobacillus ferrooxidans YQH-1 is a moderate acidophilic bacterium isolated from a river in a volcano of Northeast China. Here, we describe the draft genome of strain YQH-1, which was assembled into 123 contigs containing 3,111,222 bp with a G + C content of 58.63%. A large number of genes related to carbon dioxide fixation, dinitrogen fixation, pH tolerance, heavy metal detoxification, and oxidative stress defense were detected. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LJBT00000000.

  20. The complete chloroplast genome sequence of Hibiscus syriacus.

    Kwon, Hae-Yun; Kim, Joon-Hyeok; Kim, Sea-Hyun; Park, Ji-Min; Lee, Hyoshin

    2016-09-01

    The complete chloroplast genome sequence of Hibiscus syriacus L. is presented in this study. The genome is 161 019 bp in length, with a typical circular structure containing a pair of inverted repeats of 25 745 bp each, separated by a large single-copy region of 89 698 bp and a small single-copy region of 19 831 bp. The overall GC content is 36.8%. One hundred and fourteen genes were annotated, including 81 protein-coding genes, 4 ribosomal RNA genes and 29 transfer RNA genes.
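    The region lengths quoted in this abstract can be cross-checked against the reported total. The sketch below assumes the usual quadripartite layout, in which the total length is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat; the function and variable names are illustrative only.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) as reported in the abstract.
total = quadripartite_length(lsc=89_698, ssc=19_831, ir=25_745)
matches_reported = total == 161_019
```

    The sum reproduces the reported genome size of 161 019 bp, so the four figures in the abstract agree with each other.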

  1. Phytophthora Genome Sequences Uncover Evolutionary Origins and Mechanisms of Pathogenesis

    Tyler, Brett M.; Tripathy, Sucheta; Zhang, Xuemin; Dehal, Paramvir; Jiang, Rays H. Y.; Aerts, Andrea; Arredondo, Felipe D.; Baxter, Laura; Bensasson, Douda; Beynon, JIm L.; Chapman, Jarrod; Damasceno, Cynthia M. B.; Dorrance, Anne E.; Dou, Daolong; Dickerman, Allan W.; Dubchak, Inna L.; Garbelotto, Matteo; Gijzen, Mark; Gordon, Stuart G.; Govers, Francine; Grunwald, NIklaus J.; Huang, Wayne; Ivors, Kelly L.; Jones, Richard W.; Kamoun, Sophien; Krampis, Konstantinos; Lamour, Kurt H.; Lee, Mi-Kyung; McDonald, W. Hayes; Medina, Monica; Meijer, Harold J. G.; Nordberg, Erik K.; Maclean, Donald J.; Ospina-Giraldo, Manuel D.; Morris, Paul F.; Phuntumart, Vipaporn; Putnam, Nicholas J.; Rash, Sam; Rose, Jocelyn K. C.; Sakihama, Yasuko; Salamov, Asaf A.; Savidor, Alon; Scheuring, Chantel F.; Smith, Brian M.; Sobral, Bruno W. S.; Terry, Astrid; Torto-Alalibo, Trudy A.; Win, Joe; Xu, Zhanyou; Zhang, Hongbin; Grigoriev, Igor V.; Rokhsar, Daniel S.; Boore, Jeffrey L.

    2006-04-17

    Draft genome sequences have been determined for the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum. Oömycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors, and, in particular, a superfamily of 700 proteins with similarity to known oömycete avirulence genes.

  2. The mitochondrial genome sequence of the Tasmanian tiger (Thylacinus cynocephalus)

    Miller, Webb; Drautz, Daniela I; Janecka, Jan E

    2009-01-01

    We report the first two complete mitochondrial genome sequences of the thylacine (Thylacinus cynocephalus), or so-called Tasmanian tiger, extinct since 1936. The thylacine's phylogenetic position within australidelphian marsupials has long been debated, and here we provide strong support for the thylacine's basal position in Dasyuromorphia, aided by mitochondrial genome sequence that we generated from the extant numbat (Myrmecobius fasciatus). Surprisingly, both of our thylacine sequences differ by 11%-15% from putative thylacine mitochondrial genes in GenBank, with one of our samples originating ... at a very low genetic diversity shortly before extinction. Despite the samples' heavy contamination with bacterial and human DNA and their temperate storage history, we estimate that as much as one-third of the total DNA in each sample is from the thylacine. The microbial content of the two thylacine ...

  3. Genetic Characterization of a Panel of Diverse HIV-1 Isolates at Seven International Sites.

    Bhavna Hora

    Full Text Available HIV-1 subtypes and drug resistance are routinely tested by many international surveillance groups. However, results from different sites often vary. A systematic comparison of results from multiple sites is needed to determine whether a standardized protocol is required for consistent and accurate data analysis. A panel of well-characterized HIV-1 isolates (N = 50) from the External Quality Assurance Program Oversight Laboratory (EQAPOL) was assembled for evaluation at seven international sites. This virus panel included seven subtypes, six circulating recombinant forms (CRFs), nine unique recombinant forms (URFs) and three group O viruses. Seven viruses contained 10 major drug resistance mutations (DRMs). HIV-1 isolates were prepared at a concentration of 10^7 copies/ml and compiled into blinded panels. Subtypes and DRMs were determined with partial or full pol gene sequences by conventional Sanger sequencing and/or Next Generation Sequencing (NGS). Subtype and DRM results were reported and decoded for comparison with full-length genome sequences generated by EQAPOL. The partial pol gene was amplified by RT-PCR and sequenced for 89.4%-100% of group M viruses at six sites. Subtyping results for the majority of the viruses (83%-97.9%) were correctly determined for the partial pol sequences. All 10 major DRMs in seven isolates were detected at these six sites. The complete pol gene sequence was also obtained by NGS at one site. However, this method missed six group M viruses and sequences contained host chromosome fragments. Three group O viruses were only characterized with additional group O-specific RT-PCR primers employed by one site. These results indicate that PCR protocols and subtyping tools should be standardized to efficiently amplify diverse viruses and more consistently assign virus genotypes, which is critical for accurate global subtype and drug resistance surveillance. Targeted NGS analysis of partial pol sequences can serve as an alternative

  4. Reference-quality genome sequence of Aegilops tauschii, the source of wheat D genome, shows that recombination shapes genome structure and evolution

    Aegilops tauschii is the diploid progenitor of the D genome of hexaploid wheat and an important genetic resource for wheat. A reference-quality sequence for the Ae. tauschii genome was produced with a combination of ordered-clone sequencing, whole-genome shotgun sequencing, and BioNano optical geno...

  5. The porcine circovirus type 1 capsid gene promoter improves antigen expression and immunogenicity in a HIV-1 plasmid vaccine

    Burger Marieta

    2011-02-01

    Full Text Available Abstract Background One of the promising avenues for development of vaccines against Human immunodeficiency virus type 1 (HIV-1) and other human pathogens is the use of plasmid-based DNA vaccines. However, relatively large doses of plasmid must be injected for a relatively weak response. We investigated whether genome elements from Porcine circovirus type 1 (PCV-1), an apathogenic small ssDNA-containing virus, had useful expression-enhancing properties that could allow dose-sparing in a plasmid vaccine. Results The linearised PCV-1 genome inserted 5' of the CMV promoter in the well-characterised HIV-1 plasmid vaccine pTHgrttnC increased expression of the polyantigen up to 2-fold, and elicited 3-fold higher CTL responses in mice at 10-fold lower doses than unmodified pTHgrttnC. The PCV-1 capsid gene promoter (Pcap) alone was equally effective. Enhancing activity was traced to a putative composite host transcription factor binding site and a "Conserved Late Element" transcription-enhancing sequence previously unidentified in circoviruses. Conclusions We identified a novel PCV-1 genome-derived enhancer sequence that significantly increased antigen expression from plasmids in in vitro assays, and improved immunogenicity in mice of the HIV-1 subtype C vaccine plasmid, pTHgrttnC. This should allow significant dose sparing of, or increased responses to, this and other plasmid-based vaccines. We also report investigations of the potential of other circovirus-derived sequences to be similarly used.

  6. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  7. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Jianmin Fu

    Full Text Available Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  8. Phylogenetic analysis of HIV-1 pol gene: first subgenomic evidence of CRF29-BF among Iranian HIV-1 patients

    Kazem Baesi

    2014-09-01

    Full Text Available Objective: To identify the dominant subtype among the HIV-1 strains circulating in Iran. Methods: In this cross-sectional study, 100 HIV-positive patients participated. HIV-1 RNA was extracted from plasma. RT nested-PCR was performed and the final products were sequenced and phylogenetically analyzed; reference sequences were downloaded from Los Alamos, aligned with the Iranian pol sequences from the study and analyzed by the neighbor-joining method. Results: The phylogenetic analysis showed that HIV-1 subtype CRF-35AD was the dominant subtype among HIV-1 infected patients in Iran; this analysis also suggested a new circulating recombinant form that had not previously been identified in Iran: CRF-29BF. Conclusions: The impact of HIV diversity on pathogenesis, transmission and clinical management has been discussed in different studies; therefore, analysis of HIV genetic diversity is required to design effective antiretroviral strategies for different HIV subtypes.

  9. HIV-1 phylogenetic analysis shows HIV-1 transits through the meninges to brain and peripheral tissues.

    Lamers, Susanna L; Gray, Rebecca R; Salemi, Marco; Huysentruyt, Leanne C; McGrath, Michael S

    2011-01-01

    Brain infection by the human immunodeficiency virus type 1 (HIV-1) has been investigated in many reports with a variety of conclusions concerning the time of entry and degree of viral compartmentalization. To address these diverse findings, we sequenced HIV-1 gp120 clones from a wide range of brain, peripheral and meningeal tissues from five patients who died from several HIV-1 associated disease pathologies. High-resolution phylogenetic analysis confirmed previous studies that showed a significant degree of compartmentalization in brain and peripheral tissue subpopulations. Some intermixing between the HIV-1 subpopulations was evident, especially in patients that died from pathologies other than HIV-associated dementia. Interestingly, the major tissue harboring virus from both the brain and peripheral tissues was the meninges. These results show that (1) HIV-1 is clearly capable of migrating out of the brain, (2) the meninges are the most likely primary transport tissues, and (3) infected brain macrophages comprise an important HIV reservoir during highly active antiretroviral therapy. Copyright © 2010 Elsevier B.V. All rights reserved.

  10. Human genetics and genomics a decade after the release of the draft sequence of the human genome

    2011-01-01

    Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade. PMID:22155605

  11. HIV-1 molecular epidemiology among newly diagnosed HIV-1 individuals in Hebei, a low HIV prevalence province in China.

    Xinli Lu

    Full Text Available New human immunodeficiency virus type 1 (HIV-1) diagnoses are increasing rapidly in Hebei. This study presents the most extensive HIV-1 molecular epidemiology investigation in Hebei province in China thus far. We have carried out the most extensive systematic cross-sectional study based on newly diagnosed HIV-1 positive individuals in 2013, and characterized the molecular epidemiology of HIV-1 based on full length gag-partial pol gene sequences in the whole of Hebei. Nine HIV-1 genotypes based on full length gag-partial pol gene sequence were identified among 610 newly diagnosed naïve individuals. The four main genotypes were circulating recombinant form (CRF)01_AE (53.4%), CRF07_BC (23.4%), subtype B (15.9%), and unique recombinant forms (URFs; 4.9%). Within 1 year, three new genotypes (subtype A1, CRF55_01B, CRF65_cpx), unknown before in Hebei, were first found among men who have sex with men (MSM). All nine genotypes were identified in the sexually contracted HIV-1 population. Among 30 URFs, six recombinant patterns were revealed, including CRF01_AE/BC (40.0%), CRF01_AE/B (23.3%), B/C (16.7%), CRF01_AE/C (13.3%), CRF01_AE/B/A2 (3.3%) and CRF01_AE/BC/A2 (3.3%), plus two potential CRFs. This study elucidated the complicated characteristics of HIV-1 molecular epidemiology in a low HIV-1 prevalence northern province of China and revealed the high level of HIV-1 genetic diversity. All nine HIV-1 genotypes circulating in Hebei have spread out of their initial risk groups into the general population through sexual contact, especially through MSM. This highlights the urgency of HIV prevention and control in China.

  12. HIV-1 molecular epidemiology among newly diagnosed HIV-1 individuals in Hebei, a low HIV prevalence province in China.

    Lu, Xinli; Kang, Xianjiang; Liu, Yongjian; Cui, Ze; Guo, Wei; Zhao, Cuiying; Li, Yan; Chen, Suliang; Li, Jingyun; Zhang, Yuqi; Zhao, Hongru

    2017-01-01

    New human immunodeficiency virus type 1 (HIV-1) diagnoses are increasing rapidly in Hebei. This study presents the most extensive HIV-1 molecular epidemiology investigation in Hebei province in China thus far. We have carried out the most extensive systematic cross-sectional study based on newly diagnosed HIV-1 positive individuals in 2013, and characterized the molecular epidemiology of HIV-1 based on full length gag-partial pol gene sequences in the whole of Hebei. Nine HIV-1 genotypes based on full length gag-partial pol gene sequence were identified among 610 newly diagnosed naïve individuals. The four main genotypes were circulating recombinant form (CRF)01_AE (53.4%), CRF07_BC (23.4%), subtype B (15.9%), and unique recombinant forms (URFs; 4.9%). Within 1 year, three new genotypes (subtype A1, CRF55_01B, CRF65_cpx), unknown before in Hebei, were first found among men who have sex with men (MSM). All nine genotypes were identified in the sexually contracted HIV-1 population. Among 30 URFs, six recombinant patterns were revealed, including CRF01_AE/BC (40.0%), CRF01_AE/B (23.3%), B/C (16.7%), CRF01_AE/C (13.3%), CRF01_AE/B/A2 (3.3%) and CRF01_AE/BC/A2 (3.3%), plus two potential CRFs. This study elucidated the complicated characteristics of HIV-1 molecular epidemiology in a low HIV-1 prevalence northern province of China and revealed the high level of HIV-1 genetic diversity. All nine HIV-1 genotypes circulating in Hebei have spread out of their initial risk groups into the general population through sexual contact, especially through MSM. This highlights the urgency of HIV prevention and control in China.
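
    As a rough consistency check on the figures in this abstract, the sketch below converts the reported URF pattern percentages back into whole counts. It assumes the percentages are rounded shares of the 30 URFs; the variable and function names are illustrative and do not come from the study.

```python
# Figures reported in the abstract: 610 newly diagnosed individuals, 30 URFs.
TOTAL_INDIVIDUALS = 610
TOTAL_URFS = 30

# Recombinant patterns among the URFs, as percentages of the 30 URFs.
urf_pattern_pct = {
    "CRF01_AE/BC": 40.0,
    "CRF01_AE/B": 23.3,
    "B/C": 16.7,
    "CRF01_AE/C": 13.3,
    "CRF01_AE/B/A2": 3.3,
    "CRF01_AE/BC/A2": 3.3,
}


def pct_to_count(pct, total):
    """Convert a rounded percentage back to the nearest whole count."""
    return round(pct / 100 * total)


# Assumption: each percentage is a rounded share of TOTAL_URFS.
urf_counts = {name: pct_to_count(pct, TOTAL_URFS)
              for name, pct in urf_pattern_pct.items()}

# Share of URFs among all newly diagnosed individuals, in percent.
urf_share_pct = round(TOTAL_URFS / TOTAL_INDIVIDUALS * 100, 1)

if __name__ == "__main__":
    for name, count in urf_counts.items():
        print(f"{name}: {count}")
    print("URF share of all individuals (%):", urf_share_pct)
```

    Under these assumptions the recovered counts add up to the 30 URFs, and 30 of 610 rounds to the reported 4.9%.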

  13. Genome Sequence of the Probiotic Strain Lactobacillus rhamnosus (Formerly Lactobacillus casei) LOCK900

    Aleksandrzak-Piekarczyk, Tamara; Koryszewska-Bagińska, Anna; Bardowski, Jacek

    2013-01-01

    Lactobacillus rhamnosus LOCK900 fulfills the criteria required for probiotic strains. In this study, we report a whole-genome sequence of this isolate and compare it with other L. rhamnosus complete genome sequences already published.

  14. Development of an epitope-based HIV-1 vaccine strategy from HIV-1 lipopeptide to dendritic-based vaccines.

    Surenaud, Mathieu; Lacabaratz, Christine; Zurawski, Gérard; Lévy, Yves; Lelièvre, Jean-Daniel

    2017-10-01

    Development of a safe, effective and globally affordable Human Immunodeficiency Virus strain 1 (HIV-1) vaccine offers the best hope for future control of the HIV-1 pandemic. However, with the exception of the recent RV144 trial, which elicited a modest level of protection against infection, no vaccine candidate has shown efficacy in preventing HIV-1 infection or in controlling virus replication in humans. There is also a great need for a successful immunotherapeutic vaccine since combination antiretroviral therapy (cART) does not eliminate the reservoir of HIV-infected cells. But to date, no vaccine candidate has proven to significantly alter the natural history of an individual with HIV-1 infection. Areas covered: For over 25 years, the ANRS (France Recherche Nord&Sud Sida-HIV hépatites) has been committed to an original program combining basic science and clinical research developing an epitope-based vaccine strategy to induce a multiepitopic cellular response against HIV-1. This review describes the evolution of concepts, based on strategies using HIV-1 lipopeptides towards the use of dendritic cell (DC) manipulation. Expert commentary: Understanding the crucial role of DCs in immune responses allowed moving from the non-specific administration of HIV-1 sequences with lipopeptides to DC-based vaccines. These DC-targeting strategies should improve HIV-1 vaccine efficacy.

  15. Genome sequence and description of Anaerosalibacter massiliensis sp. nov.

    N. Dione

    2016-03-01

    Anaerosalibacter massiliensis sp. nov. strain ND1T (= CSUR P762 = DSM 27308) is the type strain of A. massiliensis sp. nov., a new species within the genus Anaerosalibacter. This strain, the genome of which is described here, was isolated from the faecal flora of a 49-year-old healthy Brazilian man. Anaerosalibacter massiliensis is a Gram-positive, obligate anaerobic rod and member of the family Clostridiaceae. With the complete genome sequence and annotation, we describe here the features of this organism. The 3 197 911 bp long genome (one chromosome but no plasmid) contains 3271 protein-coding and 62 RNA genes, including six rRNA genes.

  16. The complete chloroplast genome sequence of Dendrobium nobile.

    Yan, Wenjin; Niu, Zhitao; Zhu, Shuying; Ye, Meirong; Ding, Xiaoyu

    2016-11-01

    The complete chloroplast (cp) genome sequence of Dendrobium nobile, an endangered plant of important economic value used in traditional Chinese medicine, is presented in this article. The total genome size is 150,793 bp, containing a large single copy (LSC) region (84,939 bp) and a small single copy (SSC) region (13,310 bp), which are separated by two inverted repeat (IR) regions (26,272 bp each). The overall GC content of the plastid genome is 38.8%. In total, 130 unique genes were annotated, consisting of 76 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Fourteen genes contained one or two introns.
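
    The region sizes in this abstract can be cross-checked against the reported total with a short calculation. The sketch below assumes the usual quadripartite plastome layout (LSC + SSC + two IR copies) and that the 26,272 bp figure refers to a single IR copy; the function name is illustrative.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total plastome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Region sizes reported in the abstract, in bp.
LSC_BP = 84_939
SSC_BP = 13_310
IR_BP = 26_272          # assumed to be the length of one IR copy
REPORTED_TOTAL_BP = 150_793

total_bp = quadripartite_length(LSC_BP, SSC_BP, IR_BP)

if __name__ == "__main__":
    print("computed total (bp):", total_bp)
    print("matches reported total:", total_bp == REPORTED_TOTAL_BP)
```

    With these inputs the computed total equals the reported 150,793 bp, which supports reading the IR figure as a per-copy length.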

  17. Targeted sequencing of large genomic regions with CATCH-Seq.

    Kenneth Day

    Current target enrichment systems for large-scale next-generation sequencing typically require synthetic oligonucleotides used as capture reagents to isolate sequences of interest. The majority of target enrichment reagents are focused on gene coding regions or promoters en masse. Here we introduce development of a customizable targeted capture system using biotinylated RNA probe baits transcribed from sheared bacterial artificial chromosome clone templates that enables capture of large, contiguous blocks of the genome for sequencing applications. This clone adapted template capture hybridization sequencing (CATCH-Seq) procedure can be used to capture both coding and non-coding regions of a gene, and resolve the boundaries of copy number variations within a genomic target site. Furthermore, libraries constructed with methylated adapters prior to solution hybridization also enable targeted bisulfite sequencing. We applied CATCH-Seq to diverse targets ranging in size from 125 kb to 3.5 Mb. Our approach provides a simple and cost-effective alternative to other capture platforms because of template-based, enzymatic probe synthesis and the lack of oligonucleotide design costs. Given its similarity in procedure, CATCH-Seq can also be performed in parallel with commercial systems.

  18. Construction of an integrated database to support genomic sequence analysis

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  19. Combined evidence annotation of transposable elements in genome sequences.

    Hadi Quesneville

    2005-07-01

    Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from combining multiple independent sources of computational evidence. To elevate the quality of TE annotations to a level comparable to that of gene models, we have developed a combined evidence-model TE annotation pipeline, analogous to systems used for gene annotation, by integrating results from multiple homology-based and de novo TE identification methods. As proof of principle, we have annotated "TE models" in Drosophila melanogaster Release 4 genomic sequences using the combined computational evidence derived from RepeatMasker, BLASTER, TBLASTX, all-by-all BLASTN, RECON, TE-HMM and the previous Release 3.1 annotation. Our system is designed for use with the Apollo genome annotation tool, allowing automatic results to be curated manually to produce reliable annotations. The euchromatic TE fraction of D. melanogaster is now estimated at 5.3% (cf. 3.86% in Release 3.1), and we found a substantially higher number of TEs (n = 6,013) than previously identified (n = 1,572). Most of the new TEs derive from small fragments of a few hundred nucleotides long and highly abundant families not previously annotated (e.g., INE-1). We also estimated that 518 TE copies (8.6%) are inserted into at least one other TE, forming a nest of elements. The pipeline allows rapid and thorough annotation of even the most complex TE models, including highly deleted and/or nested elements such as those often found in heterochromatic sequences. Our pipeline can be easily adapted to other genome sequences, such as those of the D. melanogaster heterochromatin or other

  20. The complete chloroplast genome sequence of Euonymus japonicus (Celastraceae).

    Choi, Kyoung Su; Park, SeonJoo

    2016-09-01

    The complete chloroplast (cp) genome sequence of Euonymus japonicus, the first to be sequenced in the genus Euonymus, is reported in this study. The total length was 157 637 bp, containing a pair of inverted repeat (IR) regions of 26 678 bp each, which were separated by a small single copy (SSC) region and a large single copy (LSC) region of 18 340 bp and 85 941 bp, respectively. This genome contains 107 unique genes, including 74 coding genes, four rRNA genes, and 29 tRNA genes. Seventeen genes of E. japonicus contain introns, of which three genes (clpP, ycf3, and rps12) include two introns. The maximum likelihood (ML) phylogenetic analysis revealed that E. japonicus was closely related to Manihot and Populus.

  1. Hyperthermia stimulates HIV-1 replication.

    Ferdinand Roesch

    HIV-infected individuals may experience fever episodes. Fever is an elevation of the body temperature accompanied by inflammation. It is usually beneficial for the host through enhancement of immunological defenses. In cultures, transient non-physiological heat shock (42-45°C) and Heat Shock Proteins (HSPs) modulate HIV-1 replication, through poorly defined mechanisms. The effect of physiological hyperthermia (38-40°C) on HIV-1 infection has not been extensively investigated. Here, we show that culturing primary CD4+ T lymphocytes and cell lines at a fever-like temperature (39.5°C) increased the efficiency of HIV-1 replication by 2 to 7 fold. Hyperthermia did not facilitate viral entry nor reverse transcription, but increased Tat transactivation of the LTR viral promoter. Hyperthermia also boosted HIV-1 reactivation in a model of latently-infected cells. By imaging HIV-1 transcription, we further show that Hsp90 co-localized with actively transcribing provirus, and this phenomenon was enhanced at 39.5°C. The Hsp90 inhibitor 17-AAG abrogated the increase of HIV-1 replication in hyperthermic cells. Altogether, our results indicate that fever may directly stimulate HIV-1 replication, in a process involving Hsp90 and facilitation of Tat-mediated LTR activity.

  2. Single-Genome Sequencing of Hepatitis C Virus in Donor-Recipient Pairs Distinguishes Modes and Models of Virus Transmission and Early Diversification.

    Li, Hui; Stoddard, Mark B; Wang, Shuyi; Giorgi, Elena E; Blair, Lily M; Learn, Gerald H; Hahn, Beatrice H; Alter, Harvey J; Busch, Michael P; Fierer, Daniel S; Ribeiro, Ruy M; Perelson, Alan S; Bhattacharya, Tanmoy; Shaw, George M

    2016-01-01

    Despite the recent development of highly effective anti-hepatitis C virus (HCV) drugs, the global burden of this pathogen remains immense. Control or eradication of HCV will likely require the broad application of antiviral drugs and development of an effective vaccine. A precise molecular identification of transmitted/founder (T/F) HCV genomes that lead to productive clinical infection could play a critical role in vaccine research, as it has for HIV-1. However, the replication schema of these two RNA viruses differ substantially, as do viral responses to innate and adaptive host defenses. These differences raise questions as to the certainty of T/F HCV genome inferences, particularly in cases where multiple closely related sequence lineages have been observed. To clarify these issues and distinguish between competing models of early HCV diversification, we examined seven cases of acute HCV infection in humans and chimpanzees, including three examples of virus transmission between linked donors and recipients. Using single-genome sequencing (SGS) of plasma vRNA, we found that inferred T/F sequences in recipients were identical to viral sequences in their respective donors. Early in infection, HCV genomes generally evolved according to a simple model of random evolution where the coalescent corresponded to the T/F sequence. Closely related sequence lineages could be explained by high multiplicity infection from a donor whose viral sequences had undergone a pretransmission bottleneck due to treatment, immune selection, or recent infection. These findings validate SGS, together with mathematical modeling and phylogenetic analysis, as a novel strategy to infer T/F HCV genome sequences. Despite the recent development of highly effective, interferon-sparing anti-hepatitis C virus (HCV) drugs, the global burden of this pathogen remains immense. 
Control or eradication of HCV will likely require the broad application of antiviral drugs and the development of an effective

  3. Genome sequence of carboxylesterase, carboxylase and xylose isomerase producing alkaliphilic haloarchaeon Haloterrigena turkmenica WANU15

    Samy Selim

    2016-03-01

    We report the draft genome sequence of Haloterrigena turkmenica strain WANU15, isolated from Soda Lake. The draft genome size is 2,950,899 bp with a G + C content of 64% and contains 49 RNA sequences. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LKCV00000000. Keywords: Soda Lake, Haloterrigena turkmenica, Carboxylesterase, Carboxylase, Xylose isomerase, Whole genome sequencing
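
    For readers unfamiliar with the G + C statistic quoted in this record, the following minimal sketch shows how such a value is conventionally computed from a nucleotide sequence. The toy sequence is hypothetical and unrelated to strain WANU15, and the function is an illustration rather than the tool used by the authors.

```python
def gc_content(seq):
    """Return the G+C fraction of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T, U) are counted, so symbols
    such as N do not affect the result.
    """
    counted = [base for base in seq.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


if __name__ == "__main__":
    toy_sequence = "GGCCAT"  # hypothetical example, not real genome data
    print(f"G+C content: {gc_content(toy_sequence):.1%}")
```

    Applied to a full assembly, the same ratio is what a reported figure such as 64% summarizes.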

  4. Enrichment of intersubtype HIV-1 recombinants in a dual infection system using HIV-1 strain-specific siRNAs

    2011-01-01

    Background: Intersubtype HIV-1 recombinants in the form of unique or stable circulating recombinant forms (CRFs) are responsible for over 20% of infections in the worldwide epidemic. Mechanisms controlling the generation, selection, and transmission of these intersubtype HIV-1 recombinants still require further investigation. All intersubtype HIV-1 recombinants are generated and evolve from initial dual infections, but are difficult to identify in the human population. In vitro studies provide the most practical system to study mechanisms, but the recombination rates are usually very low in dual infections with primary HIV-1 isolates. This study describes the use of HIV-1 isolate-specific siRNAs to enrich intersubtype HIV-1 recombinants and inhibit the parental HIV-1 isolates from a dual infection. Results: Following a dual infection with subtype A and D primary HIV-1 isolates and two rounds of siRNA treatment, nearly 100% of replicative virus was resistant to a siRNA specific for an upstream target sequence in the subtype A envelope (env) gene as well as a siRNA specific for a downstream target sequence in the subtype D env gene. Only 20% (10/50) of the replicating virus had nucleotide substitutions in the siRNA-target sequence whereas the remaining 78% (39/50) harbored a recombination breakpoint that removed both siRNA target sequences, and rendered the intersubtype D/A recombinant virus resistant to the dual siRNA treatment. Since siRNAs target the newly transcribed HIV-1 mRNA, the siRNAs only enrich intersubtype env recombinants and do not influence the recombination process during reverse transcription. Using this system, a strong bias is selected for recombination breakpoints in the C2 region, whereas other HIV-1 env regions, most notably the hypervariable regions, were nearly devoid of intersubtype recombination breakpoints. 
Sequence conservation plays an important role in selecting for recombination breakpoints, but the lack of breakpoints in many conserved

  5. Functional annotation from the genome sequence of the giant panda

    Huo, Tong; Zhang, Yinjie; Lin, Jianping

    2012-01-01

    The giant panda is one of the most critically endangered species due to the fragmentation and loss of its habitat. Studying the functions of proteins in this animal, especially specific trait-related proteins, is therefore necessary to protect the species. In this work, the functions of these proteins were investigated using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database, in which the proteins were divided in...

  6. Complete Genome Sequence of Mycobacterium xenopi Type Strain RIVM700367

    Abdallah, A. M.; Rashid, M.; Adroub, S. A.; Elabdalaoui, H.; Ali, Shahjahan; van Soolingen, D.; Bitter, W.; Pain, Arnab

    2012-01-01

    Mycobacterium xenopi is a slow-growing, thermophilic, water-related Mycobacterium species. Like other nontuberculous mycobacteria, M. xenopi more commonly infects humans with altered immune function, such as chronic obstructive pulmonary disease patients. It is considered clinically relevant in a significant proportion of the patients from whom it is isolated. We report here the whole genome sequence of M. xenopi type strain RIVM700367.

  7. Mitochondrial genome sequence of the Tibetan wild ass (Equus kiang).

    Luo, Yongjun; Chen, Yu; Liu, Fuyu; Jiang, Chunhua; Gao, Yuqi

    2011-02-01

    The Tibetan wild ass, or kiang (Equus kiang) is endemic to the cold and hypoxic (4000-7000 m above sea level) climates of the montane and alpine grasslands of the Tibetan Plateau. We report here the complete nucleotide sequence of the E. kiang mitochondrial genome. Our results show that E. kiang mitochondrial DNA is 16,634 bp long, and predicted to encode all the 37 genes that are typical for vertebrates.

  8. Complete Genome Sequence of Mycobacterium xenopi Type Strain RIVM700367

    Abdallah, A. M.

    2012-05-24

    Mycobacterium xenopi is a slow-growing, thermophilic, water-related Mycobacterium species. Like other nontuberculous mycobacteria, M. xenopi more commonly infects humans with altered immune function, such as chronic obstructive pulmonary disease patients. It is considered clinically relevant in a significant proportion of the patients from whom it is isolated. We report here the whole genome sequence of M. xenopi type strain RIVM700367.

  9. A Genome Sequencing Program for Novel Undiagnosed Diseases

    Bloss, Cinnamon S.; Scott-Van Zeeland, Ashley A.; Topol, Sarah E.; Darst, Burcu F.; Boeldt, Debra L.; Erikson, Galina A.; Bethel, Kelly J.; Bjork, Robert L.; Friedman, Jennifer R.; Hwynn, Nelson; Patay, Bradley A.; Pockros, Paul J.; Scott, Erick R.; Simon, Ronald A.; Williams, Gary W.

    2015-01-01

    Purpose: The Scripps Idiopathic Diseases of huMan (IDIOM) study aims to discover novel gene-disease relationships and provide molecular genetic diagnosis and treatment guidance for individuals with novel diseases using genome sequencing integrated with clinical assessment and multidisciplinary case review. Methods: Here we describe the IDIOM study operational protocol and initial results. Results: 121 cases underwent first tier review by the principal investigators to determine if the primary in...

  10. Whole Genome Sequencing of a Healthy Aging Cohort

    Erikson, Galina A.; Bodian, Dale L.; Rueda, Manuel; Molparia, Bhuvan; Scott, Erick R.; Scott-Van Zeeland, Ashley A.; Topol, Sarah E.; Wineinger, Nathan E.; Niederhuber, John E.; Topol, Eric J.; Torkamani, Ali

    2016-01-01

    Studies of long-lived individuals have revealed few genetic mechanisms for protection against age-associated disease. Therefore, we pursued genome sequencing of a related phenotype – healthy aging – to understand the genetics of disease-free aging without medical intervention. In contrast with studies of exceptional longevity, usually focused on centenarians, healthy aging is not associated with known longevity variants but is associated with reduced genetic susceptibility to Alzheimer and co...

  11. Complete genome sequence of Marivirga tractuosa type strain (H-43).

    Pagani, Ioanna; Chertkov, Olga; Lapidus, Alla; Lucas, Susan; Del Rio, Tijana Glavina; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Nolan, Matt; Saunders, Elizabeth; Pitluck, Sam; Held, Brittany; Goodwin, Lynne; Liolios, Konstantinos; Ovchinikova, Galina

    2011-01-01

    Marivirga tractuosa (Lewin 1969) Nedashkovskaya et al. 2010 is the type species of the genus Marivirga, which belongs to the family Flammeovirgaceae. Members of this genus are of interest because of their gliding motility. The species is of interest because representative strains show resistance to several antibiotics, including gentamicin, kanamycin, neomycin, polymyxin and streptomycin. This is the first complete genome sequence of a member of the family Flammeovirgaceae. Here we describe t...

  12. Genome sequence of Yersinia pestis, the causative agent of plague.

    Parkhill, J; Wren, B W; Thomson, N R; Titball, R W; Holden, M T; Prentice, M B; Sebaihia, M; James, K D; Churcher, C; Mungall, K L; Baker, S; Basham, D; Bentley, S D; Brooks, K; Cerdeño-Tárraga, A M; Chillingworth, T; Cronin, A; Davies, R M; Davis, P; Dougan, G; Feltwell, T; Hamlin, N; Holroyd, S; Jagels, K; Karlyshev, A V; Leather, S; Moule, S; Oyston, P C; Quail, M; Rutherford, K; Simmonds, M; Skelton, J; Stevens, K; Whitehead, S; Barrell, B G

    2001-10-04

    The Gram-negative bacterium Yersinia pestis is the causative agent of the systemic invasive infectious disease classically referred to as plague, and has been responsible for three human pandemics: the Justinian plague (sixth to eighth centuries), the Black Death (fourteenth to nineteenth centuries) and modern plague (nineteenth century to the present day). The recent identification of strains resistant to multiple drugs and the potential use of Y. pestis as an agent of biological warfare mean that plague still poses a threat to human health. Here we report the complete genome sequence of Y. pestis strain CO92, consisting of a 4.65-megabase (Mb) chromosome and three plasmids of 96.2 kilobases (kb), 70.3 kb and 9.6 kb. The genome is unusually rich in insertion sequences and displays anomalies in GC base-composition bias, indicating frequent intragenomic recombination. Many genes seem to have been acquired from other bacteria and viruses (including adhesins, secretion systems and insecticidal toxins). The genome contains around 150 pseudogenes, many of which are remnants of a redundant enteropathogenic lifestyle. The evidence of ongoing genome fluidity, expansion and decay suggests Y. pestis is a pathogen that has undergone large-scale genetic flux and provides a unique insight into the ways in which new and highly virulent pathogens evolve.

  13. Complete genome sequence of Halanaerobium praevalens type strain (GSLT)

    Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Chertkov, Olga [Los Alamos National Laboratory (LANL); Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Hammon, Nancy [U.S. Department of Energy, Joint Genome Institute; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Huntemann, Marcel [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kannan, K. Palani [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Detter, J. Chris [U.S. 
Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]

    2011-01-01

    Halanaerobium praevalens Zeikus et al. 1984 is the type species of the genus Halanaerobium, which in turn is the type genus of the family Halanaerobiaceae. The species is of interest because it is able to reduce a variety of nitro-substituted aromatic compounds at a high rate, and because of its ability to degrade organic pollutants. The strain is also of interest because it functions as a hydrolytic bacterium, fermenting complex organic matter and producing intermediary metabolites for other trophic groups such as sulfate-reducing and methanogenic bacteria. It is further reported as being involved in carbon removal in the Great Salt Lake, its source of isolation. This is the first completed genome sequence of a representative of the genus Halanaerobium and the second genome sequence from a type strain of the family Halanaerobiaceae. The 2,309,262 bp long genome with its 2,110 protein-coding and 70 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  14. Complete genome sequence of Rhodospirillum rubrum type strain (S1).

    Munk, A Christine; Copeland, Alex; Lucas, Susan; Lapidus, Alla; Del Rio, Tijana Glavina; Barry, Kerrie; Detter, John C; Hammon, Nancy; Israni, Sanjay; Pitluck, Sam; Brettin, Thomas; Bruce, David; Han, Cliff; Tapia, Roxanne; Gilna, Paul; Schmutz, Jeremy; Larimer, Frank; Land, Miriam; Kyrpides, Nikos C; Mavromatis, Konstantinos; Richardson, Paul; Rohde, Manfred; Göker, Markus; Klenk, Hans-Peter; Zhang, Yaoping; Roberts, Gary P; Reslewic, Susan; Schwartz, David C

    2011-07-01

    Rhodospirillum rubrum (Esmarch 1887) Molisch 1907 is the type species of the genus Rhodospirillum, which is the type genus of the family Rhodospirillaceae in the class Alphaproteobacteria. The species is of special interest because it is an anoxygenic phototroph that produces extracellular elemental sulfur (instead of oxygen) while harvesting light. It contains one of the most simple photosynthetic systems currently known, lacking light harvesting complex 2. Strain S1(T) can grow on carbon monoxide as sole energy source. With currently over 1,750 PubMed entries, R. rubrum is one of the most intensively studied microbial species, in particular for physiological and genetic studies. Next to R. centenum strain SW, the genome sequence of strain S1(T) is only the second genome of a member of the genus Rhodospirillum to be published, but the first type strain genome from the genus. The 4,352,825 bp long chromosome and 53,732 bp plasmid with a total of 3,850 protein-coding and 83 RNA genes were sequenced as part of the DOE Joint Genome Institute Program DOEM 2002.
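
    The replicon sizes and gene counts in this abstract can be combined into a few summary figures. The sketch below is only an arithmetic illustration using the numbers quoted above; the derived quantities (total genome size, plasmid share, genes per megabase) are not stated in the abstract, and the variable names are illustrative.

```python
# Values quoted in the abstract.
CHROMOSOME_BP = 4_352_825
PLASMID_BP = 53_732
PROTEIN_CODING_GENES = 3_850
RNA_GENES = 83

# Derived quantities (not reported in the abstract itself).
genome_bp = CHROMOSOME_BP + PLASMID_BP
plasmid_share = PLASMID_BP / genome_bp
total_genes = PROTEIN_CODING_GENES + RNA_GENES
genes_per_mb = total_genes / (genome_bp / 1_000_000)

if __name__ == "__main__":
    print("total genome size (bp):", genome_bp)
    print(f"plasmid share of genome: {plasmid_share:.2%}")
    print(f"annotated genes per Mb: {genes_per_mb:.1f}")
```

    The same pattern applies to any record in this list that reports a chromosome and one or more plasmids separately.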

  15. Complete genome sequence of Desulfomicrobium baculatum type strain (XT)

    Copeland, Alex; Spring, Stefan; Goker, Markus; Schneider, Susanne; Lapidus, Alla; Glavina Del Rio, Tijana; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C; Meincke, Linda; Sims, David; Brettin, Thomas; Detter, John C; Han, Cliff; Chain, Patrick; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C; Lucas, Susan

    2009-05-20

    Desulfomicrobium baculatum is the type species of the genus Desulfomicrobium, which is the type genus of the family Desulfomicrobiaceae. It is of phylogenetic interest because of the isolated location of the family Desulfomicrobiaceae within the order Desulfovibrionales. D. baculatum strain XT is a Gram-negative, motile, sulfate-reducing bacterium isolated from water-saturated manganese carbonate ore. It is strictly anaerobic and does not require NaCl for growth, although NaCl concentrations up to 6% (w/v) are tolerated. The metabolism is respiratory or fermentative. In the presence of sulfate, pyruvate and lactate are incompletely oxidized to acetate and CO2. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the deltaproteobacterial family Desulfomicrobiaceae, and this 3,942,657 bp long single replicon genome with its 3494 protein-coding and 72 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  16. Haematobia irritans dataset of raw sequence reads from Illumina and Pac Bio sequencing of genomic DNA

    The genome of the horn fly, Haematobia irritans, was sequenced using Illumina- and Pac Bio-based protocols. Following quality filtering, the raw reads have been deposited at NCBI under the BioProject and BioSample accession numbers PRJNA30967 and SAMN07830356, respectively. The Illumina reads are un...

  17. Genome sequence and genetic diversity of European ash trees.

    Sollars, Elizabeth S A; Harper, Andrea L; Kelly, Laura J; Sambles, Christine M; Ramirez-Gonzalez, Ricardo H; Swarbreck, David; Kaithakottil, Gemy; Cooper, Endymion D; Uauy, Cristobal; Havlickova, Lenka; Worswick, Gemma; Studholme, David J; Zohren, Jasmin; Salmon, Deborah L; Clavijo, Bernardo J; Li, Yi; He, Zhesi; Fellgett, Alison; McKinney, Lea Vig; Nielsen, Lene Rostgaard; Douglas, Gerry C; Kjær, Erik Dahl; Downie, J Allan; Boshier, David; Lee, Steve; Clark, Jo; Grant, Murray; Bancroft, Ian; Caccamo, Mario; Buggs, Richard J A

    2017-01-12

    Ash trees (genus Fraxinus, family Oleaceae) are widespread throughout the Northern Hemisphere, but are being devastated in Europe by the fungus Hymenoscyphus fraxineus, causing ash dieback, and in North America by the herbivorous beetle Agrilus planipennis. Here we sequence the genome of a low-heterozygosity Fraxinus excelsior tree from Gloucestershire, UK, annotating 38,852 protein-coding genes of which 25% appear ash specific when compared with the genomes of ten other plant species. Analyses of paralogous genes suggest a whole-genome duplication shared with olive (Olea europaea, Oleaceae). We also re-sequence 37 F. excelsior trees from Europe, finding evidence for apparent long-term decline in effective population size. Using our reference sequence, we re-analyse association transcriptomic data, yielding improved markers for reduced susceptibility to ash dieback. Surveys of these markers in British populations suggest that reduced susceptibility to ash dieback may be more widespread in Great Britain than in Denmark. We also present evidence that susceptibility of trees to H. fraxineus is associated with their iridoid glycoside levels. This rapid, integrated, multidisciplinary research response to an emerging health threat in a non-model organism opens the way for mitigation of the epidemic.

  18. The Personal Genome Project Canada: findings from whole genome sequences of the inaugural 56 participants.

    Reuter, Miriam S; Walker, Susan; Thiruvahindrapuram, Bhooma; Whitney, Joe; Cohn, Iris; Sondheimer, Neal; Yuen, Ryan K C; Trost, Brett; Paton, Tara A; Pereira, Sergio L; Herbrick, Jo-Anne; Wintle, Richard F; Merico, Daniele; Howe, Jennifer; MacDonald, Jeffrey R; Lu, Chao; Nalpathamkalam, Thomas; Sung, Wilson W L; Wang, Zhuozhi; Patel, Rohan V; Pellecchia, Giovanna; Wei, John; Strug, Lisa J; Bell, Sherilyn; Kellam, Barbara; Mahtani, Melanie M; Bassett, Anne S; Bombard, Yvonne; Weksberg, Rosanna; Shuman, Cheryl; Cohn, Ronald D; Stavropoulos, Dimitri J; Bowdin, Sarah; Hildebrandt, Matthew R; Wei, Wei; Romm, Asli; Pasceri, Peter; Ellis, James; Ray, Peter; Meyn, M Stephen; Monfared, Nasim; Hosseini, S Mohsen; Joseph-George, Ann M; Keeley, Fred W; Cook, Ryan A; Fiume, Marc; Lee, Hin C; Marshall, Christian R; Davies, Jill; Hazell, Allison; Buchanan, Janet A; Szego, Michael J; Scherer, Stephen W

    2018-02-05

    The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. We describe genomic variation identified in the initial recruitment cohort of 56 volunteers. Volunteers were screened for eligibility and provided informed consent for open data sharing. Using blood DNA, we performed whole genome sequencing and identified all possible classes of DNA variants. A genetic counsellor explained the implication of the results to each participant. Whole genome sequencing of the first 56 participants identified 207 662 805 sequence variants and 27 494 copy number variations. We analyzed a prioritized disease-associated data set ( n = 1606 variants) according to standardized guidelines, and interpreted 19 variants in 14 participants (25%) as having obvious health implications. Six of these variants (e.g., in BRCA1 or mosaic loss of an X chromosome) were pathogenic or likely pathogenic. Seven were risk factors for cancer, cardiovascular or neurobehavioural conditions. Four other variants - associated with cancer, cardiac or neurodegenerative phenotypes - remained of uncertain significance because of discrepancies among databases. We also identified a large structural chromosome aberration and a likely pathogenic mitochondrial variant. There were 172 recessive disease alleles (e.g., 5 individuals carried mutations for cystic fibrosis). Pharmacogenomics analyses revealed another 3.9 potentially relevant genotypes per individual. Our analyses identified a spectrum of genetic variants with potential health impact in 25% of participants. When also considering recessive alleles and variants with potential pharmacologic relevance, all 56 participants had medically relevant findings. Although access is mostly limited to research, whole genome sequencing can provide specific and novel information with the potential of major impact for health care. © 2018 Joule Inc. or its licensors.

  19. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks.

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S K; Mammel, Mark K; Tarr, Phillip I; Eppinger, Mark

    2016-01-01

    Multi isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and

  20. The Douglas-fir genome sequence reveals specialization of the photosynthetic apparatus in Pinaceae

    David B. Neale; Patrick E. McGuire; Nicholas C. Wheeler; Kristian A. Stevens; Marc W. Crepeau; Charis Cardeno; Aleksey V. Zimin; Daniela Puiu; Geo M. Pertea; U. Uzay Sezen; Claudio Casola; Tomasz E. Koralewski; Robin Paul; Daniel Gonzalez-Ibeas; Sumaira Zaman; Richard Cronn; Mark Yandell; Carson Holt; Charles H. Langley; James A. Yorke; Steven L. Salzberg; Jill L. Wegrzyn

    2017-01-01

    A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceeds that of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50...
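
    The contig and scaffold N50 figures quoted above are summary statistics of assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation follows; the function name and the toy lengths are illustrative assumptions and are not taken from the Douglas-fir assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Hypothetical helper for illustration only; not from the cited study.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # assembly is, by definition, the N50.
        if running >= half_total:
            return length


# Toy lengths (assumed values): total = 300, so half = 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

    Because the statistic depends only on the sorted lengths, the same function applies to contigs and to scaffolds.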

  1. Draft genome sequences of seven isolates of Phytophthora ramorum EU2 from Northern Ireland

    Lourdes de la Mata Saez

    2015-12-01

    Here we present draft-quality genome sequence assemblies for the oomycete Phytophthora ramorum genetic lineage EU2. We sequenced genomes of seven isolates collected in Northern Ireland between 2010 and 2012. Multiple genome sequences from P. ramorum EU2 will be valuable for identifying genetic variation within the clonal lineage that can be useful for tracking its spread.

  2. Draft Genome Sequence of "Terrisporobacter othiniensis" Isolated from a Blood Culture from a Human Patient

    Lund, Lars Christian; Sydenham, Thomas Vognbjerg; Høgh, Silje Vermedal

    2015-01-01

    "Terrisporobacter othiniensis" (proposed species) was isolated from a blood culture. Genomic DNA was sequenced using a MiSeq benchtop sequencer (Illumina) and assembled using the SPAdes genome assembler. This resulted in a draft genome sequence comprising 3,980,019 bp in 167 contigs containing 3...

  3. Fast Dissemination of New HIV-1 CRF02/A1 Recombinants in Pakistan.

    Yue Chen

    A number of HIV-1 subtypes are identified in Pakistan by characterization of partial viral gene sequences. Little is known whether new recombinants are generated and how they disseminate since whole genome sequences for these viruses have not been characterized. Near full-length genome (NFLG) sequences were obtained by amplifying two overlapping half genomes or next generation sequencing from 34 HIV-1-infected individuals in Pakistan. Phylogenetic tree analysis showed that the newly characterized sequences were 16 subtype As, one subtype C, and 17 A/G recombinants. Further analysis showed that all 16 subtype A1 sequences (47%), together with the vast majority of sequences from Pakistan from other studies, formed a tight subcluster (A1a) within the subtype A1 clade, suggesting that they were derived from a single introduction. More in-depth analysis of 17 A/G NFLG sequences showed that five shared similar recombination breakpoints as in CRF02 (15%) but were phylogenetically distinct from the prototype CRF02 by forming a tight subcluster (CRF02a) while 12 (38%) were new recombinants between CRF02a and A1a or a divergent A1b viruses. Unique recombination patterns among the majority of the newly characterized recombinants indicated ongoing recombination. Interestingly, recombination breakpoints in these CRF02/A1 recombinants were similar to those in prototype CRF02 viruses, indicating that recombination at these sites is more likely to generate variable recombinant viruses. The dominance and fast dissemination of new CRF02a/A1 recombinants over prototype CRF02 suggest that these recombinants are better adapted and may become major epidemic strains in Pakistan.

  4. Fast Dissemination of New HIV-1 CRF02/A1 Recombinants in Pakistan

    Chen, Yue; Hora, Bhavna; DeMarco, Todd; Shah, Sharaf Ali; Ahmed, Manzoor; Sanchez, Ana M.; Su, Chang; Carter, Meredith; Stone, Mars; Hasan, Rumina; Hasan, Zahra; Busch, Michael P.; Denny, Thomas N.; Gao, Feng

    2016-01-01

    A number of HIV-1 subtypes are identified in Pakistan by characterization of partial viral gene sequences. Little is known whether new recombinants are generated and how they disseminate since whole genome sequences for these viruses have not been characterized. Near full-length genome (NFLG) sequences were obtained by amplifying two overlapping half genomes or next generation sequencing from 34 HIV-1-infected individuals in Pakistan. Phylogenetic tree analysis showed that the newly characterized sequences were 16 subtype As, one subtype C, and 17 A/G recombinants. Further analysis showed that all 16 subtype A1 sequences (47%), together with the vast majority of sequences from Pakistan from other studies, formed a tight subcluster (A1a) within the subtype A1 clade, suggesting that they were derived from a single introduction. More in-depth analysis of 17 A/G NFLG sequences showed that five shared similar recombination breakpoints as in CRF02 (15%) but were phylogenetically distinct from the prototype CRF02 by forming a tight subcluster (CRF02a) while 12 (38%) were new recombinants between CRF02a and A1a or a divergent A1b viruses. Unique recombination patterns among the majority of the newly characterized recombinants indicated ongoing recombination. Interestingly, recombination breakpoints in these CRF02/A1 recombinants were similar to those in prototype CRF02 viruses, indicating that recombination at these sites is more likely to generate variable recombinant viruses. The dominance and fast dissemination of new CRF02a/A1 recombinants over prototype CRF02 suggest that these recombinants are better adapted and may become major epidemic strains in Pakistan. PMID:27973597

  5. Identification of Novel Recombinant Forms of Hepatitis B Virus Generated from Genotypes Ae and G in HIV-1-Positive Japanese Men Who Have Sex with Men.

    Kojima, Yoko; Kawahata, Takuya; Mori, Haruyo; Furubayashi, Keiichi; Taniguchi, Yasushi; Itoda, Ichiro; Komano, Jun

    2015-07-01

    The rare hepatitis B virus (HBV) genotype G (HBV/G) coinfects HIV-1-positive individuals along with HBV/A and generates recombinants. However, the circulation of HBV A/G recombinants remains poorly understood. This molecular epidemiologic study examined HBV A/G recombinants in Japanese HIV-1-positive men who have sex with men (MSM). Initially, blood specimens submitted for confirmatory tests of HIV infection in Osaka and Tokyo, Japan, from 2006 to 2013 were examined for HIV-1, and HIV-1-positive specimens were screened for HBV. Among 817 specimens from HIV-1-positive individuals, HBsAg was detected in 59 specimens; of these, HBV/Ae (alternatively A2), a subgenotype of HBV/A prevalent in Europe and North America, was identified in 70.2%, HBV/C in 17.5%, HBV/G in 10.5%, and HBV/E in 1.8% according to the core gene sequence. The full-length genome analysis of HBV was performed on HBV/G-positive specimens because some HBV A/G recombinants were historically overlooked by genotyping based on a partial genome analysis. It revealed that five of the specimens contained novel Ae/G recombinants, the core gene of which had a high sequence similarity to HBV/G. Detailed analyses showed that novel recombinants were coinfected with HBV/Ae in a recombinant-dominant fashion. No major drug-resistant mutations were found in the newly identified HBV Ae/G recombinants. Some of the individuals asymptomatically coinfected with HIV/HBV suffered mild liver injury. This study demonstrated that novel Ae/G HBV recombinants were identified in Japanese HIV-1-positive MSM. The pathogenicity of novel HBV Ae/G recombinants should be examined in a future longitudinal study. Surveillance of such viruses in HIV-1-positive individuals should be emphasized.

  6. Functional noncoding sequences derived from SINEs in the mammalian genome.

    Nishihara, Hidenori; Smit, Arian F A; Okada, Norihiro

    2006-07-01

    Recent comparative analyses of mammalian sequences have revealed that a large number of nonprotein-coding genomic regions are under strong selective constraint. Here, we report that some of these loci have been derived from a newly defined family of ancient SINEs (short interspersed repetitive elements). This is a surprising result, as SINEs and other transposable elements are commonly thought to be genomic parasites. We named the ancient SINE family AmnSINE1, for Amniota SINE1, because we found it to be present in mammals as well as in birds, and some copies predate the mammalian-bird split 310 million years ago (Mya). AmnSINE1 has a chimeric structure of a 5S rRNA and a tRNA-derived SINE, and is related to five tRNA-derived SINE families that we characterized here in the coelacanth, dogfish shark, hagfish, and amphioxus genomes. All of the newly described SINE families have a common central domain that is also shared by zebrafish SINE3, and we collectively name them the DeuSINE (Deuterostomia SINE) superfamily. Notably, of the approximately 1000 still identifiable copies of AmnSINE1 in the human genome, 105 correspond to loci phylogenetically highly conserved among mammalian orthologs. The conservation is strongest over the central domain. Thus, AmnSINE1 appears to be the best example of a transposable element of which a significant fraction of the copies have acquired genomic functionality.

  7. Two complete chloroplast genome sequences of Cannabis sativa varieties.

    Oh, Hyehyun; Seo, Boyoung; Lee, Seunghwan; Ahn, Dong-Ha; Jo, Euna; Park, Jin-Kyoung; Min, Gi-Sik

    2016-07-01

    In this study, we determined the complete chloroplast (cp) genomes from two varieties of Cannabis sativa. The genome sizes were 153,848 bp (the Korean non-drug variety, Cheungsam) and 153,854 bp (the African variety, Yoruba Nigeria). The genome structures were identical with 131 individual genes [86 protein-coding genes (PCGs), eight rRNA, and 37 tRNA genes]. Further, except for the presence of an intron in the rps3 genes of two C. sativa varieties, the cp genomes of C. sativa had conservative features similar to that of all known species in the order Rosales. To verify the position of C. sativa within the order Rosales, we conducted phylogenetic analysis by using concatenated sequences of all PCGs from 17 complete cp genomes. The resulting tree strongly supported monophyly of Rosales. Further, the family Cannabaceae, represented by C. sativa, showed close relationship with the family Moraceae. The phylogenetic relationship outlined in our study is well congruent with those previously shown for the order Rosales.

  8. Whole-Genome de novo Sequencing Of Quail And Grey Partridge

    Holm, Lars-Erik; Panitz, Frank; Burt, Dave

    2011-01-01

    The development in sequencing methods has made it possible to perform whole genome de novo sequencing of species without large commercial interests. Within the EU-financed QUANTOMICS project (KBBE-2A-222664), we have performed de novo sequencing of quail (Coturnix coturnix) and grey partridge (Perdix perdix) on a Genome Analyzer GAII (Illumina) using paired-end sequencing. The generated sequence data amount to 8 to 9 Gb for each species. The analysis and assembly of the generated sequences is ongoing. Access to the whole genome sequence from these two species will enable enhanced comparative studies towards the chicken genome and will aid in identifying evolutionarily conserved sequences within the Galliformes. The obtained sequences from quail and partridge represent a beginning of generating the whole genome sequence for these species. The continuation of establishing the genome...

  9. The global transmission network of HIV-1.

    Wertheim, Joel O; Leigh Brown, Andrew J; Hepler, N Lance; Mehta, Sanjay R; Richman, Douglas D; Smith, Davey M; Kosakovsky Pond, Sergei L

    2014-01-15

    Human immunodeficiency virus type 1 (HIV-1) is pandemic, but its contemporary global transmission network has not been characterized. A better understanding of the properties and dynamics of this network is essential for surveillance, prevention, and eventual eradication of HIV. Here, we apply a simple and computationally efficient network-based approach to all publicly available HIV polymerase sequences in the global database, revealing a contemporary picture of the spread of HIV-1 within and between countries. This approach automatically recovered well-characterized transmission clusters and extended other clusters thought to be contained within a single country across international borders. In addition, previously undescribed transmission clusters were discovered. Together, these clusters represent all known modes of HIV transmission. The extent of international linkage revealed by our comprehensive approach demonstrates the need to consider the global diversity of HIV, even when describing local epidemics. Finally, the speed of this method allows for near-real-time surveillance of the pandemic's progression.
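
    The network-based approach summarized above can be pictured as follows: every sequence is a node, two nodes are linked when their pairwise genetic distance falls at or below a threshold, and each connected component with more than one member is reported as a putative transmission cluster. The abstract does not state which distance measure or threshold was used, so the sketch below substitutes a plain proportion-of-mismatches distance and an arbitrary threshold; the function names, the threshold, and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of mismatching sites between two aligned, uppercase sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    (Assumed stand-in distance; the cited study may use a different measure.)
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = mismatches = 0
    for x, y in zip(a, b):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            mismatches += x != y
    return mismatches / compared if compared else 1.0


def transmission_clusters(seqs, threshold):
    """Link sequences with distance <= threshold; return components of size > 1."""
    parent = {name: name for name in seqs}  # union-find forest

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) > 1]


# Toy alignment (assumed data): s1-s2 and s2-s3 differ at one site in ten,
# s1-s3 at two sites, and s4 is unrelated.
toy = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAT",
    "s3": "ACGTACGAAT",
    "s4": "TTTTGGGGCC",
}
print(transmission_clusters(toy, threshold=0.1))
```

    In the toy data s1 and s3 are not linked directly but still fall into one cluster through s2, which is how a component-based definition can extend a cluster across samples that were never compared as close neighbours. The all-pairs loop is quadratic in the number of sequences, so a database-scale analysis would need a more efficient distance computation than this sketch.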

  10. Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools.

    Kisand, Veljo; Lettieri, Teresa

    2013-04-01

    De novo genome sequencing of previously uncharacterized microorganisms has the potential to open up new frontiers in microbial genomics by providing insight into both functional capabilities and biodiversity. Until recently, Roche 454 pyrosequencing was the NGS method of choice for de novo assembly because it generates hundreds of thousands of long reads (tools for processing NGS data are increasingly free and open source and are often adopted for both their high quality and role in promoting academic freedom. The error rate of pyrosequencing the Alcanivorax borkumensis genome was such that thousands of insertions and deletions were artificially introduced into the finished genome. Despite a high coverage (~30 fold), it did not allow the reference genome to be fully mapped. Reads from regions with errors had low quality, low coverage, or were missing. The main defect of the reference mapping was the introduction of artificial indels into contigs through lower than 100% consensus and distracting gene calling due to artificial stop codons. No assembler was able to perform de novo assembly comparable to reference mapping. Automated annotation tools performed similarly on reference mapped and de novo draft genomes, and annotated most CDSs in the de novo assembled draft genomes. Free and open source software (FOSS) tools for assembly and annotation of NGS data are being developed rapidly to provide accurate results with less computational effort. Usability is not high priority and these tools currently do not allow the data to be processed without manual intervention. Despite this, genome assemblers now readily assemble medium short reads into long contigs (>97-98% genome coverage). A notable gap in pyrosequencing technology is the quality of base pair calling and conflicting base pairs between single reads at the same nucleotide position. Regardless, using draft whole genomes that are not finished and remain fragmented into tens of contigs allows one to characterize

  11. Virtually full-length subtype F and F/D recombinant HIV-1 from Africa and South America

    Laukkanen, T.; Carr, J. K.; Janssens, W.; Liitsola, K.; Gotte, D.; McCutchan, F. E.; Op de Coul, E.; Cornelissen, M.; Heyndrickx, L.; van der Groen, G.; Salminen, M. O.

    2000-01-01

    For reliable classification of HIV-1 strains appropriate reference sequences are needed. The HIV-1 genetic subtype F has a wide geographic spread, causing significant epidemics in South America, Africa, and some regions of Europe. Previously only two full-length sequences of each of the HIV-1

  12. Gene Discovery through Genomic Sequencing of Brucella abortus

    Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.

    2001-01-01

    Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and humans. We generated DNA random sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10^-5) to sequences deposited in the GenBank databases. Among them, 925 represent putative novel genes for the Brucella genus. Out of 925 nonredundant GSSs, 470 were classified in 15 categories based on cellular function. Seven hundred GSSs showed no significant database matches and remain available for further studies in order to identify their function. A high number of GSSs with homology to Agrobacterium tumefaciens and Rhizobium meliloti proteins were observed, thus confirming their close phylogenetic relationship. Among them, several GSSs showed high similarity with genes related to nodule nitrogen fixation, synthesis of nod factors, nodulation protein symbiotic plasmid, and nodule bacteroid differentiation. We have also identified several B. abortus homologs of virulence and pathogenesis genes from other pathogens, including a homolog to both the Shda gene from Salmonella enterica serovar Typhimurium and the AidA-1 gene from Escherichia coli. Other GSSs displayed significant homologies to genes encoding components of the type III and type IV secretion machineries, suggesting that Brucella might also have an active type III secretion machinery. PMID:11159979

  13. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.

    Lincoln D Stein

    2003-11-01

    The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C

  14. Recombination of HIV type 1C (C'/C") in Ethiopia: possible link of EthHIV-1C' to subtype C sequences from the high-prevalence epidemics in India and Southern Africa

    Pollakis, Georgios; Abebe, Almaz; Kliphuis, Aletta; Rinke de Wit, Tobias F.; Fisseha, Bitew; Tegbaru, Belete; Tesfaye, Girma; Negassa, Hailu; Mengistu, Yohannes; Fontanet, Arnaud L.; Cornelissen, Marion; Goudsmit, Jaap

    2003-01-01

    The magnitude and complexity of the HIV-1 genetic diversity are major challenges for vaccine development. Investigation of the genotypes circulating in areas of high incidence, as well as their interactions, will be a milestone in the development of an efficacious vaccine. Because HIV-1 subtype C

  15. Complete genome sequence of an attenuated Sparfloxacin-resistant Streptococcus agalactiae strain 138spar

    The complete genome of a sparfloxacin-resistant Streptococcus agalactiae vaccine strain 138spar is 1,838,126 bp in size. The genome has 1892 coding sequences and 82 RNAs. The annotation of the genome is added by the NCBI Prokaryotic Genome Annotation Pipeline. The publishing of this genome will allo...

  16. HIV-1 RNAs are Not Part of the Argonaute 2 Associated RNA Interference Pathway in Macrophages.

    Valentina Vongrad

    MiRNAs and other small noncoding RNAs (sncRNAs) are key players in post-transcriptional gene regulation. HIV-1 derived small noncoding RNAs (sncRNAs) have been described in HIV-1 infected cells, but their biological functions still remain to be elucidated. Here, we approached the question whether viral sncRNAs may play a role in the RNA interference (RNAi) pathway or whether viral mRNAs are targeted by cellular miRNAs in human monocyte derived macrophages (MDM). The incorporation of viral sncRNAs and/or their target RNAs into RNA-induced silencing complex was investigated using photoactivatable ribonucleoside-induced cross-linking and immunoprecipitation (PAR-CLIP) as well as high-throughput sequencing of RNA isolated by cross-linking immunoprecipitation (HITS-CLIP), which capture Argonaute2-bound miRNAs and their target RNAs. HIV-1 infected monocyte-derived macrophages (MDM) were chosen as target cells, as they have previously been shown to express HIV-1 sncRNAs. In addition, we applied small RNA deep sequencing to study differential cellular miRNA expression in HIV-1 infected versus non-infected MDMs. PAR-CLIP and HITS-CLIP data demonstrated the absence of HIV-1 RNAs in Ago2-RISC, although the presence of a multitude of HIV-1 sncRNAs in HIV-1 infected MDMs was confirmed by small RNA sequencing. Small RNA sequencing revealed that 1.4% of all sncRNAs were of HIV-1 origin. However, neither HIV-1 derived sncRNAs nor putative HIV-1 target sequences incorporated into Ago2-RISC were identified, suggesting that HIV-1 sncRNAs are not involved in the canonical RNAi pathway nor is HIV-1 targeted by this pathway in HIV-1 infected macrophages.

  17. Sequence imputation of HPV16 genomes for genetic association studies.

    Benjamin Smith

    Human Papillomavirus type 16 (HPV16) causes over half of all cervical cancer and some HPV16 variants are more oncogenic than others. The genetic basis for the extraordinary oncogenic properties of HPV16 compared to other HPVs is unknown. In addition, we neither know which nucleotides vary across and within HPV types and lineages, nor which of the single nucleotide polymorphisms (SNPs) determine oncogenicity. A reference set of 62 HPV16 complete genome sequences was established and used to examine patterns of evolutionary relatedness amongst variants using a pairwise identity heatmap and HPV16 phylogeny. A BLAST-based algorithm was developed to impute complete genome data from partial sequence information using the reference database. To interrogate the oncogenic risk of determined and imputed HPV16 SNPs, odds-ratios for each SNP were calculated in a case-control viral genome-wide association study (VWAS) using biopsy confirmed high-grade cervix neoplasia and self-limited HPV16 infections from Guanacaste, Costa Rica. HPV16 variants display evolutionarily stable lineages that contain conserved diagnostic SNPs. The imputation algorithm indicated that an average of 97.5±1.03% of SNPs could be accurately imputed. The VWAS revealed specific HPV16 viral SNPs associated with variant lineages and elevated odds ratios; however, individual causal SNPs could not be distinguished with certainty due to the nature of HPV evolution. Conserved and lineage-specific SNPs can be imputed with a high degree of accuracy from limited viral polymorphic data due to the lack of recombination and the stochastic mechanism of variation accumulation in the HPV genome. However, to determine the role of novel variants or non-lineage-specific SNPs by VWAS will require direct sequence analysis. The investigation of patterns of genetic variation and the identification of diagnostic SNPs for lineages of HPV16 variants provides a valuable resource for future studies of HPV16
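
    The per-SNP odds ratios mentioned in this record come from a 2x2 case-control table: for one SNP, the odds of carrying the variant allele among cases divided by the odds among controls. The sketch below shows that arithmetic only. The function name, the counts, and the choice to add 0.5 to every cell when a cell is empty (the Haldane-Anscombe correction) are assumptions for illustration and are not taken from the study.

```python
def odds_ratio(case_with, case_without, control_with, control_without,
               correction=0.5):
    """Odds ratio for one SNP from a 2x2 case-control table.

    Hypothetical helper. If any cell is zero, `correction` is added to all
    four cells so that the ratio stays finite (an assumed convention here).
    """
    cells = [case_with, case_without, control_with, control_without]
    if any(count < 0 for count in cells):
        raise ValueError("counts must be non-negative")
    if 0 in cells:
        cells = [count + correction for count in cells]
    a, b, c, d = cells
    # (a / b) is the odds of carrying the variant among cases,
    # (c / d) the odds among controls.
    return (a * d) / (b * c)


# Assumed toy counts: 30 of 40 cases and 20 of 60 controls carry the variant.
print(odds_ratio(30, 10, 20, 40))
```

    A value above 1 means the variant is more common among cases than controls. On its own it does not show that the SNP is causal, which matches the caution in the abstract about SNPs that travel together within a lineage.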

  18. Applications of Genomic Sequencing in Pediatric CNS Tumors.

    Bavle, Abhishek A; Lin, Frank Y; Parsons, D Williams

    2016-05-01

    Recent advances in genome-scale sequencing methods have resulted in a significant increase in our understanding of the biology of human cancers. When applied to pediatric central nervous system (CNS) tumors, these remarkable technological breakthroughs have facilitated the molecular characterization of multiple tumor types, provided new insights into the genetic basis of these cancers, and prompted innovative strategies that are changing the management paradigm in pediatric neuro-oncology. Genomic tests have begun to affect medical decision making in a number of ways, from delineating histopathologically similar tumor types into distinct molecular subgroups that correlate with clinical characteristics, to guiding the addition of novel therapeutic agents for patients with high-risk or poor-prognosis tumors, or alternatively, reducing treatment intensity for those with a favorable prognosis. Genomic sequencing has also had a significant impact on translational research strategies in pediatric CNS tumors, resulting in wide-ranging applications that have the potential to direct the rational preclinical screening of novel therapeutic agents, shed light on tumor heterogeneity and evolution, and highlight differences (or similarities) between pediatric and adult CNS tumors. Finally, in addition to allowing the identification of somatic (tumor-specific) mutations, the analysis of patient-matched constitutional (germline) DNA has facilitated the detection of pathogenic germline alterations in cancer genes in patients with CNS tumors, with critical implications for genetic counseling and tumor surveillance strategies for children with familial predisposition syndromes. As our understanding of the molecular landscape of pediatric CNS tumors continues to advance, innovative applications of genomic sequencing hold significant promise for further improving the care of children with these cancers.

  19. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis.

    Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

    2015-01-01

    Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5' portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16 taxa angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids.

  20. Advanced Whole-Genome Sequencing and Analysis of Fetal Genomes from Amniotic Fluid.

    Mao, Qing; Chin, Robert; Xie, Weiwei; Deng, Yuqing; Zhang, Wenwei; Xu, Huixin; Zhang, Rebecca Yu; Shi, Quan; Peters, Erin E; Gulbahce, Natali; Li, Zhenyu; Chen, Fang; Drmanac, Radoje; Peters, Brock A

    2018-04-01

    Amniocentesis is a common procedure, the primary purpose of which is to collect cells from the fetus to allow testing for abnormal chromosomes, altered chromosomal copy number, or a small number of genes that have small single- to multibase defects. Here we demonstrate the feasibility of generating an accurate whole-genome sequence of a fetus from either the cellular or cell-free DNA (cfDNA) of an amniotic sample. cfDNA and DNA isolated from the cell pellet of 31 amniocenteses were sequenced to approximately 50× genome coverage by use of the Complete Genomics nanoarray platform. In a subset of the samples, long fragment read libraries were generated from DNA isolated from cells and sequenced to approximately 100× genome coverage. Concordance of variant calls between the 2 DNA sources and with parental libraries was >96%. Two fetal genomes were found to harbor potentially detrimental variants in chromodomain helicase DNA binding protein 8 ( CHD8 ) and LDL receptor-related protein 1 ( LRP1 ), variations of which have been associated with autism spectrum disorder and keratosis pilaris atrophicans, respectively. We also discovered drug sensitivities and carrier information of fetuses for a variety of diseases. We were able to elucidate the complete genome sequence of 31 fetuses from amniotic fluid and demonstrate that the cfDNA or DNA from the cell pellet can be analyzed with little difference in quality. We believe that current technologies could analyze this material in a highly accurate and complete manner and that analyses like these should be considered for addition to current amniocentesis procedures. © 2018 American Association for Clinical Chemistry.

  1. BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes.

    Staňková, Helena; Hastie, Alex R; Chan, Saki; Vrána, Jan; Tulpová, Zuzana; Kubaláková, Marie; Visendi, Paul; Hayashi, Satomi; Luo, Mingcheng; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana

    2016-07-01

    The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC-by-BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high-resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high-resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome-scale analysis of repetitive sequences and revealed a ~800-kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone-by-clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC-contig physical map and validate sequence assembly on a chromosome-arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome-by-chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
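    The abstract above summarizes the map as 371 contigs with an N50 of 1.3 Mb. As a reminder of what that statistic means, the sketch below computes N50 in the usual way: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the summed length. The contig lengths are made-up toy values, not data from the study, and the function name is only illustrative.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Toy contig lengths in kb (hypothetical, for illustration only).
toy_contigs_kb = [2000, 1500, 1300, 900, 600, 400]
print(n50(toy_contigs_kb))
```

    A higher N50 for the same total length indicates that more of the map sits in long contigs, which is why it is commonly quoted alongside the contig count.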

  2. Second generation sequencing of the mesothelioma tumor genome.

    Raphael Bueno

    2010-05-01

    Full Text Available The current paradigm for elucidating the molecular etiology of cancers relies on the interrogation of small numbers of genes, which limits the scope of investigation. Emerging second-generation massively parallel DNA sequencing technologies have enabled more precise definition of the cancer genome on a global scale. We examined the genome of a human primary malignant pleural mesothelioma (MPM) tumor and matched normal tissue by using a combination of sequencing-by-synthesis and pyrosequencing methodologies to a 9.6X depth of coverage. Read density analysis uncovered significant aneuploidy and numerous rearrangements. Method-dependent informatics rules, which combined the results of different sequencing platforms, were developed to identify and validate candidate mutations of multiple types. Many more tumor-specific rearrangements than point mutations were uncovered at this depth of sequencing, resulting in novel, large-scale, inter- and intra-chromosomal deletions, inversions, and translocations. Nearly all candidate point mutations appeared to be previously unknown SNPs. Thirty tumor-specific fusions/translocations were independently validated with PCR and Sanger sequencing. Of these, 15 represented disrupted gene-encoding regions, including kinases, transcription factors, and growth factors. One large deletion in DPP10 resulted in altered transcription and expression of DPP10 transcripts in a set of 53 additional MPM tumors correlated with survival. Additionally, three point mutations were observed in the coding regions of NKX6-2, a transcription regulator, and NFRKB, a DNA-binding protein involved in modulating NFKB1. Several regions containing genes such as PCBD2 and DHFR, which are involved in growth factor signaling and nucleotide synthesis, respectively, were selectively amplified in the tumor. Second-generation sequencing uncovered all types of mutations in this MPM tumor, with DNA rearrangements representing the dominant type.

  3. Transmission of single and multiple viral variants in primary HIV-1 subtype C infection.

    Vladimir Novitsky

    2011-02-01

    Full Text Available To address whether sequences of viral gag and env quasispecies collected during the early post-acute period can be utilized to determine the multiplicity of transmitted HIV variants, recently developed approaches for analysis of viral evolution in acute HIV-1 infection [1,2] were applied. Specifically, phylogenetic reconstruction, inter- and intra-patient distribution of maximum and mean genetic distances, analysis of Poisson fitness, shape of highlighter plots, recombination analysis, and estimation of time to the most recent common ancestor (tMRCA) were utilized for resolving multiplicity of HIV-1 transmission in a set of viral quasispecies collected within 50 days post-seroconversion (p/s) in 25 HIV-infected individuals with estimated time of seroconversion. The decision on multiplicity of HIV infection was made based on the model's fit with, or failure to explain, the observed extent of viral sequence heterogeneity. The initial analysis was based on phylogeny, inter-patient distribution of maximum and mean distances, and Poisson fitness, and was able to resolve multiplicity of HIV transmission in 20 of 25 (80%) cases. Additional analysis involved distribution of individual viral distances, highlighter plots, recombination analysis, and estimation of tMRCA, and resolved 4 of the 5 remaining cases. Overall, transmission of a single viral variant was identified in 16 of 25 (64%) cases, and transmission of multiple variants was evident in 8 of 25 (32%) cases. In one case multiplicity of HIV-1 transmission could not be determined. In primary HIV-1 subtype C infection, samples collected within 50 days p/s and analyzed by a single-genome amplification/sequencing technique can provide reliable identification of transmission multiplicity in 24 of 25 (96%) cases. The observed transmission frequencies of a single viral variant and of multiple viral variants were within the ranges of 64% to 68% and 32% to 36%, respectively.

  4. Complete chloroplast genome sequence of Elodea canadensis and comparative analyses with other monocot plastid genomes.

    Huotari, Tea; Korpelainen, Helena

    2012-10-15

    Elodea canadensis is an aquatic angiosperm native to North America. It has attracted great attention due to its invasive nature when transported to new areas in its non-native range. We have determined the complete nucleotide sequence of the chloroplast (cp) genome of Elodea. Taxonomically Elodea is a basal monocot, and only a few monocot cp genomes representing early lineages of monocots have been sequenced so far. The genome is a circular double-stranded DNA molecule 156,700 bp in length, and has a typical structure with large (LSC 86,194 bp) and small (SSC 17,810 bp) single-copy regions separated by a pair of inverted repeats (IRs 26,348 bp each). The Elodea cp genome contains 113 unique genes and 16 duplicated genes in the IR regions. A comparative analysis showed that the gene order and organization of the Elodea cp genome is almost identical to that of Amborella trichopoda, a basal angiosperm. The structure of IRs in Elodea is unique among monocot species with the whole cp genome sequenced. In Elodea and another monocot, Lemna minor, the borders between IRs and LSC are located upstream of the rps19 gene and downstream of the trnH-GUG gene, while in most monocots the IR has extended to include both the trnH and rps19 genes. A phylogenetic analysis conducted using a Bayesian method, based on the DNA sequences of 81 chloroplast genes from 17 monocot taxa, provided support for the placement of Elodea together with Lemna as a basal monocot and the next diverging lineage of monocots after Acorales. In comparison with other monocots, the Elodea cp genome has gone through only a few rearrangements or gene losses. The IR of Elodea has a unique structure among the monocot species studied so far, as its structure is similar to that of the basal angiosperm Amborella. This result together with phylogenetic analyses supports the placement of Elodea as a basal monocot and the next diverging lineage of monocots after Acorales.
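    The region sizes quoted for the Elodea plastome can be checked against the reported total, since a quadripartite chloroplast genome consists of one LSC, one SSC and two copies of the IR. The short sketch below does that arithmetic with the figures from the abstract; the function name is only illustrative.

```python
def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported for Elodea canadensis in the abstract above.
total = plastome_length(lsc=86_194, ssc=17_810, ir=26_348)
print(total)  # 156700, matching the reported genome length of 156,700 bp
```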

  5. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome.

    Byrappa Venkatesh

    2007-04-01

    Full Text Available Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4x coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element-like and long interspersed element-like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark contains four putative Hox clusters, indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

  6. Human genome sequencing with direct x-ray holographic imaging

    Rhodes, C.K.

    1993-01-01

    Direct holographic imaging of biological materials is widely applicable to the study of the structure, properties and action of genetic material. This particular application involves the sequencing of the human genome, where the prospective genomic imaging technology is composed of three subtechnologies, namely an x-ray holographic camera, suitable chemistry and enzymology for the preparation of tagged DNA samples, and the illuminator in the form of an x-ray laser. We report that an appropriate x-ray camera, embodied by the instrument developed by MCR, is available and that suitable chemical and enzymatic procedures exist for the preparation of the necessary tagged DNA strands. Concerning the future development of the x-ray illuminator, we find that a practical small scale x-ray light source is indeed feasible. This outcome requires the use of unconventional physical processes in order to achieve the necessary power-compression in the amplifying medium. Importantly, although the x-ray source does not currently exist, the understanding of these new physical mechanisms is developing rapidly and the research has established the basic scaling laws that will determine the properties of the x-ray illuminator. When this x-ray source becomes available, an extremely rapid and cost effective instrument for 3-D imaging of biological materials can be applied to a wide range of biological structural assays, including the base-pair sequencing of the human genome and many questions regarding its higher levels of organization.

  7. Molecular Basis for Drug Resistance in HIV-1 Protease

    Celia A. Schiffer

    2010-11-01

    Full Text Available HIV-1 protease is one of the major antiviral targets in the treatment of patients infected with HIV-1. The nine FDA approved HIV-1 protease inhibitors were developed with extensive use of structure-based drug design, thus the atomic details of how the inhibitors bind are well characterized. From this structural understanding the molecular basis for drug resistance in HIV-1 protease can be elucidated. Selected mutations in response to therapy and diversity between clades in HIV-1 protease have altered the shape of the active site, potentially altered the dynamics and even altered the sequence of the cleavage sites in the Gag polyprotein. All of these interdependent changes act in synergy to confer drug resistance while simultaneously maintaining the fitness of the virus. New strategies, such as incorporation of the substrate envelope constraint to design robust inhibitors that incorporate details of HIV-1 protease’s function and decrease the probability of drug resistance, are necessary to continue to effectively target this key protein in HIV-1 life cycle.

  8. Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine

    Craig S Echt; Surya Saha; Dennis L Deemer; C Dana Nelson

    2011-01-01

    Genomic DNA sequence databases are a potential and growing resource for simple sequence repeat (SSR) marker development in loblolly pine (Pinus taeda L.). Loblolly pine also has many expressed sequence tags (ESTs) available for microsatellite (SSR) marker development. We compared loblolly pine SSR densities in genome survey sequences (GSSs) to those in non-redundant...

  9. The sequence and analysis of a Chinese pig genome

    Fang Xiaodong

    2012-11-01

    Full Text Available Background: The pig is an economically important food source, amounting to approximately 40% of all meat consumed worldwide. Pigs also serve as an important model organism because of their similarity to humans at the anatomical, physiological and genetic level, making them very useful for studying a variety of human diseases. A pig strain of particular interest is the miniature pig, specifically the Wuzhishan pig (WZSP), as it has been extensively inbred. Its high level of homozygosity offers increased ease for selective breeding for specific traits and a more straightforward understanding of the genetic changes that underlie its biological characteristics. WZSP also serves as a promising means for applications in surgery, tissue engineering, and xenotransplantation. Here, we report the sequencing and analysis of an inbreeding WZSP genome. Results: Our results reveal some unique genomic features, including a relatively high level of homozygosity in the diploid genome, an unusual distribution of heterozygosity, an over-representation of tRNA-derived transposable elements, a small amount of porcine endogenous retrovirus, and a lack of type C retroviruses. In addition, we carried out systematic research on gene evolution, together with a detailed investigation of the counterparts of human drug target genes. Conclusion: Our results provide the opportunity to more clearly define the genomic character of pig, which could enhance our ability to create more useful pig models.

  10. 1000 Bull Genomes - Toward genomic Selection from whole genome sequence Data in Dairy and Beef Cattle

    Hayes, B.; Daetwyler, H.D.; Fries, R.; Guldbrandtsen, B.; Mogens Sando Lund, M.; Didier A. Boichard, D.A.; Stothard, P.; Veerkamp, R.F.; Hulsegge, B.; Rocha, D.; Tassell, C.; Mullaart, E.; Gredler, B.; Druet, T.; Bagnato, A.; Goddard, M.E.; Chamberlain, H.L.

    2013-01-01

    Genomic prediction of breeding values is now used as the basis for selection of dairy cattle, and in some cases beef cattle, in a number of countries. When genomic prediction was introduced most of the information was thought to be derived from linkage disequilibrium between markers and causative

  11. The zebrafish reference genome sequence and its relationship to the human genome

    Howe, Kerstin; Clark, Matthew D.; Torroja, Carlos F.; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E.; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C.; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T.; Guerra-Assunção, José A.; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F.; Laird, Gavin K.; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; Woodmansey, Rebecca; Clark, Graham; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M.; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Carter, Nigel P.; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M. J.; Enright, Anton; Geisler, Robert; Plasterk, Ronald H. A.; Lee, Charles; Westerfield, Monte; de Jong, Pieter J.; Zon, Leonard I.; Postlethwait, John H.; Nüsslein-Volhard, Christiane; Hubbard, Tim J. P.; Crollius, Hugues Roest; Rogers, Jane; Stemple, Derek L.

    2013-01-01

    Zebrafish have become a popular organism for the study of vertebrate gene function [1,2]. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease [3–5]. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes [6], the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination. PMID:23594743

  12. The zebrafish reference genome sequence and its relationship to the human genome.

    Howe, Kerstin; Clark, Matthew D; Torroja, Carlos F; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T; Guerra-Assunção, José A; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F; Laird, Gavin K; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Elliot, David; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Begum, Sharmin; Mortimore, Beverley; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Lloyd, Christine; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; 
Woodmansey, Rebecca; Clark, Graham; Cooper, James D; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Lanz, Christa; Raddatz, Günter; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Schuster, Stephan C; Carter, Nigel P; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M J; Enright, Anton; Geisler, Robert; Plasterk, Ronald H A; Lee, Charles; Westerfield, Monte; de Jong, Pieter J; Zon, Leonard I; Postlethwait, John H; Nüsslein-Volhard, Christiane; Hubbard, Tim J P; Roest Crollius, Hugues; Rogers, Jane; Stemple, Derek L

    2013-04-25

    Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.

  13. The Complete Chloroplast Genome Sequences of Six Rehmannia Species

    Shuyun Zeng

    2017-03-01

    Full Text Available Rehmannia is a non-parasitic genus in Orobanchaceae including six species mainly distributed in central and north China. Its phylogenetic position and infrageneric relationships remain uncertain due to potential hybridization and polyploidization. In this study, we sequenced and compared the complete chloroplast genomes of six Rehmannia species using Illumina sequencing technology to elucidate the interspecific variations. Rehmannia plastomes exhibited typical quadripartite and circular structures with good synteny of gene order. The complete genomes ranged from 153,622 bp to 154,055 bp in length, including 133 genes encoding 88 proteins, 37 tRNAs, and 8 rRNAs. Three genes (rpoA, rpoC2, accD) have potentially experienced positive selection. Plastome size variation of Rehmannia was mainly ascribed to the expansion and contraction of the border regions between the inverted repeat (IR) region and the single-copy (SC) regions. Despite the conserved structure of Rehmannia plastomes, sequence variations provide useful phylogenetic information. Phylogenetic trees of 23 Lamiales species reconstructed with the complete plastomes suggested that Rehmannia was monophyletic and sister to the clade of Lindenbergia and the parasitic taxa in Orobanchaceae. The interspecific relationships within Rehmannia were completely different from those in previous studies. In future, population phylogenomic works based on plastomes are urgently needed to clarify the evolutionary history of Rehmannia.

  14. The complete genome sequence of the Atlantic salmon paramyxovirus (ASPV)

    Nylund, Stian; Karlsen, Marius; Nylund, Are

    2008-01-01

    The complete RNA genome of the Atlantic salmon paramyxovirus (ASPV), isolated from Atlantic salmon suffering from proliferative gill inflammation (PGI), has been determined. The genome is 16,965 nucleotides in length and consists of six nonoverlapping genes in the order 3'- N - P/C/V - M - F - HN - L -5', coding for the nucleocapsid, phospho-, matrix, fusion, hemagglutinin-neuraminidase and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and trinucleotide intergenic regions similar to those of other Paramyxoviridae. The ASPV P-gene expression strategy is like that of the respiro- and morbilliviruses, which express the phosphoprotein from the primary transcript, and edit a portion of the mRNA to encode the accessory proteins V and W. It also encodes the C-protein by ribosomal choice of translation initiation. Pairwise comparisons of amino acid identities, and phylogenetic analysis of deduced ASPV protein sequences with homologous sequences from other Paramyxoviridae, show that ASPV has an affinity for the genus Respirovirus, but may represent a new genus within the subfamily Paramyxovirinae

  15. Bacillus anthracis genome organization in light of whole transcriptome sequencing

    Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.; Bergman, Nicholas; Borodovsky, Mark

    2010-03-22

    Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g. codon adaptation index and RBS score related to gene expression? We used the whole transcriptome shotgun sequencing of a bacterial pathogen Bacillus anthracis to assess correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, as well as with a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessment of accuracy of current annotation of protein-coding genes in the B. anthracis genome we have shown that the transcriptome data indicate existence of more than a hundred genes missing in the annotation though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.

  16. HEP Computing Tools, Grid and Supercomputers for Genome Sequencing Studies

    De, K.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Novikov, A.; Poyda, A.; Tertychnyy, I.; Wenaus, T.

    2017-10-01

    PanDA (Production and Distributed Analysis Workload Management System) was developed to address the data processing and analysis challenges of the ATLAS experiment at the LHC. Recently, PanDA has been extended to run HEP scientific applications on Leadership Class Facilities and supercomputers. The success of projects using PanDA beyond HEP and the Grid has drawn attention from other compute-intensive sciences such as bioinformatics. Recent advances in Next Generation Genome Sequencing (NGS) technology have led to increasing streams of sequencing data that need to be processed, analysed and made available to bioinformaticians worldwide. Analysis of genome sequencing data with the popular software pipeline PALEOMIX can take a month even when run on a powerful computing resource. In this paper we describe the adaptation of the PALEOMIX pipeline to run in a distributed computing environment powered by PanDA. To run the pipeline, we split the input files into chunks, which are processed separately on different nodes as separate PALEOMIX inputs, and finally merge the output files; this is very similar to what ATLAS does to process and simulate data. Automated job (re)submission and brokering within PanDA dramatically decreased the total walltime. Using software tools initially developed for HEP and the Grid can reduce payload execution time for mammoth DNA samples from weeks to days.
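
    The split, process and merge pattern described in this abstract can be sketched in a few lines of Python. This is only an illustration of the general scatter-gather idea, not the PanDA or PALEOMIX interface: the names `split_into_chunks`, `process_chunk` and `run_scatter_gather` are hypothetical, local threads stand in for worker nodes, and the per-chunk job is a trivial placeholder.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_into_chunks(records: Sequence[str], chunk_size: int) -> List[List[str]]:
    """Split the input records into consecutive chunks of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(records[i:i + chunk_size]) for i in range(0, len(records), chunk_size)]


def process_chunk(chunk: List[str]) -> List[str]:
    """Hypothetical stand-in for one pipeline job running on one node.

    It only upper-cases each read; a real job would run the full pipeline
    on the chunk and write an output file.
    """
    return [read.upper() for read in chunk]


def run_scatter_gather(
    records: Sequence[str],
    chunk_size: int,
    worker: Callable[[List[str]], List[str]] = process_chunk,
    max_workers: int = 4,
) -> List[str]:
    """Process chunks independently, then merge the partial outputs in input order."""
    chunks = split_into_chunks(records, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the submitted chunks,
        # so the merged output keeps the original record order.
        partial_outputs = list(pool.map(worker, chunks))
    return [item for part in partial_outputs for item in part]


if __name__ == "__main__":
    reads = ["acgt", "ttga", "ccaa", "gatc", "tgca"]
    print(run_scatter_gather(reads, chunk_size=2))
```

    Because each chunk is independent, a failed chunk can be resubmitted on its own without repeating the rest of the run, which is the property that automated (re)submission relies on.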

  17. Complete nucleotide sequences of avian metapneumovirus subtype B genome.

    Sugiyama, Miki; Ito, Hiroshi; Hata, Yusuke; Ono, Eriko; Ito, Toshihiro

    2010-12-01

    Complete nucleotide sequences were determined for subtype B avian metapneumovirus (aMPV), the attenuated vaccine strain VCO3/50 and its parental pathogenic strain VCO3/60616. The genomes of both strains comprised 13,508 nucleotides (nt), with a 42-nt leader at the 3'-end and a 46-nt trailer at the 5'-end. The genome contains eight genes in the order 3'-N-P-M-F-M2-SH-G-L-5', which is the same order shown in the other metapneumoviruses. The genes are flanked on either side by conserved transcriptional start and stop signals and have intergenic sequences varying in length from 1 to 88 nt. Comparison of nt and predicted amino acid (aa) sequences of VCO3/60616 with those of other metapneumoviruses revealed higher homology with aMPV subtype A virus than with other metapneumoviruses. A total of 18 nt and 10 deduced aa differences were seen between the strains, and one or a combination of several differences could be associated with attenuation of VCO3/50.

  18. The Complete Chloroplast and Mitochondrial Genome Sequences of Boea hygrometrica: Insights into the Evolution of Plant Organellar Genomes

    Wang, Xumin; Deng, Xin; Zhang, Xiaowei; Hu, Songnian; Yu, Jun

    2012-01-01

    The complete nucleotide sequences of the chloroplast (cp) and mitochondrial (mt) genomes of the resurrection plant Boea hygrometrica (Bh, Gesneriaceae) have been determined, with lengths of 153,493 bp and 510,519 bp, respectively. The smaller chloroplast genome contains more genes (147) with a coding fraction of 72%, and the larger mitochondrial genome has fewer genes (65) with a coding fraction of 12%. Similar to other seed plants, the Bh cp genome has a typical quadripartite organization with a conserved gene in each region. The Bh mt genome has three recombinant sequence repeats of 222 bp, 843 bp, and 1474 bp in length, which divide the genome into a single master circle (MC) and four isomeric molecules. Compared to other angiosperms, one remarkable feature of the Bh mt genome is the frequent transfer of genetic material from the cp genome during recent Bh evolution. We also analyzed organellar genome evolution in general regarding genome features as well as compositional dynamics of sequence and gene structure/organization, providing clues for the understanding of the evolution of organellar genomes in plants. The cp-derived sequences including tRNAs found in angiosperm mt genomes support the conclusion that frequent gene transfer events may have begun early in the land plant lineage. PMID:22291979
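
    The contrast between the two organellar genomes can be made concrete with a little arithmetic on the figures quoted in the abstract. The sketch below assumes that "coding fraction" means coding base pairs divided by genome length; the helper names `coding_bp` and `genes_per_kb` are hypothetical and used only for illustration.

```python
def coding_bp(genome_length_bp: int, coding_fraction: float) -> float:
    """Approximate number of coding base pairs implied by a coding fraction."""
    return genome_length_bp * coding_fraction


def genes_per_kb(gene_count: int, genome_length_bp: int) -> float:
    """Gene density expressed as genes per 1,000 bp."""
    return gene_count / (genome_length_bp / 1000.0)


if __name__ == "__main__":
    # Figures as quoted in the abstract: (length in bp, gene count, coding fraction).
    genomes = {
        "chloroplast": (153_493, 147, 0.72),
        "mitochondrion": (510_519, 65, 0.12),
    }
    for name, (length, genes, fraction) in genomes.items():
        print(
            f"{name}: ~{coding_bp(length, fraction):,.0f} coding bp, "
            f"{genes_per_kb(genes, length):.3f} genes per kb"
        )
```

    Under that assumption the chloroplast genome, although less than a third of the length, carries more coding sequence in absolute terms as well as a much higher gene density.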

  19. Evolutionary growth process of highly conserved sequences in vertebrate genomes.

    Ishibashi, Minaka; Noda, Akiko Ogura; Sakate, Ryuichi; Imanishi, Tadashi

    2012-08-01

    Genome sequence comparison between evolutionarily distant species revealed ultraconserved elements (UCEs) among mammals under strong purifying selection. Most of them were also conserved among vertebrates. Because they tend to be located in the flanking regions of developmental genes, they would have fundamental roles in creating vertebrate body plans. However, the evolutionary origin and selection mechanism of these UCEs remain unclear. Here we report that UCEs arose in primitive vertebrates, and gradually grew in vertebrate evolution. We searched for UCEs in two teleost fishes, Tetraodon nigroviridis and Oryzias latipes, and found 554 UCEs with 100% identity over 100 bp. Comparison of teleost and mammalian UCEs revealed 43 pairs of common, jawed-vertebrate UCEs (jUCE) with high sequence identities, ranging from 83.1% to 99.2%. Ten of them retain lower similarities to the Petromyzon marinus genome, and the substitution rates of four non-exonic jUCEs were reduced after the teleost-mammal divergence, suggesting that robust conservation had been acquired in the jawed vertebrate lineage. Our results indicate that prototypical UCEs originated before the divergence of jawed and jawless vertebrates and have been frozen as perfectly conserved sequences in the jawed vertebrate lineage. In addition, our comparative sequence analyses of UCEs and neighboring regions led to the discovery of lineage-specific conserved sequences. They were added progressively to prototypical UCEs, suggesting step-wise acquisition of novel regulatory roles. Our results indicate that conserved non-coding elements (CNEs) consist of blocks with distinct evolutionary histories, each having been frozen since a different evolutionary era along the vertebrate lineage. Copyright © 2012 Elsevier B.V. All rights reserved.
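
    The search criterion quoted above (100% identity over a minimum length) can be illustrated with a small seed-and-extend sketch. This is not the authors' method, which the abstract does not describe; it is a toy example that assumes two already extracted, same-strand sequences, and the function name `exact_shared_segments` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def exact_shared_segments(a: str, b: str, min_len: int = 100) -> List[Tuple[int, int, int]]:
    """Return (start_in_a, start_in_b, length) for identical stretches of at least min_len.

    Each stretch is reported once, anchored at its leftmost position and
    extended to the right for as long as the two sequences agree.
    """
    if min_len <= 0:
        raise ValueError("min_len must be positive")

    # Index every window of length min_len in sequence a.
    index: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(a) - min_len + 1):
        index[a[i:i + min_len]].append(i)

    hits: List[Tuple[int, int, int]] = []
    for j in range(len(b) - min_len + 1):
        for i in index.get(b[j:j + min_len], ()):
            # If the preceding bases also agree, this seed lies inside a
            # stretch that is reported from a seed further to the left.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            length = min_len
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            hits.append((i, j, length))
    return hits


if __name__ == "__main__":
    shared = "ACGGTCATTG"
    seq_a = "TTTTT" + shared + "AAAAA"
    seq_b = "CCCCC" + shared + "GGGGG"
    print(exact_shared_segments(seq_a, seq_b, min_len=5))
```

    Whole-genome comparisons need alignment tools and indexes built for that scale; the sketch only shows what "100% identity over a minimum length" means operationally.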

  20. A mitochondrial genome sequence of the Tibetan antelope (Pantholops hodgsonii)

    Xu, Shu Qing; Yang, Ying Zhong; Zhou, Jun

    2005-01-01

    To investigate genetic mechanisms of high altitude adaptations of native mammals on the Tibetan Plateau, we compared mitochondrial sequences of the endangered Pantholops hodgsonii with its lowland distant relatives Ovis aries and Capra hircus, as well as other mammals. The complete mitochondrial...... genome of P. hodgsonii (16,498 bp) revealed a similar gene order as of other mammals. Because of tandem duplications, the control region of P. hodgsonii mitochondrial genome is shorter than those of O. aries and C. hircus, but longer than those of Bos species. Phylogenetic analysis based on alignments...... of the entire cytochrome b genes suggested that P. hodgsonii is more closely related to O. aries and C. hircus, rather than to species of the Antilopinae subfamily. The estimated divergence time between P. hodgsonii and O. aries is about 2.25 million years ago. Further analysis on natural selection indicated...